lugia19 / elevenlabslib

Full python wrapper for the elevenlabs API.
MIT License

No sound on Linux #16

Closed arkadiy-telegin closed 1 year ago

arkadiy-telegin commented 1 year ago

For some reason streaming (generate_and_stream_audio) doesn't work on my Linux machine. I get no sound at all. I installed all the dependencies mentioned in the README.

Here is the debug trace:

DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.elevenlabs.io:443
DEBUG:urllib3.connectionpool:https://api.elevenlabs.io:443 "GET /v1/voices HTTP/1.1" 200 4815
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.elevenlabs.io:443
DEBUG:urllib3.connectionpool:https://api.elevenlabs.io:443 "GET /v1/user/subscription HTTP/1.1" 200 484
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.elevenlabs.io:443
DEBUG:urllib3.connectionpool:https://api.elevenlabs.io:443 "GET /v1/voices HTTP/1.1" 200 4815
DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): api.elevenlabs.io:443

DEBUG:urllib3.connectionpool:https://api.elevenlabs.io:443 "POST /v1/text-to-speech/21m00Tcm4TlvDq8ikWAM/stream?optimize_streaming_latency=0 HTTP/1.1" 200 None
Complete response received: The capital of Kenya is Nairobi.
DEBUG:root:Starting iter...
DEBUG:root:Waiting for header event...
Press Enter once playback is finished.DEBUG:root:Writing weirdly sized chunk (3749)...
DEBUG:root:headerReady not set, setting it...
DEBUG:root:Header maybe ready?
DEBUG:root:File created (208 bytes read).
DEBUG:root:HeaderReady is set, waiting for the soundfile...
DEBUG:root:Writing weirdly sized chunk (3821)...
DEBUG:root:Write head move: 3821
DEBUG:root:Raise available data event - 7362 bytes available
DEBUG:root:HeaderReady is set, waiting for the soundfile...
DEBUG:root:Writing weirdly sized chunk (3816)...
DEBUG:root:Write head move: 3816
DEBUG:root:Raise available data event - 11178 bytes available
DEBUG:root:HeaderReady is set, waiting for the soundfile...
DEBUG:root:Write head move: 4096
DEBUG:root:Raise available data event - 15274 bytes available
DEBUG:root:HeaderReady is set, waiting for the soundfile...
DEBUG:root:Writing weirdly sized chunk (818)...
DEBUG:root:Write head move: 818
DEBUG:root:Raise available data event - 16092 bytes available
DEBUG:root:Starting playback...
DEBUG:root:Missing data but download isn't over. What the fuck?
DEBUG:root:Expected 8192 bytes, got back 8192
DEBUG:root:Read bytes: 8192

[x9] DEBUG:root:Putting 8192 bytes in queue.
DEBUG:root:Expected 8192 bytes, got back 8192
DEBUG:root:Read bytes: 8192

DEBUG:root:Putting 8192 bytes in queue.
DEBUG:root:Expected 8192 bytes, got back 1132
DEBUG:root:Insufficient data read.
DEBUG:root:We're not at the end of the file. Check if we're out of frames.
DEBUG:root:Recreating soundfile...
DEBUG:root:Frame counter was outdated.
DEBUG:root:Now read 8192 bytes. I sure hope that number isn't zero.
DEBUG:root:Read bytes: 8192

[x29] DEBUG:root:Putting 8192 bytes in queue.
DEBUG:root:Expected 8192 bytes, got back 8192
DEBUG:root:Read bytes: 8192

DEBUG:root:Putting 8192 bytes in queue.
DEBUG:root:Expected 8192 bytes, got back 8192
DEBUG:root:Marking no available blocks...
DEBUG:root:Read bytes: 8192

DEBUG:root:Putting 8192 bytes in queue.
DEBUG:root:Download finished - 16300.
DEBUG:root:Expected 8192 bytes, got back 8192
DEBUG:root:Read bytes: 8192

DEBUG:root:Putting 8192 bytes in queue.
DEBUG:root:Expected 8192 bytes, got back 8192
DEBUG:root:Read bytes: 8192

DEBUG:root:Putting 8192 bytes in queue.
DEBUG:root:Mismatch in the number of frames read.
DEBUG:root:This only seems to be an issue when it happens with files that have ID3v2 tags.
DEBUG:root:Ignore it and return empty.
DEBUG:root:Read bytes: 0

DEBUG:root:Got back less data than expected, check if we're at the end...
DEBUG:root:We're at the end.
DEBUG:root:While loop done.
DEBUG:root:Finished playing audio:The capital of Kenya is Nairobi.
DEBUG:root:False
DEBUG:root:Stream done.
lugia19 commented 1 year ago

I've been trying to replicate the issue and it's just not happening. What I am getting are some random ALSA errors declaring the device busy.

Could you elaborate on what your environment is? OS version, audio API, that sort of thing.

lugia19 commented 1 year ago

Bloody hell that took some work. Okay, I think I've got it. Essentially, the issue comes down to ALSA and the behavior of its "default" audio device.

If nothing else is playing, it will work correctly the first time you try to play back audio. On subsequent runs, the audio device used for the playback becomes unavailable (I noticed this by listing all audio devices every time).

Normally, if you don't specify a deviceID, sounddevice (the library I use for the playback) will just pick the default output device, which in this case is the ALSA "default" device.

The problem arises when the REAL audio device is marked as busy by ALSA. It seems like the "default" device picks whatever other audio device isn't marked as busy, resulting in the second playback technically working (and seemingly being on the same device as the first one) but outputting to another device (likely one that is disconnected, hence the lack of audio).

I'll do some digging and try to figure out how to fix this.
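
For anyone who wants to rule out the "default" device while debugging, one option is to pick a concrete output device by name instead of relying on the default. The snippet below is only a rough sketch of that idea. The helper name pick_output_device is made up for illustration, and it assumes the device list has the shape returned by sounddevice.query_devices(), i.e. entries with "name" and "max_output_channels" keys.

```python
from typing import Any, Mapping, Optional, Sequence


def pick_output_device(
    devices: Sequence[Mapping[str, Any]], name_fragment: str
) -> Optional[int]:
    """Return the index of the first output-capable device whose name
    contains name_fragment (case-insensitive), or None if nothing matches.

    Hypothetical helper: `devices` is assumed to look like the list
    returned by sounddevice.query_devices().
    """
    needle = name_fragment.lower()
    for index, device in enumerate(devices):
        # Skip input-only devices such as microphones.
        if device["max_output_channels"] > 0 and needle in device["name"].lower():
            return index
    return None


# Possible usage with sounddevice (not executed here, needs audio hardware):
#   import sounddevice as sd
#   idx = pick_output_device(sd.query_devices(), "hda intel")
#   if idx is not None:
#       stream = sd.OutputStream(device=idx)
```

Passing an explicit device index this way sidesteps whatever the ALSA "default" device resolves to, which makes it easier to tell whether the silence comes from device selection or from something else.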

arkadiy-telegin commented 1 year ago

Thanks! Really appreciate it!

lugia19 commented 1 year ago

Some debugging later, it appears I was (partially) correct. That does seem to be the issue, but the generate_play_audio function has no problems with it, even when playing two audio clips simultaneously.

I believe the issue comes down to the fact that the stream function uses sounddevice's RawOutputStream, whereas the non-streaming one uses OutputStream. I'll see if I can switch to the latter for the stream as well, but I doubt it's possible.

lugia19 commented 1 year ago

I am genuinely at a loss. I'm going to try and isolate the issue down into a minimum reproducible code sample, then I'll probably make an issue on the sounddevice repo.

lugia19 commented 1 year ago

I believe I've actually figured out what the issue is. I still think it's related to sounddevice, but not in the way I initially thought. I'll do some more testing tomorrow and write up what the bug was if I actually manage to figure out a fix.

lugia19 commented 1 year ago

Turns out this was all actually a race condition that had gone completely undetected, as it never happens on Windows.

The thread that pulls sound data from the queue was actually running BEFORE the queue ever got filled, which triggered an error. This only happens on Linux for some reason, maybe due to some difference in thread scheduling? Who knows.

The bug is now fixed in release 0.8.1.
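
The thread does not say how 0.8.1 fixes the race, so the following is not the library's actual code. It is a generic sketch, using only the standard library, of one common way to make a consumer thread safe to start before the queue has any data: use a blocking get() plus an end-of-stream sentinel instead of assuming data is already there. The names play_stream and END_OF_STREAM are invented for this example.

```python
import queue
import threading
from typing import Iterable, List

END_OF_STREAM = None  # hypothetical sentinel marking the end of the audio data


def play_stream(chunks: Iterable[bytes]) -> List[bytes]:
    """Feed chunks through a queue to a consumer thread and return
    the chunks in the order the consumer processed them."""
    audio_queue: "queue.Queue" = queue.Queue()
    played: List[bytes] = []

    def consumer() -> None:
        while True:
            # Blocking get(): if the consumer runs before the producer has
            # queued anything, it simply waits instead of raising queue.Empty.
            chunk = audio_queue.get()
            if chunk is END_OF_STREAM:
                break
            played.append(chunk)  # stand-in for writing to the audio device

    worker = threading.Thread(target=consumer)
    worker.start()  # deliberately started before any data is queued

    for chunk in chunks:
        audio_queue.put(chunk)
    audio_queue.put(END_OF_STREAM)

    worker.join()
    return played
```

With this structure the result no longer depends on which thread the scheduler happens to run first, which is the kind of platform-dependent ordering described above.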

arkadiy-telegin commented 1 year ago

Big thanks! It's working now!