Open FILLITUP opened 2 days ago
Hm maybe host resolution adds latency. Can you change the client from localhost to 127.0.0.1 and see if it gets better?
I tried that first thing, same outcome. I'm starting to suspect that the Qt framework may simply be quicker at handling the audio stream.
In the first lines of server.py please set
DEBUG_LOGGING = True
The server should now let you know how long RealtimeTTS needs to synthesize the first audio chunk:
INFO:root:Audio stream start, latency to first chunk: 0.23s
With this updated code the client also logs timings:
(venv) C:\Dev\Audio\RealtimeTTS\RealtimeTTS\example_fast_api>python client.py
Time to first token: 0.24388694763183594
The difference tells you how much time is spent within the FastAPI request processing. (Here on my system, with server and client both local, FastAPI and the network add only ~0.01s to latency.)
Can you verify this on your system? Maybe then we can see better where the time is spent.
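For anyone who wants to repeat the comparison, here is a minimal sketch of the measurement, not the actual client.py. It times how long a stream of audio chunks takes to deliver its first non-empty chunk, then subtracts the server-side figure from the log. The function names are made up for illustration, and the URL in the usage comment is an assumption about where the example server listens.

```python
import time
from typing import Iterable, Optional, Tuple


def time_to_first_chunk(chunks: Iterable[bytes]) -> Tuple[Optional[float], int]:
    """Consume a chunk stream; return (seconds until first non-empty chunk, total bytes).

    The first value is None if the stream never yields any data.
    """
    start = time.perf_counter()
    first: Optional[float] = None
    total = 0
    for chunk in chunks:
        if not chunk:
            continue  # skip empty keep-alive chunks
        if first is None:
            first = time.perf_counter() - start
        total += len(chunk)
    return first, total


def added_latency(client_seconds: float, server_seconds: float) -> float:
    """Time spent outside synthesis (request handling plus network)."""
    return client_seconds - server_seconds


# Usage against a running server (assumed URL and query parameter, adjust to yours):
#   import requests
#   response = requests.get("http://127.0.0.1:8000/tts",
#                           params={"text": "hello world"}, stream=True)
#   first, total = time_to_first_chunk(response.iter_content(chunk_size=1024))
#   print("Time to first chunk:", first)

# With the two numbers quoted above:
print(f"{added_latency(0.24388694763183594, 0.23):.3f}s")  # roughly 0.014s
```

Note that this only measures when the first bytes arrive at the client. It says nothing about when the audio device actually starts playing them.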
Thank you. Your suggestions were immensely helpful. Notes: Network latency is similar to yours, ~0.01 sec as measured by your updated client.py script. I suspect that the network latency is similar with the browser as client.
Your updated client.py is clearly faster than the original; this is more evident when you use a text string a lot longer than "hello world".
When using the Edge browser as client, however, with longer text strings, the time until the audio is actually heard is 2-3 seconds longer than for the same string in client.py. The latencies reported in the logs are essentially the same for the two. The actual latencies are clearly different: with client.py, audio is heard almost immediately after the audio stream starts, whereas in the browser the audio is noticeably delayed after the audio stream starts.
Which code do you use to play out over the browser?
I used the server.py provided in "example_fast_api", which was also the source of the client.py, first with the original client.py and then with the updated one. server.py provided the URL to open in the Edge browser. It just seems like the audio stream back to the browser for playback is impeded in some way.
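One possible explanation, not confirmed anywhere in this thread, is that the browser buffers some amount of audio before it starts playback. As a rough sketch of that arithmetic, the helper below converts a buffered byte count into seconds of uncompressed PCM audio. The default format (24000 Hz, mono, 16-bit) is an assumption and has to be replaced with whatever the engine actually streams; the function name is hypothetical.

```python
def buffered_seconds(buffered_bytes: int,
                     sample_rate: int = 24000,   # assumed sample rate in Hz
                     channels: int = 1,          # assumed mono
                     bytes_per_sample: int = 2   # assumed 16-bit PCM
                     ) -> float:
    """Seconds of raw PCM audio represented by a given number of bytes."""
    bytes_per_second = sample_rate * channels * bytes_per_sample
    return buffered_bytes / bytes_per_second


# Under the assumed format, 96000 bytes correspond to 2 seconds of audio.
print(buffered_seconds(96000))
```

If the hypothesis holds, a 2-3 second gap between stream start and audible playback would correspond to roughly that much audio sitting in a buffer, which is something the server-side and client-side logs would not show.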
Greetings. Although I can speculate, it's not clear why the audio stream in PyQt6 is audible at least 2 seconds earlier than in the FastAPI example. The same passage of text is presented to both, and the logs show very similar latency times (0.5 to 0.75). The answer might be a clue as to why I am getting a roughly 3-4 second audio delay in my attempted chats via LocalEmotionalAIVoiceChat and any LLM loaded in LM Studio.
I'm running a Windows 11 machine with lots of memory, CPU and GPU. Thanks in advance.