erew123 / alltalk_tts

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, however supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, narrator, model finetuning, custom models, wav file maintenance. It can also be used with 3rd Party software via JSON calls.
GNU Affero General Public License v3.0
API Issues #316

Closed imnotliveyet closed 3 weeks ago

imnotliveyet commented 3 weeks ago

🔴 If you have installed AllTalk in a custom Python environment, I will only be able to provide limited assistance/support. AllTalk draws on a variety of scripts and libraries that are not written or managed by myself, and they may fail, error or give strange results in custom built python environments.

🔴 Please generate a diagnostics report and upload the "diagnostics.log" as this helps me understand your configuration.

https://github.com/erew123/alltalk_tts/tree/main?#-how-to-make-a-diagnostics-report-file

Describe the bug
The API refuses to connect. I have verified that AllTalk is running and can generate audio through the AllTalk TTS Generator link, but connections fail from Python requests and from the curl example.

To Reproduce
Steps to reproduce the behaviour:
Step one: fresh install of AllTalk-TTS
Step two: run start_alltalk.bat
Step three: attempt the curl command

Screenshots
No clue why it's trying to access an unmounted USB.

Text/logs
$ curl -X POST "http://127.0.0.1:7851/api/tts-generate" -d "text_input=All of this is text spoken by the character. This is text not inside quotes, though that doesnt matter in the slightest" -d "text_filtering=standard" -d "character_voice_gen=female_01.wav" -d "narrator_enabled=false" -d "narrator_voice_gen=male_01.wav" -d "text_not_inside=character" -d "language=en" -d "output_file_name=myoutputfile" -d "output_file_timestamp=true" -d "autoplay=true" -d "autoplay_volume=0.8"
curl: (7) Failed to connect to 127.0.0.1 port 7851 after 0 ms: Connection refused

Desktop (please complete the following information):
AllTalk was updated: N/A
Custom Python environment: no
Text-generation-webUI was updated: N/A

Additional context
I'm trying to use Python requests, which works fine with the ollama API but not with AllTalk for some reason. I have no clue why it is refusing.

erew123 commented 3 weeks ago

Hi @imnotliveyet

I would think this will be a firewalling or antivirus issue of some kind. I can only give you a bunch of things to look at and see what response you get from them.

From the Windows command prompt (NOT PowerShell):

curl http://localhost:7680 (test to see if curl can access Windows Update)


netstat -ano | findstr :7851 (With AllTalk running, check whether the Python process ID (PID) is listening on port 7851)


curl -X GET "http://127.0.0.1:7851/api/ready" (With AllTalk running, check if you can access the Ready API)


curl -X POST "http://127.0.0.1:7851/api/tts-generate" -d "text_input=All of this is text spoken by the character. This is text not inside quotes, though that doesnt matter in the slightest" -d "text_filtering=standard" -d "character_voice_gen=female_01.wav" -d "narrator_enabled=false" -d "narrator_voice_gen=male_01.wav" -d "text_not_inside=character" -d "language=en" -d "output_file_name=myoutputfile" -d "output_file_timestamp=true" -d "autoplay=true" -d "autoplay_volume=0.8" (If the Ready API responds, run the full generation curl command)
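
For anyone repeating these two API checks from Python rather than curl, here is a minimal sketch using only the standard library (urllib) instead of the third-party requests package. The endpoint paths and form fields are copied from the curl commands above; the helper names (build_ready_request, build_tts_request, call) and the timeout are illustrative assumptions, and the response is returned as raw text because its format is not shown in this thread.

```python
import urllib.error
import urllib.parse
import urllib.request

# Address and port taken from the curl commands in this thread.
BASE_URL = "http://127.0.0.1:7851"

# Same form fields as the curl -d arguments.
TTS_FIELDS = {
    "text_input": "All of this is text spoken by the character.",
    "text_filtering": "standard",
    "character_voice_gen": "female_01.wav",
    "narrator_enabled": "false",
    "narrator_voice_gen": "male_01.wav",
    "text_not_inside": "character",
    "language": "en",
    "output_file_name": "myoutputfile",
    "output_file_timestamp": "true",
    "autoplay": "true",
    "autoplay_volume": "0.8",
}


def build_ready_request(base_url=BASE_URL):
    """GET /api/ready, the same check as the curl -X GET command."""
    return urllib.request.Request(f"{base_url}/api/ready", method="GET")


def build_tts_request(fields, base_url=BASE_URL):
    """POST /api/tts-generate with a form-encoded body, as curl -d sends."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/tts-generate", data=body, method="POST"
    )


def call(request, timeout=120.0):
    """Send the request; return (status, text), or (None, reason) on failure."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as exc:
        # The server answered, but with an HTTP error status.
        return exc.code, exc.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError) as exc:
        # No HTTP response at all, e.g. "Connection refused".
        return None, str(exc)
```

Calling call(build_ready_request()) first and only then call(build_tts_request(TTS_FIELDS)) mirrors the order of the curl tests. A status of None means the connection never reached an HTTP server.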


Tests that can be performed in PowerShell

Get-NetFirewallRule | Where-Object {$_.DisplayName -like "*PowerShell*"} (Check that PowerShell has no rule in the Windows Firewall; it shouldn't).

Test-NetConnection -ComputerName localhost -Port 7851 (Perform a loopback test with Powershell).
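
The same kind of loopback test can be run from Python, which is useful when Python is the client that fails. This is a rough sketch with a hypothetical helper named probe; it only attempts a TCP connection and reports how the attempt ended, and the interpretation in the comments is a general rule of thumb rather than anything specific to AllTalk.

```python
import socket


def probe(host, port, timeout=2.0):
    """Try a TCP connection and describe the outcome as a short string."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"  # something accepted the connection
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that address and
        # port (or a firewall actively rejected the attempt).
        return "refused"
    except socket.timeout:
        # No answer at all, which often points at packets being dropped.
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"
```

Running probe("127.0.0.1", 7851) from the environment that fails (for example inside WSL) and again from the Windows command prompt shows whether both are talking to the same listener.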


I'm not sure whether you are running this in WSL, but if you are, you need to set up network access for it: https://learn.microsoft.com/en-us/windows/wsl/networking

I'd take a look at any antivirus or similar software you are running. Because 7851 is a non-standard port, any firewalling that software does may well interfere with it. You could temporarily disable your antivirus firewalling (if running a 3rd party antivirus) to check the effect. Most have a "disable for X minutes" type option.

erew123 commented 3 weeks ago

@imnotliveyet Just to be clear, AllTalk has no way of sending a "Connection refused", so the problem lies somewhere before AllTalk, e.g. not running the command on the same machine, antivirus/firewall, a networking issue etc.

spacedjames commented 3 weeks ago

All these tests worked fine for me, but every time I access the API from ST I get an "expected .wav file, only returned text" error, or something along those lines.

imnotliveyet commented 3 weeks ago

Sending the curl from cmd works as intended, but not from powershell, wsl, or a python request.

erew123 commented 3 weeks ago

@spacedjames Please ensure you have the correct extension version installed for the version of AllTalk you are running. If you are on V2 of AllTalk, please look in the System directory and the SillyTavern folder for instructions.

@imnotliveyet If you can reach AllTalk with curl from the CMD prompt, then that shows AllTalk is working as expected. As mentioned, there are no features within the AllTalk code to block or restrict access, though network issues can still prevent access.

First off, I would suggest using V2 of AllTalk as that launches on IP address 0.0.0.0 and binds to all possible network adapters, whereas V1 of AllTalk only binds to 127.0.0.1 (the local loopback address, meaning it is not network accessible).
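
To illustrate the difference between the two bind addresses, here is a generic socket sketch (not AllTalk's own server code). It opens a listener on the given address and checks which local addresses can reach it. The helper names are hypothetical, and lan_address is only a best-effort guess that returns None when the machine has no routable interface.

```python
import socket


def listen_on(bind_addr):
    """Open a TCP listener on bind_addr using an OS-chosen free port."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((bind_addr, 0))
    srv.listen(8)
    return srv


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def lan_address():
    """Best-effort non-loopback IPv4 address of this machine, or None."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            # connect() on a UDP socket only selects a route; no packet is sent.
            s.connect(("192.0.2.1", 9))
            addr = s.getsockname()[0]
    except OSError:
        return None
    return None if addr.startswith("127.") else addr


def reachability(bind_addr):
    """Map each local address to whether it reaches a listener on bind_addr."""
    with listen_on(bind_addr) as srv:
        port = srv.getsockname()[1]
        result = {"127.0.0.1": can_connect("127.0.0.1", port)}
        lan = lan_address()
        if lan is not None:
            result[lan] = can_connect(lan, port)
        return result
```

With a listener bound to 127.0.0.1, only the loopback entry is expected to succeed; with 0.0.0.0, the machine's other addresses should succeed as well, subject to any local firewall.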

Beyond that, WSL uses NAT, as mentioned in the Microsoft article linked above. NAT will more than likely stop you reaching AllTalk at 127.0.0.1: it is a local loopback address, and TCP/IP won't treat a NATted connection as coming from the local loopback. In effect, the WSL machine layer is its own computer, so it has its own 127.0.0.1, separate from that of the host OS.
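
As a rough illustration of addressing the Windows host from inside WSL instead of using 127.0.0.1, the sketch below reads the default gateway from the Linux routing table. It assumes WSL 2 in its default NAT mode, where the default route is expected to point at the Windows host; the function names are hypothetical. Per the explanation above, this can only help when the server listens on 0.0.0.0 (V2), since a V1 server bound to 127.0.0.1 will not accept connections arriving on that address.

```python
import socket
import struct


def default_gateway(route_table):
    """Return the default gateway from /proc/net/route text, or None."""
    for line in route_table.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 3:
            continue
        destination, gateway = fields[1], fields[2]
        # Destination 00000000 marks the default route.
        if destination == "00000000" and gateway != "00000000":
            # The kernel prints the address as little-endian hex.
            return socket.inet_ntoa(struct.pack("<L", int(gateway, 16)))
    return None


def wsl_host_ip(path="/proc/net/route"):
    """Read the routing table and return the likely Windows host address."""
    try:
        with open(path, encoding="ascii") as handle:
            return default_gateway(handle.read())
    except OSError:
        return None
```

The resulting address would replace 127.0.0.1 in the request URL when calling from WSL, with port 7851 unchanged.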

As for why powershell wouldn't have access, I can only think firewalling, as above.

As for Python requests, I'm not sure exactly what you mean by that. Have you written some Python code?

Either way, I would suggest using V2 of AllTalk, because it binds to 0.0.0.0 and has additional diagnostics built in, which let you see more clearly when the API was accessed and what information reached it.

imnotliveyet commented 3 weeks ago

That's strange, as ollama works fine when running it in wsl

erew123 commented 3 weeks ago

As mentioned, AllTalk V1 binds to 127.0.0.1. Ollama binds to 0.0.0.0. Reference here: https://github.com/ollama/ollama/blob/main/docs/faq.md#how-can-i-allow-additional-web-origins-to-access-ollama

In the Windows TCP/IP stack, the two addresses are handled differently: 127.0.0.1 accepts only loopback traffic from the same machine, whereas 0.0.0.0 listens on every network adapter, so connections to them route differently. This routing difference is mentioned in the WSL Microsoft document I linked above. Reference here: https://learn.microsoft.com/en-us/windows/wsl/networking#connecting-via-remote-ip-addresses

AllTalk version 1 binds to 127.0.0.1. Reference here: https://github.com/erew123/alltalk_tts/blob/main/confignew.json#L8 Changing that to 0.0.0.0 won't work, for a multitude of reasons, which is why V2 was changed.

AllTalk version 2 will bind to 0.0.0.0. Reference here: https://github.com/erew123/alltalk_tts/blob/alltalkbeta/tts_server.py#L1687

The reason for moving to 0.0.0.0 is given in the Breaking changes article I posted, https://github.com/erew123/alltalk_tts/issues/166, under "Why am I doing this?", which explains that the change was made in V2 because of routing issues with 127.0.0.1.