Closed C0rn3j closed 2 weeks ago
Hi @C0rn3j
You mean the download when in the TTS generator? That is the only part that uses en-core-web-md, though I cannot see any way that enabling the narrator would have any impact on this. narrator_enabled is nothing more than a system variable that applies specifically to the text-generation-webui (TGWUI) extension/interface https://github.com/oobabooga/text-generation-webui
It's not used for start-up and it's not used by the TTS generator.
Beyond that, the only narrator_enabled code is in the API request handling on this line https://github.com/erew123/alltalk_tts/blob/main/tts_server.py#L1059 (I'm assuming you are talking about AllTalk v1), and it has no other impact throughout AllTalk and no interaction with the TTS generator.
en-core-web-md (spaCy) is only used by the TTS generator to analyse the generated text against what Whisper can read back. It's imported in tts_diff https://github.com/erew123/alltalk_tts/blob/main/system/tts_diff/tts_diff.py#L89 when analysis is called.
So I can see no way narrator_enabled would impact start-up bar a damaged JSON file.
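One way to rule out the damaged-JSON case is a small check like the one below. This is only a sketch: it assumes the settings are stored as a JSON object with a top-level narrator_enabled key, and check_config is a hypothetical helper name, not something that exists in AllTalk.

```python
import json


def check_config(text: str) -> tuple[bool, str]:
    """Return (ok, message) for the contents of a JSON settings file."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        # A stray trailing comma or a missing quote ends up here.
        return False, f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
    if not isinstance(data, dict):
        return False, "top-level value is not a JSON object"
    value = data.get("narrator_enabled")
    if not isinstance(value, bool):
        # JSON true/false become Python bool; "true" in quotes would be a str.
        return False, f"narrator_enabled should be true or false, got {value!r}"
    return True, "config parses and narrator_enabled is a boolean"
```

Reading the config file and passing its text to check_config before launch would turn a silent failure caused by a hand-edited file into a readable message.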
As for en-core-web-md, are you saying there is an issue downloading it when using the TTS Generator?
Thanks
I am using v1 (with the changes in the PR I sent). Setting "narrator_enabled": true on a clean Docker container results in a failure to start up properly, silently, with no errors.
Setting "narrator_enabled": false (the default), starting, running Analyze TTS (which downloads the models/deps), and then switching it to true and restarting works fine.
> You mean download when in the TTS generator? As that is the only bit that uses the en-core-web-md.
Yep, it downloads there just fine when the UI is already running.
Unless you are running the environment with Text-generation-webui, there is no benefit/need to have narrator_enabled: true; it's literally the flag/setting for this checkbox https://raw.githubusercontent.com/erew123/screenshots/main/textgensettings.jpg "Narrator Enabled" in the TGWUI extension.
Are you running TGWUI in the same docker environment?
The only code that would touch that variable would be TGWUI pulling in the def ui during its start-up as an extension:
https://github.com/erew123/alltalk_tts/blob/main/script.py#L855
and then, if that Gradio code is running in TGWUI, it would update the radio button accordingly:
https://github.com/erew123/alltalk_tts/blob/main/script.py#L956
So I am absolutely baffled as to why this would have any effect on start-up, as that portion of code wouldn't even be touched if you aren't using AllTalk as a TGWUI extension, and even then it shouldn't matter.
Is there literally nothing shown on screen when that setting is flagged true that shows any start-up progress? Do you want to try uninstalling Gradio on the Docker build and see if that has any impact?
> Unless you are running the environment with Text-generation-webui, there is no benefit/need to have narrator_enabled: true, it's literally the flag/setting for this checkbox https://raw.githubusercontent.com/erew123/screenshots/main/textgensettings.jpg "Narrator Enabled" in the TGWUI extension.
Aha, I had no idea and thought I needed it for the API.
> Are you running TGWUI in the same docker environment?
Not running it at all.
> Is there literally nothing shown on screen when that setting is flagged true, that shows any start-up progress?
When I tried to docker exec into the container and start up uvicorn (or whatever it's named) and then script.py, script.py simply stopped at a line that said "Detected docker env, waiting."
I have peeked at the script.py but did not quickly see how it should have continued.
Hi @C0rn3j, I didn't build the V1 Docker version. In V1, launch.sh starts the script https://github.com/erew123/alltalk_tts/blob/main/launch.sh, which is where the pause occurs. Why the person who did it set it up this way, I don't know.
It seems to only download the necessary dependencies when triggered from the Web UI, but if one provides a config where narrator is already enabled (in my case I am updating the deps at the moment and constantly rebuilding), it will silently fail to launch.
There should be some detection in the UI launch to either download the dep outright if narrator is enabled, or not to load it if the dep is missing and it is enabled.
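A rough sketch of what such a detection step could look like follows. It rests on several assumptions: that the missing spaCy model really is what blocks start-up (not confirmed in this thread), that the model is importable as en_core_web_md, and that the function names here are hypothetical and not part of AllTalk. The download uses spaCy's own `python -m spacy download` command.

```python
import importlib.util
import subprocess
import sys

MODEL_PACKAGE = "en_core_web_md"  # import name of the spaCy model package


def model_installed(package: str = MODEL_PACKAGE) -> bool:
    """True if the package can be found without actually importing it."""
    return importlib.util.find_spec(package) is not None


def startup_action(narrator_enabled: bool, installed: bool,
                   auto_download: bool = True) -> str:
    """Decide what to do about the model at launch (hypothetical policy)."""
    if installed:
        return "load"
    if not narrator_enabled:
        # Nothing needs the model yet; keep the on-demand download.
        return "skip"
    # Enabled but missing: fetch it now, or carry on without it.
    return "download" if auto_download else "skip"


def download_model(package: str = MODEL_PACKAGE) -> None:
    """Fetch the model via spaCy's CLI; raises CalledProcessError on failure."""
    subprocess.run([sys.executable, "-m", "spacy", "download", package], check=True)
```

At launch, something like `startup_action(narrator_enabled, model_installed())` would pick one of the two behaviours suggested above, and any failure in download_model would surface as an exception instead of a silent stall.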
Another option would be to add the en-core-web-md==3.7.1 dependency to requirements directly instead of only downloading it on demand.