erew123 / alltalk_tts

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, but supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, narrator, model finetuning, custom models, and wav file maintenance. It can also be used with 3rd party software via JSON calls.
GNU Affero General Public License v3.0
816 stars 91 forks

Headless, invisible mode #160

Closed johnbenac closed 4 months ago

johnbenac commented 4 months ago

If you want a lightweight footprint on the computer where you are running it, beyond setting the wav files to delete after a day, you can also change:

process = subprocess.Popen(["python", script_path])

to

process = subprocess.run(["python", script_path], creationflags=subprocess.DETACHED_PROCESS)

in script.py.

The terminal window will then close, even though the program is still running.
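For reference, here is a minimal sketch of starting a script without a console window. It is not AllTalk's code: the function name launch_headless and the headless parameter are made up for illustration. It stays with subprocess.Popen because subprocess.run waits for the child to exit before returning, and it uses subprocess.CREATE_NO_WINDOW, which only exists on Windows, so other platforms take a different branch. The thread does not say whether this behaves as wanted inside AllTalk.

```python
import subprocess
import sys


def launch_headless(script_path, headless=True):
    """Start script_path in a child process without blocking the caller."""
    kwargs = {}
    if headless:
        if sys.platform == "win32":
            # CREATE_NO_WINDOW asks Windows not to create a console window for the child.
            kwargs["creationflags"] = subprocess.CREATE_NO_WINDOW
        else:
            # No console window to hide here; a new session detaches the child from the terminal.
            kwargs["start_new_session"] = True
        # Nothing will read the child's output, so discard it.
        kwargs["stdout"] = subprocess.DEVNULL
        kwargs["stderr"] = subprocess.DEVNULL
    # sys.executable keeps the child on the same interpreter as the parent.
    return subprocess.Popen([sys.executable, script_path], **kwargs)
```

The returned Popen object can be kept around so the parent can call terminate() on it later instead of relying on the task manager.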

You could add a command line argument, or some other way to set this flag, so that the user can run it without the window showing up. With this simple change, the program can be killed from the task manager, but you might add something that kills the process after a certain amount of time, or perhaps unloads the model if it hasn't been used in a while and then loads it again when someone wants to generate TTS.
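The idle-unload idea could look roughly like the sketch below. Everything in it is hypothetical: the class name IdleUnloadingModel, the loader callable, and the idle_seconds setting are placeholders, not anything AllTalk provides. The clock is injectable so the behaviour can be checked without waiting.

```python
import threading
import time


class IdleUnloadingModel:
    """Load a model on first use and drop it after a period of inactivity."""

    def __init__(self, loader, idle_seconds, clock=time.monotonic):
        self._loader = loader  # hypothetical callable that returns a loaded model
        self._idle_seconds = idle_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._model = None
        self._last_used = None

    @property
    def loaded(self):
        return self._model is not None

    def get(self):
        """Return the model, loading it first if it is not in memory."""
        with self._lock:
            if self._model is None:
                self._model = self._loader()
            self._last_used = self._clock()
            return self._model

    def unload_if_idle(self):
        """Drop the model if unused for idle_seconds; return True if it was dropped."""
        with self._lock:
            if self._model is None:
                return False
            if self._clock() - self._last_used < self._idle_seconds:
                return False
            self._model = None
            return True
```

A background timer would call unload_if_idle() every so often, and each TTS request would go through get(). With a real GPU model, freeing the memory may take more than dropping the reference.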

It might also be nice if there were no saved audio files on the machine running the TTS at all, ever. Would that be possible? For it just to send audio blobs, but not save files locally? Or delete a temp.wav file as soon as the audio file was sent?
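The "delete the temp file as soon as the audio is sent" variant could be sketched like this. It assumes a hypothetical write_wav(path) callable that writes the audio to the given path; render_to_bytes is an illustrative name, not an AllTalk function. The file exists only while the audio is being produced and read back.

```python
import os
import tempfile


def render_to_bytes(write_wav):
    """Call write_wav(path) to produce a wav file, return its bytes, then delete the file."""
    fd, path = tempfile.mkstemp(suffix=".wav")
    os.close(fd)  # the writer reopens the path itself
    try:
        write_wav(path)
        with open(path, "rb") as handle:
            return handle.read()
    finally:
        # Runs whether or not writing succeeded, so no wav is left behind.
        os.remove(path)
```

The returned bytes can be sent as the response body, so nothing remains on disk once the call returns.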

johnbenac commented 4 months ago

Uhh, my change did not work as intended. Never mind. I will post an update if I figure this out.

erew123 commented 4 months ago

Hi @johnbenac

Re: "It might also be nice if there were no saved audio files on the machine running the TTS at all, ever. Would that be possible? For it just to send audio blobs, but not save files locally? Or delete a temp.wav file as soon as the audio file was sent?"

The streaming API will not create or store wav files locally. The other API, which uses the narrator, has to store them locally. It's actually a feature of the TTS service and there is no way to change it. The narrator also relies on being able to create multiple wav files and then merge them into 1x wav file at the end of that process.
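To illustrate the merge step only, here is a sketch that joins several wav segments into one using Python's standard wave module. It is not how AllTalk or the narrator does it, and it works on in-memory bytes purely to keep the example self-contained; merge_wavs is a made-up name, and it assumes all segments share the same channel count, sample width, and frame rate.

```python
import io
import wave


def merge_wavs(segments):
    """Concatenate wav byte strings that share one audio format into a single wav."""
    if not segments:
        raise ValueError("at least one segment is required")
    output = io.BytesIO()
    params = None
    with wave.open(output, "wb") as merged:
        for segment in segments:
            with wave.open(io.BytesIO(segment), "rb") as part:
                fmt = part.getparams()[:3]  # (channels, sample width, frame rate)
                if params is None:
                    # The first segment decides the format of the merged file.
                    params = fmt
                    merged.setnchannels(fmt[0])
                    merged.setsampwidth(fmt[1])
                    merged.setframerate(fmt[2])
                elif fmt != params:
                    raise ValueError("segments must share channels, sample width and rate")
                merged.writeframes(part.readframes(part.getnframes()))
    return output.getvalue()
```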

Thanks