oobabooga / text-generation-webui

A Gradio web UI for Large Language Models.
GNU Affero General Public License v3.0

More TTS!! #885

Closed da3dsoul closed 9 months ago

da3dsoul commented 1 year ago

Each TTS system has pros and cons. I'm going to build plugins for each of these to see how they perform, and since they'll be done, I may as well PR them.

https://github.com/coqui-ai/TTS - very good samples
https://github.com/neonbjb/tortoise-tts - also very good samples
https://github.com/CorentinJ/Real-Time-Voice-Cloning - custom voices? looks neat
https://github.com/rhasspy/larynx - very low-spec compatible, acceptable quality
https://github.com/TensorSpeech/TensorFlowTTS - very configurable from what I see

I care less about speed and more about quality, while some people might just want it to run with as little impact as possible.

Umm... we may need to rethink the UI layout for some things if all of these are actually accepted

ksylvan commented 1 year ago

I'm very interested in this. What's the status of this work?

FerLuisxd commented 1 year ago

Could also add edge-tts, which is free: https://pypi.org/project/edge-tts/
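
For reference, a minimal sketch of the package's documented Python usage (the voice name below is just an example):

```python
# Minimal edge-tts sketch based on the PyPI documentation.
# "en-US-AriaNeural" is only an example voice; run `edge-tts --list-voices`
# to see what is available.
import asyncio

import edge_tts


async def main():
    communicate = edge_tts.Communicate("Hello from the web UI!", "en-US-AriaNeural")
    await communicate.save("output.mp3")


asyncio.run(main())
```

Since synthesis runs on Microsoft's servers, it costs nothing locally but does require an internet connection.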

Cloudwalker2k3 commented 1 year ago

Ok, I added some descriptions to the model dropdown and made note of which ones need espeak-ng for Coqui. All of the English models are added. Due to how many models there are, a different kind of UI is necessary for all of the different languages.

No matter how I try to get TTS by MRQ working with the web UI, there is always some failure: it can't find the TTS processor, can't find modules, or can't even find Tortoise.

da3dsoul, is there a process you use to get this functionality working with the oobabooga web ui?

da3dsoul commented 1 year ago

da3dsoul, is there a process you use to get this functionality working with the oobabooga web ui?

I have an open PR, which I assume you are trying to use. As part of that PR, tts_preprocessor was moved. Check the changes tab. There's also an install script. You may need to install the original Tortoise and grab the voice models from there. Tortoise_Fast has that issue iirc

Cloudwalker2k3 commented 1 year ago

da3dsoul, is there a process you use to get this functionality working with the oobabooga web ui?

I have an open PR, which I assume you are trying to use. As part of that PR, tts_preprocessor was moved. Check the changes tab. There's also an install script. You may need to install the original Tortoise and grab the voice models from there. Tortoise_Fast has that issue iirc

I had this error come up:

Traceback (most recent call last):
  File "K:\AI\Pygmalion\installer_files\env\lib\site-packages\gradio\routes.py", line 414, in run_predict
    output = await app.get_blocks().process_api(
  File "K:\AI\Pygmalion\installer_files\env\lib\site-packages\gradio\blocks.py", line 1323, in process_api
    result = await self.call_function(
  File "K:\AI\Pygmalion\installer_files\env\lib\site-packages\gradio\blocks.py", line 1051, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "K:\AI\Pygmalion\installer_files\env\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "K:\AI\Pygmalion\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "K:\AI\Pygmalion\installer_files\env\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "K:\AI\Pygmalion\text-generation-webui\extensions\tortoise_tts\script.py", line 175, in toggle_text_in_history
    return chat_html_wrapper(shared.history['visible'], name1, name2, mode)
TypeError: chat_html_wrapper() missing 1 required positional argument: 'style'

da3dsoul commented 1 year ago

Presumably the branch became out of date. I'll update it
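
In the meantime, the TypeError above suggests the webui's chat_html_wrapper() grew an extra style parameter that the extension's callback isn't passing yet. A rough sketch of the likely fix, assuming the extension can obtain the style value the same way the built-in chat UI does (the parameter plumbing here is an assumption, not the final PR code):

```python
# Assumed fix for extensions/tortoise_tts/script.py.
# The newer chat_html_wrapper() takes an additional 'style' argument, so the
# callback has to accept and forward it. How 'style' reaches this callback
# (an extra gradio input, shared settings, etc.) is an assumption.
from modules import shared
from modules.html_generator import chat_html_wrapper


def toggle_text_in_history(name1, name2, mode, style):
    # ... existing logic that toggles the text in the visible history ...
    return chat_html_wrapper(shared.history['visible'], name1, name2, mode, style)
```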

Cloudwalker2k3 commented 1 year ago

Presumably the branch became out of date. I'll update it

Just checking in, are you still looking into getting Tortoise TTS fast or MRQ updated?

da3dsoul commented 1 year ago

I forgot. Hopefully I'll remember in the morning, as it is quite late

da3dsoul commented 1 year ago

Updated

Ph0rk0z commented 1 year ago

For extra fun: apply an RVC model over the TTS output. It might be faster and better to use some janky but QUICK TTS like Coqui and then throw a proper RVC (or even SVC) over it.

That way, instead of long generations from Tortoise, you just have a very fast TTS pass plus a short pitch-fixing inference, but still get a "clone" that talks like the character.

Just an idea.
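
A rough sketch of that pipeline, using Coqui TTS for the fast first pass; apply_rvc_conversion() is a hypothetical placeholder for whatever RVC/SVC inference entry point you end up wiring in:

```python
# Two-stage sketch: quick generic TTS, then voice conversion to the character.
# The Coqui calls below follow its public API; apply_rvc_conversion() is a
# hypothetical stub to be replaced with a real RVC (or SVC) inference call.
from TTS.api import TTS


def apply_rvc_conversion(wav_path: str, rvc_model_path: str) -> str:
    """Hypothetical: run RVC inference on wav_path with the given model and
    return the path of the converted audio."""
    raise NotImplementedError


def speak_as_character(text: str, rvc_model_path: str) -> str:
    # Stage 1: fast, generic synthesis with a lightweight Coqui model.
    tts = TTS("tts_models/en/ljspeech/tacotron2-DDC")
    base_wav = "base_output.wav"
    tts.tts_to_file(text=text, file_path=base_wav)

    # Stage 2: convert the generic voice into the target character's voice.
    return apply_rvc_conversion(base_wav, rvc_model_path)
```

The appeal is that the slow, quality-sensitive Tortoise generation is replaced by cheap generic synthesis, while the character's voice identity comes from the much faster conversion pass.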

CRCODE22 commented 1 year ago

The coqui_tts extension does not work anymore with the latest text-generation-webui. Does anyone know how to fix it?

2023-09-28 22:24:41 ERROR:Failed to load the extension "coqui_tts".
Traceback (most recent call last):
  File "K:\text-generation-webui\modules\extensions.py", line 36, in load_extensions
    exec(f"import extensions.{name}.script")
  File "<string>", line 1, in <module>
  File "K:\text-generation-webui\extensions\coqui_tts\script.py", line 7, in <module>
    from modules import chat, shared, tts_preprocessor
ImportError: cannot import name 'tts_preprocessor' from 'modules' (unknown location)

da3dsoul commented 1 year ago

I would not be surprised if it takes more than this, but that file is moved to the modules folder in the PR.

CRCODE22 commented 1 year ago

I fixed some of the other issues but now these remain that I have been unable to fix yet.

2023-09-29 01:29:13 ERROR:Failed to load the extension "coqui_tts".
Traceback (most recent call last):
  File "K:\text-generation-webui\modules\extensions.py", line 36, in load_extensions
    exec(f"import extensions.{name}.script")
  File "<string>", line 1, in <module>
  File "K:\text-generation-webui\extensions\coqui_tts\script.py", line 71, in <module>
    model, speaker, language = load_model()
  File "K:\text-generation-webui\extensions\coqui_tts\script.py", line 56, in load_model
    tts = TTS(params['model_id'], gpu=params['Cuda'])
TypeError: 'module' object is not callable

da3dsoul commented 1 year ago

Can you gist your updated script.py and link it? I'll take a look and see if I can spot the error. Edit: you can also PR against my fork if you want to contribute to updating it

chigkim commented 1 year ago

Has anyone tried incorporating Piper? Extremely fast and decent quality! https://github.com/rhasspy/piper

Xanw1ch commented 11 months ago

I fixed some of the other issues but now these remain that I have been unable to fix yet.

2023-09-29 01:29:13 ERROR:Failed to load the extension "coqui_tts".
Traceback (most recent call last):
  File "K:\text-generation-webui\modules\extensions.py", line 36, in load_extensions
    exec(f"import extensions.{name}.script")
  File "<string>", line 1, in <module>
  File "K:\text-generation-webui\extensions\coqui_tts\script.py", line 71, in <module>
    model, speaker, language = load_model()
  File "K:\text-generation-webui\extensions\coqui_tts\script.py", line 56, in load_model
    tts = TTS(params['model_id'], gpu=params['Cuda'])
TypeError: 'module' object is not callable

I had the same error. It's specific to Coqui-AI. It appears there's another module, text-generation-webui-main\installer_files\env\Lib\site-packages\TTS\__init__.py, that the name TTS resolves to AFTER def load_model() is called in the extension's script.py.

My solution was to use "as" for the import: from TTS.api import TTS as Coqui_TTS #line 10

And then call tts = Coqui_TTS(params['model_id'], gpu=params['Cuda']) #In line 56

Seems to work that way for me.
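
Put together, the change looks roughly like this (line numbers are approximate and refer to the extension's script.py; params is the extension's existing settings dict):

```python
# Approximate excerpt of extensions/coqui_tts/script.py with the fix applied.
# Aliasing the import avoids the bare name TTS resolving to the installed
# TTS package (a module, hence "'module' object is not callable").
from TTS.api import TTS as Coqui_TTS  # around line 10


def load_model():
    # around line 56: call the aliased class instead of the shadowed name
    tts = Coqui_TTS(params['model_id'], gpu=params['Cuda'])
    ...  # rest of load_model() unchanged
```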

CRCODE22 commented 11 months ago

Can you gist your updated script.py and link it? I'll take a look and see if I can spot the error. Edit: you can also PR against my fork if you want to contribute to updating it

Sorry for my slow reply; I only just saw yours. I have forked your repository and will try to update it to the latest oobabooga original repository, since yours is outdated, and see if I can make it work. It would help, however, if you could sync yours with the original repository, so that people like me can do our best to solve problems. I would love for Coqui TTS to work with the latest oobabooga.

CRCODE22 commented 11 months ago

I fixed some of the other issues but now these remain that I have been unable to fix yet.

2023-09-29 01:29:13 ERROR:Failed to load the extension "coqui_tts".
Traceback (most recent call last):
  File "K:\text-generation-webui\modules\extensions.py", line 36, in load_extensions
    exec(f"import extensions.{name}.script")
  File "<string>", line 1, in <module>
  File "K:\text-generation-webui\extensions\coqui_tts\script.py", line 71, in <module>
    model, speaker, language = load_model()
  File "K:\text-generation-webui\extensions\coqui_tts\script.py", line 56, in load_model
    tts = TTS(params['model_id'], gpu=params['Cuda'])
TypeError: 'module' object is not callable

I had the same error. It's specific to Coqui-AI. It appears there's another module, text-generation-webui-main\installer_files\env\Lib\site-packages\TTS\__init__.py, that the name TTS resolves to AFTER def load_model() is called in the extension's script.py.

My solution was to use "as" for the import: from TTS.api import TTS as Coqui_TTS #line 10

And then call tts = Coqui_TTS(params['model_id'], gpu=params['Cuda']) #In line 56

Seems to work that way for me.

I have made a fork. I do not understand exactly what you wrote; could you make the changes within my fork? That way I can see what you mean and whether it works.

Thank you for replying :)

github-actions[bot] commented 9 months ago

This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.