erew123 / alltalk_tts

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, however supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, narrator, model finetuning, custom models, wav file maintenance. It can also be used with 3rd Party software via JSON calls.
GNU Affero General Public License v3.0

i wonder when there will be a tutorial and a discord for alltalktts #420

Closed. quakeman00 closed this issue 3 days ago

quakeman00 commented 5 days ago

There is not a single complete AllTalkTTS tutorial on the internet; they all go from steps 1 and 2 straight to 6, 7, 8.

All the steps in between are unknown and nobody addresses them. Ctrl+F on a page full of documentation returns 0 results, so if the problem is addressed, whoever wrote about it didn't use the words from the error message. I've checked the known issues, the known error messages and even Reddit, but all I get is an error message telling me that I need to put some winter wheels on my car or something:

fairseq-0.12.4-cp311-cp311-win_amd64.whl is not a supported wheel on this platform

And there is no LIVE communication with the community, be it Discord or something similar where people can talk to each other, which makes it even harder to reach anyone who might know wtf is going on and get a response in a timely manner.

It's my third time attempting to install it, and every time I run into Python-related issues, because apparently most open source text-to-speech programs use the same structure that just doesn't work. With open source image generation and text generation projects on GitHub you push a button and it just works; it's as simple as double clicking. But this, Coqui, AllTalk, Whisper and so on all require you to install 21478162 things, and then the guides go from "download" to "now open it", skipping all the steps in between, where errors and vague language leave new people who don't know much about how this works with nowhere to go.

erew123 commented 5 days ago

@quakeman00 Have you read my statement on Support and why I am restricted/cannot spend all my time supporting people?

Do you mean like these on the Wiki? https://github.com/erew123/alltalk_tts/wiki/Install-%E2%80%90-Standalone-Installation#quick-setup---video-guide

And the new version has very clear documentation on every page:

image

What exactly are you installing into/which guide are you following? If you are following the Standard installation, you shouldn't get any messages about "fairseq-0.12.4-cp311-cp311-win_amd64.whl is not a supported wheel on this platform". That would typically come up because you are installing it in a non Python 3.11 environment OR trying to install into a non-Windows environment (the `cp311` tag in the file name means CPython 3.11 and `win_amd64` means 64-bit Windows). You are welcome to post the full error and explain exactly what you are doing, and I'll help you through it.
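To see which side of that mismatch applies, the tags in the wheel's file name can be compared with the interpreter that is actually running. The following is only a rough sketch: the function names are made up for illustration, it assumes a CPython interpreter, and real pip accepts a whole list of compatible tags rather than the single pair checked here.

```python
import sys
import sysconfig


def parse_wheel_tags(filename):
    """Split a wheel filename into its python, ABI and platform tags."""
    stem = filename[:-len(".whl")] if filename.endswith(".whl") else filename
    # Wheel names end with: {python tag}-{abi tag}-{platform tag}
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def current_tags():
    """Approximate the tags of the interpreter running this script (CPython assumed)."""
    python_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    # sysconfig reports e.g. "win-amd64"; wheel tags use underscores instead.
    platform_tag = sysconfig.get_platform().replace("-", "_").replace(".", "_")
    return python_tag, platform_tag


if __name__ == "__main__":
    wheel = "fairseq-0.12.4-cp311-cp311-win_amd64.whl"
    wanted_python, _, wanted_platform = parse_wheel_tags(wheel)
    have_python, have_platform = current_tags()
    print(f"wheel needs : {wanted_python} on {wanted_platform}")
    print(f"interpreter : {have_python} on {have_platform}")
    if (wanted_python, wanted_platform) != (have_python, have_platform):
        print("This interpreter would likely reject the wheel as unsupported.")
```

If the second line printed shows, say, `cp313`, then the `python` that the command prompt finds first is not the 3.11 install, even if 3.11 is also present on the machine.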

quakeman00 commented 4 days ago

I am not asking you alone to support literally everyone. Plenty of people have been through the installation and know the fixes, and the lack of an open LIVE community restricts people's ability to share information and fix each other's problems without you needing to personally intervene when you don't have time.

As for the quick setup installation page and the wiki page, I am still unable to run it after following them to the letter. There's red everywhere in the installation output, and I have Python 3.13 and 3.11, yet the installation still spits out Python errors. I just don't know why people need to manually install all these prerequisites and all the components using a console, instead of being offered both this installation AND a pre-compiled, pre-installed zipped folder that just works when double clicked, so that if an installation conflict happens, the zip would still work.

erew123 commented 4 days ago

Hi @quakeman00 Please glance over the quick intro to Python environments; it may explain that to you, and it is a 2 minute read.

Because AllTalk is installing so many requirements for so many TTS engines, the requirements have to be carefully managed, as other software may upgrade/downgrade packages, causing instability, conflicts and all sorts of unknown situations. This is unfortunately the way Python is.

One reason you don't hand out zip files is the size they would need to be, e.g. here is the Python environment folder that atsetup builds for AllTalk (10GB):

image

The other reason is that, because of the open nature of Python code and packages, the Python ecosystem settled on pip as the package manager, which is a messy sod at times when dealing with conflict resolution. Down the chain of AllTalk there is probably code from 1,000-2,000 other projects, and pip has to figure out what works with what, and what to do if one of those projects updates/changes its code because of a bug or security issue (this is called dependency resolution). In short, it's a complicated mess, but it is the way Python code/scripts work.
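As a small illustration of what a single dependency rule looks like, here is a sketch of the "compatible release" operator (`~=`) that packages use to pin their requirements, e.g. `jsonlines~=1.2.0`. This is a simplified, illustrative re-implementation that only handles plain numeric versions; it is not pip's own code, and the function names are made up for the example.

```python
def parse_version(text):
    """Turn '1.2.0' into (1, 2, 0). Only plain numeric versions are handled."""
    return tuple(int(part) for part in text.split("."))


def satisfies_compatible_release(installed, required):
    """Check `installed` against a `~=required` specifier.

    `~=1.2.0` means: at least 1.2.0, and the same leading components
    once the last one is dropped (so any 1.2.x, but not 1.3 or 4.0).
    """
    have = parse_version(installed)
    want = parse_version(required)
    if len(want) < 2:
        raise ValueError("~= needs at least two version components")
    prefix = want[:-1]
    # Pad with zeros so that '1.2' and '1.2.0' compare as equal.
    width = max(len(have), len(want))
    have_padded = have + (0,) * (width - len(have))
    want_padded = want + (0,) * (width - len(want))
    return have_padded >= want_padded and have_padded[:len(prefix)] == prefix


if __name__ == "__main__":
    for installed in ("1.2.0", "1.2.5", "4.0.0"):
        ok = satisfies_compatible_release(installed, "1.2.0")
        print(f"jsonlines {installed} vs ~=1.2.0 -> {'ok' if ok else 'conflict'}")
```

Multiply that one rule by every package in the chain and you get the resolution problem pip has to solve on each install.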

The utility atsetup.bat or atsetup.sh will create its own custom Python environment. It should be run from a normal command prompt or terminal, not from inside an already activated Python virtual environment, just a plain old command prompt, e.g. it should look exactly like this when you start out:

image
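If you are not sure whether your prompt already has an environment active, a quick check along these lines can tell you. This is just a sketch you could run yourself, not something atsetup does; the function name is hypothetical, and it relies on the `VIRTUAL_ENV` and `CONDA_DEFAULT_ENV` variables that venv/virtualenv and conda normally set on activation.

```python
import os
import sys


def active_environment(environ=None, prefix=None, base_prefix=None):
    """Return a short description of the active Python environment, or None."""
    environ = os.environ if environ is None else environ
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix

    # Set by the activate scripts of venv/virtualenv.
    if environ.get("VIRTUAL_ENV"):
        return f"virtualenv at {environ['VIRTUAL_ENV']}"
    # Set by conda when an environment (including "base") is active.
    if environ.get("CONDA_DEFAULT_ENV"):
        return f"conda environment '{environ['CONDA_DEFAULT_ENV']}'"
    # Inside a venv, sys.prefix points at the venv, not the base install.
    if prefix != base_prefix:
        return f"virtual environment at {prefix}"
    return None


if __name__ == "__main__":
    found = active_environment()
    if found:
        print(f"An environment is already active: {found}")
    else:
        print("Plain prompt, no Python environment active.")
```

A plain prompt should report that no environment is active.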

Over the next few days I will be updating atsetup.bat and atsetup.sh to not only log their progress, but also report a little more on any errors/issues they encounter.

You're welcome to send me a screenshot of an error or copy/paste the text; typically this would be, say, the final 10-20 lines of the error.

Re Discord or similar.

I would still have an overhead to manage it at a base level. Hopefully some other people will come on board one day and want to take on that responsibility, but I just have too much on my plate at the moment. So the slower pace and one central point of contact allows me to deal with things a bit easier. However, people do respond on the Discussions forums, so you can always post there.

Let me know.

Thanks

quakeman00 commented 4 days ago

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
gruut 2.4.0 requires jsonlines~=1.2.0, but you have jsonlines 4.0.0 which is incompatible.

Requirement already satisfied: pure-eval in c:\users\dispenser\appdata\local\programs\python\python311\lib\site-packages (from stack-data->ipython->descript-audiotools@ git+https://github.com/descriptinc/audiotools->parler_tts==0.2.1->-r system\requirements\requirements_parler.txt (line 1)) (0.2.3)
Faiss
The system cannot find the path specified.
FFmpeg
The system cannot find the path specified.

   There was an error installing the requirements.
   Have you started your Text-gen-webui Python environment
   with cmd_{yourOS} before running atsetup.bat?
   Press any key to return to the menu.

No idea what to do from here. I try to run the environment but the console closes itself automatically, and as for FFmpeg, I already have it.

erew123 commented 4 days ago

Hi @quakeman00

I think I have it figured out based on the above. I am going to make some assumptions from that error, which are:

1) You are wanting to use AllTalk with Text-generation webui
2) You have copied AllTalk into your ...\text-generation-webui\extensions\alltalk_tts\ folder


Installing AllTalk

Whether or not the 2nd one is correct, I'll tell you the fix and then I will tell you the why.

You want to go from a folder layout looking something like this:

C:\
├── AI
│    ├── models
│    ├── SillyTavern
│    ├── stuff
│    └── text-generation-webui     # This is where TGWUI is currently installed.
├── boot
├── Program files
├── Windows
etc.....

image

to looking like this:

C:\
├── AI
│    ├── alltalk_tts    # This is our newly cloned alltalk_tts folder NOT inside another applications folder
│    ├── models
│    ├── SillyTavern
│    ├── stuff
│    └── text-generation-webui
├── boot
├── Program files
├── Windows
etc.....
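To double check where the clone ended up, something like the following sketch would flag a copy that sits inside a TGWUI folder. The function name is hypothetical and the match on folder names starting with `text-generation-webui` is an assumption meant to also catch zip downloads with a suffix; adjust the paths to your own layout.

```python
from pathlib import PureWindowsPath


def is_inside_tgwui(folder):
    """True if `folder` sits somewhere under a text-generation-webui directory."""
    parts = [part.lower() for part in PureWindowsPath(folder).parts]
    # Only ancestors are checked (parts[:-1]), not the folder itself.
    return any(part.startswith("text-generation-webui") for part in parts[:-1])


if __name__ == "__main__":
    for candidate in (
        r"C:\AI\alltalk_tts",
        r"C:\AI\text-generation-webui\extensions\alltalk_tts",
    ):
        verdict = "inside TGWUI, move it out" if is_inside_tgwui(candidate) else "fine for Standalone"
        print(f"{candidate} -> {verdict}")
```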

Once you have cloned the alltalk_tts folder and moved into that folder in your command prompt, you will run atsetup.bat and choose Standalone Installation

image

And then Install AllTalk as a Standalone Application

image

This will create a full custom python environment and do all the things necessary. It should run through to the end without any problems and you should get a screen at the end of that installation which says something like this:

image

That will be AllTalk installed as a standalone.

When you want to use AllTalk, you will go into its folder and run `start_alltalk.bat`


Getting AllTalk working with Text-generation-webui

Now that AllTalk is running, I am assuming you want it working with Text-generation-webui? To do that you will follow the installation instructions for Text‐generation‐webui Remote Extension

This will install a small extension into TGWUI, that has all the functionality etc.


What has the issue been

When atsetup runs the "Text Generation Webui" install rather than the "Standalone Installation", it will NOT create a custom Python environment, because you would be installing into TGWUI's custom Python environment.

Because the package versions required by TGWUI's custom Python environment have moved so far ahead, installing AllTalk's main installation directly into that environment would damage it.

I reference this issue https://github.com/erew123/alltalk_tts/wiki/Install-%E2%80%90-Text%E2%80%90generation%E2%80%90webui-Installation#read-before-installing-into-tgwuis-python-environment

Here https://github.com/erew123/alltalk_tts/wiki/Install-%E2%80%90-Text%E2%80%90generation%E2%80%90webui-Installation#quick-setup---text-generation-webui-installations

And here https://github.com/erew123/alltalk_tts/issues/377


Hopefully that will answer your question and get you up and running. If so, please feel free to close the ticket.

Once you are up and running, you can check things like the quick start guide to help you through any initial questions https://github.com/erew123/alltalk_tts/wiki

Thanks

quakeman00 commented 4 days ago

Still doesn't work. The first part, the standalone install, works fine and never was the issue, but the text web UI install does not, and it keeps giving the same error message.

erew123 commented 4 days ago

This error message here???

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
gruut 2.4.0 requires jsonlines~=1.2.0, but you have jsonlines 4.0.0 which is incompatible.

Requirement already satisfied: pure-eval in c:\users\dispenser\appdata\local\programs\python\python311\lib\site-packages (from stack-data->ipython->descript-audiotools@ git+https://github.com/descriptinc/audiotools->parler_tts==0.2.1->-r system\requirements\requirements_parler.txt (line 1)) (0.2.3)
** Faiss **
The system cannot find the path specified.
** FFmpeg **
The system cannot find the path specified.

   There was an error installing the requirements.
   Have you started your Text-gen-webui Python environment
   with cmd_{yourOS} before running atsetup.bat?
   Press any key to return to the menu.

where at the bottom of the error message it says:

with cmd_{yourOS} before running atsetup.bat?

This is the error message you are getting still?

erew123 commented 4 days ago

And you are telling me that the AllTalk standalone installation has worked? And you can now start AllTalk, so you get this screen?

image

quakeman00 commented 4 days ago

image

Yep, that always worked fine.

It's just the installation of the web UI that doesn't work,

and start_environment.bat doesn't work either; the console closes itself.

erew123 commented 4 days ago

Ok. At no point should you be using that menu option:

image

I assume you DID remove the AllTalk folder that you had previously created? (from my instructions above):

image

And then you install the TGWUI remote extension (from my instructions above):

image

Please follow these instructions for the TGWUI remote extension (from my instructions above):

image

Thanks

quakeman00 commented 4 days ago

This is it, this is what I'm talking about with the missing steps. I did 2 fresh installs to make sure, and now I'm being told that I should not use the first option in atsetup but should move files into an existing text generation web UI folder with a bunch of other files in it... When does this folder spawn? How do I make or get it? The instructions from the link you used also assume the folder exists somewhere.

image

???? There are so many steps missing. I got AllTalk standalone installed and working, but the "move files into this existing folder ... somewhere..." part is where I'm lost.

erew123 commented 4 days ago

I don't understand what you mean by where does it spawn. The instructions for the TGWUI Remote extension tell you to create that folder:

image

In the TGWUI Instructions it EXPLICITLY says what to do and links to the relevant sections: https://github.com/erew123/alltalk_tts/wiki/Install-%E2%80%90-Text%E2%80%90generation%E2%80%90webui-Installation#read-before-installing-into-tgwuis-python-environment

image

It repeats that instruction, on the same page to draw your attention to it:

image

Nowhere in the instructions for Standalone installation OR the TGWUI Remote Extension does it tell you to go back into the installation utility atsetup and go to the TGWUI menu.

Do you think I need to explicitly caveat that?

quakeman00 commented 4 days ago

This one, where did this one come from?

image

erew123 commented 4 days ago

If you are using Text-generation-webui from Oobabooga https://github.com/oobabooga/text-generation-webui that is the standard folder name it will create when you follow their installation instructions. https://github.com/oobabooga/text-generation-webui?tab=readme-ov-file#how-to-install

erew123 commented 4 days ago

If you are NOT using Text-generation-webui, then you wouldn't follow the instructions for that.

quakeman00 commented 4 days ago

I was looking to use it with SillyTavern and on its own to try to generate voice stuff, and was told that this was enough on its own. I didn't know I needed to use another web UI system.

erew123 commented 4 days ago

You don't need another one! If you just want to use it with SillyTavern, you only need to install it as a Standalone.

https://github.com/erew123/alltalk_tts/wiki/Install-%E2%80%90-Standalone-Installation

TGWUI is a completely different thing. Nowhere in the Standalone instructions does it say to install the TGWUI requirements, or to install it in Google Colab, or Docker etc.

This is an options list from which you pick the option that matches your requirements:

image

If all you want is for AllTalk to work with SillyTavern, and you have whatever backend loader for your AI model (OpenAI, Claude, Kobold, TGWUI, whatever), you can just install AllTalk as a Standalone install. It will speak to SillyTavern, and SillyTavern will speak to your AI model however you set that up.

AllTalk is used across 50+ different projects on the internet, it's not just for SillyTavern, so I cannot caveat every individual person's custom scenario along the lines of "if you have XX and also YY and want ZZ, then you need to install it the ABC way". Unfortunately it is far too complicated for me to cover every scenario for every person.

I will take on board what you say though, and what you found difficult, and think about how to revisit it. I did check the SillyTavern documentation, and it does state the choice https://docs.sillytavern.app/extensions/alltalk/ with explicit instructions for SillyTavern and links to the correct AllTalk instructions as you need them.

image

Are you at least up and running now?

quakeman00 commented 4 days ago

At work right now, I'll try when I return in 3 hours.

quakeman00 commented 3 days ago

image

I got the oobabooga web UI to open, but following the instructions for copying the files into a new alltalk folder left me with this error message.

quakeman00 commented 3 days ago

I have some voice files I found, .pth and .index files. Where do I put them, or the character's folder? Or does it generate a voice from the sample every time?

erew123 commented 3 days ago

Please go to this link and click the download (as shown in the image below): https://github.com/erew123/alltalk_tts/blob/alltalkbeta/system/TGWUI_Extension/script.py

image

You want to save that over the top of (replacing) the other script.py file in c:\Users\DISPENSER\Desktop\genstuff_00\text-generation-webui-main\extensions\alltalk_tts\ so replace the script.py in that location with the one you just downloaded.

That will solve your first problem.

As for your second problem, have you looked at the quick start guide?: https://github.com/erew123/alltalk_tts/wiki/AllTalk-V2-QuickStart-Guide

Each TTS engine has various locations for different things and there is help throughout the whole of the Gradio interface. Typically most voices are stored in the voices folder.

As mentioned in the quickstart guide, you can Ctrl+left click on the clickable links to open them:

image

Each TTS engine has its own help page/guide e.g.

image

And pretty much every single page in the interface has an expandable help section detailing what is on that page and how to use it:

image

Beyond that, the Wiki is full of lots of other information.

quakeman00 commented 3 days ago

OH, so Gradio IS THE web UI that came with it. Thank you, now this makes sense. Thank you very much, this is what I was looking for.

quakeman00 commented 3 days ago

Once again I'm hitting a wall of missing info, and once again this is where a live community that knows what's going on would be helpful. I ran the fine-tuning test and it passed everything except the base model. To run XTTS, I downloaded it from Hugging Face and put it in the xtts folder; it seems like it's the wrong file (model.pth)? https://huggingface.co/coqui/XTTS-v2/tree/main

image

I got all these files from Hugging Face

image

and yes, I restarted AllTalk to make sure it wasn't an issue of it not reading them.

erew123 commented 3 days ago

All TTS engines are managed in the central interface. You should use the download tab for each engine e.g.

image

quakeman00 commented 3 days ago

image

It's finally running, after a week of trying to figure it out. Thank you for putting up with my dumbass, and sorry if I missed instructions that seemed obvious from the perspective of someone who knows what they're doing. It's finally working.