erew123 / alltalk_tts

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, but supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, a narrator, model finetuning, custom models, and wav file maintenance. It can also be used with third-party software via JSON calls.
GNU Affero General Public License v3.0

Ja language finetune doesn't work, but En works. RecursionError and PermissionError. #134

Closed Humppa124 closed 5 months ago

Humppa124 commented 5 months ago

Describe the bug: Japanese language finetuning doesn't work, but English finetuning does. I noticed there was already an issue about Ja language finetuning, but the errors and problems seem different, so I opened a new one.

To Reproduce: python finetune.py -> Step 1 (Whisper model small, Ja language, evaluation data 15, generate) -> Step 2 -> the issue occurs during Step 2.

Screenshots: [image] [image]

Last error in English: PermissionError: [WinError 32] The process cannot access the file because it is being used by another process

Desktop (please complete the following information): AllTalk was updated on 19.3.2024. Conda environment, custom install. Windows 10.

Additional context: Rebooted the PC to make sure no other processes were using the alltalk_tts files, to no avail. Clearing the tmp-trn folder doesn't help either.

Diagnostics log: diagnostics.log

erew123 commented 5 months ago

Hi @Humppa124

Your error message cuts off a little early. Can you either re-run it, or confirm how many files Whisper cut your dataset's audio files into?

Additionally, you say you have a custom Python environment set up here? I notice you are running the files from a path that has a space in the name, "c:\program files......". I can't say how much of an issue this may be, as we are dealing with a variety of training scripts that I didn't write. What I can say is that miniconda and conda don't like dealing with paths that have a space in them.

e.g. https://www.reddit.com/r/learnpython/comments/x3h2zq/why_does_some_python_software_not_allow_white/

https://docs.anaconda.com/free/working-with-conda/reference/faq/#installing-anaconda

Would you like to test installing AllTalk in a folder path such as c:\alltalk_tts and having its environment built using the atsetup.bat utility, where the environment is installed directly below c:\alltalk_tts, and see if that resolves the issue?
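As a quick way to check a candidate install folder before moving anything, here is a minimal sketch in Python. It is not part of AllTalk or conda; the function names `path_has_whitespace` and `check_install_dir` and the `C:\Program Files\example` path are hypothetical and only there for illustration. It just reports whether a path string contains whitespace.

```python
def path_has_whitespace(path: str) -> bool:
    """Return True if any character in the path string is whitespace."""
    return any(ch.isspace() for ch in path)


def check_install_dir(path: str) -> str:
    """Build a short human-readable verdict for a candidate install path."""
    if path_has_whitespace(path):
        return f"WARNING: {path!r} contains whitespace; conda may have trouble with it"
    return f"OK: {path!r} contains no whitespace"


if __name__ == "__main__":
    # Hypothetical example paths, not taken from the reporter's machine.
    print(check_install_dir(r"C:\Program Files\example"))
    print(check_install_dir(r"c:\alltalk_tts"))
```

If the first form of path is flagged, moving the install to something like c:\alltalk_tts is the simpler fix compared with trying to quote the path everywhere.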

Thanks

Humppa124 commented 5 months ago

Oops, I forgot about Conda not liking spaces in paths. alltalk_tts is now in a path containing no spaces. Now it "works", or at least I get a different error, but apparently it doesn't like my audio data. I threw a 7 min audio file at it and, after filtering, got 0 eval data. I changed to the large_v2 model. I think Whisper cut the audio into 7 new files or so, but I already cleared my dataset as I'm trying a longer file now. Atsetup.bat just quits after I choose the requirements and try to begin the install. Probably because I only have Python through conda?

[image] [image]

I will continue by trying either an even longer audio file or a different file, but creating a dataset out of longer audio files takes a really long time on my PC.

Humppa124 commented 5 months ago

Yeah, apparently it doesn't like my audio files. I will reopen this if further issues arise. Thanks for reminding me about Conda not liking spaces in paths.