daswer123 / xtts-webui

Webui for using XTTS and for finetuning it
MIT License

DeepSpeed isn't optional #38

Open xzuyn opened 9 months ago

xzuyn commented 9 months ago

Even though DeepSpeed has its own argument, you cannot run the program unless DeepSpeed is installed. Even after removing the DeepSpeed imports, the program still tries to use DeepSpeed, so you never make it to the UI.

This means ROCm users are left out. I think DeepSpeed support may have been added in ROCm 6.0, but it's definitely not in ROCm 5.7 or below. The rest of the program likely works, since xtts-api and other projects run fine on ROCm. DeepSpeed is the only thing preventing it from working.

If you could please make it so that DeepSpeed only gets imported when it's needed, and only if --deepspeed is used, I would be very grateful.
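A minimal sketch of what such a conditional import could look like. This is not the actual xtts-webui code: the helper names `parse_args` and `load_deepspeed` are hypothetical, and only the `--deepspeed` flag name comes from the request above.

```python
import argparse
import importlib
import importlib.util


def parse_args(argv=None):
    """Parse the command line; --deepspeed is off unless explicitly passed."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--deepspeed", action="store_true",
                        help="enable DeepSpeed (requires the deepspeed package)")
    return parser.parse_args(argv)


def load_deepspeed(enabled):
    """Import deepspeed only when it was requested (hypothetical helper).

    Returns the module, or None when the flag was not passed, so nothing
    DeepSpeed-related is touched on systems where it is not installed.
    """
    if not enabled:
        return None
    if importlib.util.find_spec("deepspeed") is None:
        raise SystemExit("--deepspeed was passed, but deepspeed is not installed")
    return importlib.import_module("deepspeed")


# Example use at startup:
#   args = parse_args()
#   deepspeed = load_deepspeed(args.deepspeed)
#   use_deepspeed = deepspeed is not None
```

With this shape, the rest of the program would check whether the returned module is None instead of importing deepspeed at the top of a file.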

M4TH1EU commented 8 months ago

Hey mate!

I've created a Docker image that makes it easy to run this UI (and others) with an AMD GPU under Linux. https://github.com/M4TH1EU/ai-suite-rocm/tree/main

I haven't been able to make deepspeed work though; I get some errors during compilation. But personally, I find it works wonders without it.

++

GUUser91 commented 7 months ago

I found a workaround to launch xtts-webui without deepspeed. First, I uninstalled deepspeed:

pip uninstall deepspeed

Then I launched it with this command:

python app.py

I also want to give a tip to any AMD user trying to fine-tune a model. xtts-webui uses faster whisper, which only works on Nvidia graphics cards, so you have to prepare the dataset by hand.

Create a folder in the finetuned_models directory. In this case, mine was called Hiccup. Create a dataset folder inside the Hiccup folder. Inside the dataset folder, create a wavs folder containing the wav files. Also in the dataset folder, create the files lang.txt, metadata_eval.csv and metadata_train.csv. Here are the files showing how they should be formatted: lang.txt, metadata_eval.csv, metadata_train.csv.

Launch xtts-webui. Create a fine-tuned model with the same name as the folder in the finetuned_models directory (Hiccup in my case). I placed just 1 wav file in the "drop file here" box, clicked the "load params from output folder" button, and then clicked train.
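The folder layout described above can be created with a short script. This is only a sketch: `make_dataset_skeleton` is a hypothetical helper, and it creates the three text files empty, so their contents still have to be filled in following the example files from the comment.

```python
from pathlib import Path


def make_dataset_skeleton(root, model_name):
    """Create <root>/<model_name>/dataset with a wavs folder and empty metadata files.

    Hypothetical helper: the files are created empty and must be filled in by hand.
    """
    dataset = Path(root) / model_name / "dataset"
    (dataset / "wavs").mkdir(parents=True, exist_ok=True)
    for name in ("lang.txt", "metadata_eval.csv", "metadata_train.csv"):
        # touch() leaves an existing file untouched apart from its timestamp
        (dataset / name).touch(exist_ok=True)
    return dataset


# Example: make_dataset_skeleton("finetuned_models", "Hiccup")
```

After running it, copy the wav files into the wavs folder before launching xtts-webui.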

Edit: If you get this error message:

FileNotFoundError: [Errno 2] No such file or directory: 'finetuned_models/Hiccup/ready/reference.wav'

Then you need to copy a wav file from the wavs folder into the ready folder and rename it to reference.wav. Click on "load params from output folder" again and then click on train.
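The same fix can be scripted. A minimal sketch, assuming the layout from the earlier comment (wav files under dataset/wavs inside the model folder); `ensure_reference` is a hypothetical helper, not part of xtts-webui.

```python
import shutil
from pathlib import Path


def ensure_reference(model_dir):
    """Copy one wav from dataset/wavs to ready/reference.wav if it is missing."""
    model_dir = Path(model_dir)
    target = model_dir / "ready" / "reference.wav"
    if target.exists():
        return target
    # Assumption: any wav from the dataset is acceptable; take the first by name.
    wavs = sorted((model_dir / "dataset" / "wavs").glob("*.wav"))
    if not wavs:
        raise FileNotFoundError(f"no wav files found in {model_dir / 'dataset' / 'wavs'}")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy(wavs[0], target)
    return target


# Example: ensure_reference("finetuned_models/Hiccup")
```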

Juniorduc44 commented 4 months ago

I went to the funcs.py file in

xtts-webui/scripts/funcs.py

On line 4, I commented out the following:

"from scripts.resemble_enhance.enhancer.inference import denoise, enhance"

Then, just in case something in the file was still looking for it, I added the following before the first function:

enhance=None
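A variation on the same idea that avoids editing the import by hand each time is to make it optional. This is a sketch, not the project's code: `optional_attrs` is a hypothetical helper, and it only covers the case where the import fails with an ImportError.

```python
import importlib


def optional_attrs(module_name, *names):
    """Return the named attributes of a module, or None for each if the import fails."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers a missing module as well as a missing dependency inside it.
        return tuple(None for _ in names)
    return tuple(getattr(module, name) for name in names)


# Same names as the import commented out above; both become None when unavailable.
denoise, enhance = optional_attrs(
    "scripts.resemble_enhance.enhancer.inference", "denoise", "enhance"
)
```

Any code that calls enhance or denoise would then need to check for None first.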

System Specs