erew123 / alltalk_tts

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, but it supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, a narrator, model finetuning, custom models, and WAV file maintenance. It can also be used with 3rd-party software via JSON calls.
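A minimal sketch of such a call, assuming the default port and the `/api/tts-generate` endpoint described in the project README (the endpoint and parameter names are from memory and may differ between versions):

```
# Hedged example: generate speech through AllTalk's local HTTP API.
# Port, endpoint, and field names are assumptions based on the README
# and should be checked against your installed version.
curl -X POST "http://127.0.0.1:7851/api/tts-generate" \
  -d "text_input=Hello from AllTalk" \
  -d "character_voice_gen=female_01.wav" \
  -d "language=en" \
  -d "output_file_name=myoutput"
```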
GNU Affero General Public License v3.0

AMD GPUs support #233

Closed · Neoony closed this 4 months ago

Neoony commented 4 months ago

**Is your feature request related to a problem? Please describe.**
No AMD support.

**Describe the solution you'd like**
Any plans for AMD support on Windows? A lot of other AI software now has DirectML, ROCm, or ZLUDA implementations. XTTS/alltalk_tts is great, but really quite slow on a 5950X CPU. For example, Stable Diffusion with the vladmandic UI can use DirectML, ZLUDA, or Olive/ONNX, and LM Studio can use ROCm to run LLMs (finally, ROCm on Windows for something).

Just missing a good TTS for AMD GPUs (I have a 7900 XTX).

I'm not even asking about finetuning, but that would also be nice.

erew123 commented 4 months ago

Hi @Neoony

To be honest with you, I would love to! I know that ZLUDA translates CUDA calls, so CUDA-based software should just work with it. Someone did give it a go recently, but said they couldn't get it to work.

Some of the TTS engines/scripts from the manufacturers of the TTS engines/models may or may not work with AMD... so there may be that challenge.

The biggest issue I face is that I don't have an AMD card (or a Mac M-series either), so it's not possible for me to easily test what does/doesn't work and debug.

With V2 on its way out soon (see the discussions board), my intent was that, once I have the core of V2 done and up as a beta, I would ask people out there who have a little coding experience, along with a Mac or an AMD card, if they would be willing to take a shot at figuring out the code needed to get it working. I have a couple of ideas for what's needed, but no way to work through testing/debugging/testing/debugging etc.

On the flip side of that statement, there will be other TTS engines built into V2 that may have some native AMD support... may.

So the answer is a yes.... with caveats.

Thanks

RenNagasaki commented 4 months ago

@Neoony, to add to what Erew said: at the moment it's just not possible. CoquiTTS and all the other TTS engines I know of depend on PyTorch, which at the moment does not have a Windows ROCm-ready build. As long as that's not there, we can't proceed. You can check the status on that here: https://pytorch.org/blog/pytorch-for-amd-rocm-platform-now-available-as-python-package/
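(For anyone who wants to check what their current install actually has: ROCm builds of PyTorch set `torch.version.hip`, while CUDA/CPU builds leave it as `None`. A quick check, assuming an active Python environment:)

```
# Prints GPU visibility and which backend the installed wheel was built for.
python -c "import torch; print(torch.cuda.is_available(), torch.version.cuda, torch.version.hip)"
```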

Kind regards, RenNagasaki

Neoony commented 4 months ago

Yeah, pretty much everything for ROCm on Windows depends on PyTorch, I've heard that many times. But something like ZLUDA or DirectML should not need that.
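(For what it's worth, there is a `torch-directml` package that exposes a DirectML device to PyTorch on Windows. A minimal sketch of checking that it works; the TTS code itself would still need changes to move its models/tensors to that device instead of assuming CUDA:)

```
# Install the DirectML backend for PyTorch and confirm it exposes a device.
pip install torch-directml
python -c "import torch_directml; print(torch_directml.device())"
```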

However, I wonder how come LM Studio can run LLMs on ROCm on Windows, and it works absolutely amazingly well (I have absolutely no issues). (LLMs don't need PyTorch?)

I never noticed there are active discussions in the Discussions section of this repo, my bad.

Thx

Neoony commented 4 months ago

I did see someone mention on the SD.Next Discord that it might be coming soon. Not sure how true/correct that is, however.

[Screenshot of the Discord message]

(But I guess 6.1 for Windows should already be out?)

fayalalebrun commented 3 months ago

Works without an issue on Linux with ROCm. I just had to install the ROCm version of PyTorch.

Using Manjaro.
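(For reference, a minimal sketch of what that looks like using the official PyTorch index, assuming the stable ROCm 6.0 wheels; pick the index URL that matches your ROCm version:)

```
# Replace any existing CUDA/CPU build with the ROCm build of PyTorch.
pip3 uninstall torch torchaudio
pip3 install torch torchaudio --index-url https://download.pytorch.org/whl/rocm6.0
```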

Neoony commented 2 months ago

Guess now that AMD supports GPUs in WSL2, that might also work (SD.Next works flawlessly). Gonna have to try it sometime.

(main issue currently is that not many GPUs are supported for that)

Neoony commented 2 months ago

V1 seems to work on Windows in WSL2 with Ubuntu 22.04 and a Radeon 7900 XTX, now that the AMD drivers support that (but only some GPUs :( ).


It uses my GPU.

Instructions (these might not be the proper steps; it took a while until I got there, and I'm not sure what is actually needed or not):

Get the requirements (I already had them, since I had tried Stable Diffusion):

```
sudo apt update
wget https://repo.radeon.com/amdgpu-install/6.1.3/ubuntu/jammy/amdgpu-install_6.1.60103-1_all.deb
sudo apt install ./amdgpu-install_6.1.60103-1_all.deb
sudo amdgpu-install -y --usecase=wsl,rocm --no-dkms
sudo reboot
```
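(A quick way to confirm the ROCm runtime can see the card after the reboot; `rocminfo` ships with the ROCm packages, though note that some ROCm tools behave differently under WSL2:)

```
# Lists HSA agents; the GPU should appear as a gfx* device (gfx1100 for a 7900 XTX).
rocminfo | grep -i "name:"
```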

Then, in alltalk_tts, after it's already set up:

```
./start_environment.sh
pip3 uninstall torch torchaudio
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm6.1
```

Then I think I also did this (based on the SD.Next instructions on Discord):

"Patch PyTorch"

```
# Point PyTorch at the WSL-compatible HSA runtime shipped with ROCm.
location=$(pip show torch | grep Location | awk -F ": " '{print $2}')
cd ${location}/torch/lib/
rm libhsa-runtime64.so*
cp /opt/rocm/lib/libhsa-runtime64.so.1.2 libhsa-runtime64.so
```

And then I also needed to copy this file into the lib folders, overwriting the ones there:

```
sudo cp -f /usr/lib/x86_64-linux-gnu/libstdc++.so.6 /somelocation/alltalk_tts/alltalk_environment/env/lib/
sudo cp -f /usr/lib/x86_64-linux-gnu/libstdc++.so.6 /somelocation/alltalk_tts/alltalk_environment/conda/lib/
```

Just make sure to correct the path.
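(If anyone wonders why that copy is needed: conda environments often bundle an older libstdc++, and the ROCm libraries likely want a newer GLIBCXX symbol, which typically shows up as an error like "version 'GLIBCXX_3.4.30' not found" when importing torch. Assuming the paths above, you can check whether the bundled copy is new enough with:)

```
# If this prints nothing, the bundled libstdc++ is too old and the copy above is needed.
strings /somelocation/alltalk_tts/alltalk_environment/env/lib/libstdc++.so.6 | grep GLIBCXX_3.4.30
```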

Then start it up, and it works. Something like that :P
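(A quick sanity check from inside the activated environment that the patched PyTorch actually sees the card:)

```
# Should print "True" and the device name of the GPU.
python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
```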

For comparison, this would take around a minute to generate on my 5950X CPU.

I will put this into Discussions (continued there): https://github.com/erew123/alltalk_tts/discussions/132#discussioncomment-10143413