erew123 / alltalk_tts

AllTalk is based on the Coqui TTS engine, similar to the Coqui_tts extension for Text generation webUI, but it supports a variety of advanced features, such as a settings page, low VRAM support, DeepSpeed, narrator, model finetuning, custom models, and wav file maintenance. It can also be used with 3rd party software via JSON calls.
GNU Affero General Public License v3.0

AllTalk on Fedora & setting up DeepSpeed #141

Closed erew123 closed 5 months ago

erew123 commented 5 months ago

To answer the questions "Does AllTalk run on Fedora?" and "Can you configure DeepSpeed?", I've run a test in a Virtual Machine.

Please note that I did not set up the Nvidia CUDA Toolkit 11.8, which Finetuning requires. I only installed the Nvidia CUDA Toolkit 12.3, as it was simple to install and I cannot test Finetuning within a Virtual Machine anyway. To perform Finetuning, the Nvidia CUDA Toolkit v11.8 needs to be installed and correctly configured.

TL;DR: Yes, it works fine

All the steps below are from following the instructions from the main page; however, I did use Fedora's own NVIDIA CUDA Toolkit installation for simplicity.

Cloned the GitHub repository

Made atsetup.sh executable and ran it, selecting Standalone Installation and then Option 1
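
As a rough sketch, the clone and permission steps might look like the following in a shell. The repository URL is an assumption inferred from the repository name (erew123/alltalk_tts), and the two helper function names are hypothetical, used only for illustration.

```shell
# Assumed URL, inferred from the repository name erew123/alltalk_tts.
REPO_URL="https://github.com/erew123/alltalk_tts"

# Clone the repository into the given directory (default: alltalk_tts).
clone_alltalk() {
    git clone "$REPO_URL" "${1:-alltalk_tts}"
}

# Mark atsetup.sh as executable inside the given checkout.
# Fails with a message if the script is not where we expect it.
make_setup_executable() {
    if [ ! -f "${1:-alltalk_tts}/atsetup.sh" ]; then
        echo "atsetup.sh not found in ${1:-alltalk_tts}" >&2
        return 1
    fi
    chmod +x "${1:-alltalk_tts}/atsetup.sh"
}
```

With these defined, `clone_alltalk && make_setup_executable && cd alltalk_tts && ./atsetup.sh` would bring up the setup menu, where the Standalone Installation is selected by hand.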

Let the Install run through

Installation completed fine

Exited atsetup.sh and ran ./start_alltalk.sh to see if it would start up fine, and it did

It was a little slow in a Virtual Machine, but it did start and load the model without errors

Next, I went on to compile and install DeepSpeed. To keep it simple, I followed the "Fedora 37 and later" instructions at https://rpmfusion.org/Howto/CUDA
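
Since those instructions are specific to "Fedora 37 and later", it may be worth confirming the release before following them. The sketch below reads the standard /etc/os-release file; the function name and the optional file-path argument are hypothetical conveniences, not part of AllTalk or RPMFusion.

```shell
# Return 0 if the os-release file describes Fedora 37 or later.
# The optional argument (a path to an os-release style file) is a
# hypothetical convenience; it defaults to /etc/os-release.
is_fedora_37_or_later() {
    os_release="${1:-/etc/os-release}"
    [ -r "$os_release" ] || return 1
    # Source the file in a subshell so ID and VERSION_ID do not leak out.
    (
        . "$os_release"
        [ "${ID:-}" = "fedora" ] && [ "${VERSION_ID:-0}" -ge 37 ] 2>/dev/null
    )
}
```

For example, `is_fedora_37_or_later && echo "Use the Fedora 37 and later section"` prints the hint only on a matching system.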

I ran the CUDA install from RPMFusion

Waited for that to complete

Followed through the Linux Standalone Instructions for DeepSpeed

Tested that the CUDA Toolkit was working with nvcc --version
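
The nvcc check can also be scripted. This is a minimal sketch, assuming nvcc prints its usual "Cuda compilation tools, release X.Y, ..." line; the function name is hypothetical.

```shell
# Print the CUDA release reported by nvcc (for example 12.3),
# or fail with a message if nvcc is not on PATH.
cuda_release() {
    if ! command -v nvcc >/dev/null 2>&1; then
        echo "nvcc not found on PATH; is the CUDA Toolkit installed?" >&2
        return 1
    fi
    # Extract the number after "release" from a line such as:
    #   Cuda compilation tools, release 12.3, V12.3.x
    nvcc --version | sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}
```

If `cuda_release` fails even though the toolkit is installed, the CUDA bin directory is probably missing from PATH in the current shell.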

Ran pip install deepspeed, which compiled successfully
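
A hedged sketch of the install and a quick import check is below. It is meant to run inside the same Python environment that AllTalk uses. The CUDA_HOME default of /usr/local/cuda is an assumption and may differ on your system, and both function names are hypothetical.

```shell
# Install DeepSpeed against the system CUDA Toolkit.
# CUDA_HOME is consulted when CUDA extensions are built; the default
# path used here is an assumption, so adjust it to your install.
install_deepspeed() {
    export CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"
    pip install deepspeed
}

# Print the installed DeepSpeed version, failing if it cannot be imported.
deepspeed_version() {
    python -c 'import deepspeed; print(deepspeed.__version__)'
}
```

After `install_deepspeed`, a successful `deepspeed_version` call confirms the package imports. DeepSpeed also ships a `ds_report` command that prints a fuller environment summary.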

Started AllTalk with ./start_alltalk.sh. It was slow on a Virtual Machine; however, it detected DeepSpeed and loaded the model correctly. It did give one error about NVML, which is related to the Virtual Machine environment

Loaded up the Settings page and tested TTS, which worked fine apart from another VM-related error about NNPACK
