coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
http://coqui.ai
Mozilla Public License 2.0
31.64k stars 3.78k forks

Gradio Live, Create Dataset gives an error: ValueError: Requested float16 compute type, but the target device or backend do not support efficient float16 computation. #3790

Open Rakshasv18 opened 2 weeks ago

Rakshasv18 commented 2 weeks ago

Describe the bug

Dataset building + XTTS finetuning and inference in Google Colab

The notebook requires pip install transformers -U, along with other packages, to run smoothly.

I tried uploading two files: a 7.9 MB MP3 and a 157 MB WAV. The first step is to create the dataset; when I run it, I get the error below:

Traceback (most recent call last):
  File "/content/TTS/TTS/demos/xtts_ft_demo/xtts_demo.py", line 215, in preprocess_dataset
    train_meta, eval_meta, audio_total_size = format_audio_list(audio_path, target_language=language, out_path=out_path, gradio_progress=progress)
  File "/content/TTS/TTS/demos/xtts_ft_demo/utils/formatter.py", line 56, in format_audio_list
    asr_model = WhisperModel("large-v2", device=device, compute_type="float16")
  File "/usr/local/lib/python3.10/dist-packages/faster_whisper/transcribe.py", line 128, in __init__
    self.model = ctranslate2.models.Whisper(
ValueError: Requested float16 compute type, but the target device or backend do not support efficient float16 computation.
Loading Whisper Model!
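For anyone hitting the same error: the traceback shows formatter.py passing compute_type="float16" unconditionally. A possible workaround, sketched below under assumptions, is to choose the compute type from what the backend reports as supported. The helper name pick_compute_type and the preference order are hypothetical and not part of TTS or faster-whisper; the commented wiring assumes ctranslate2.get_supported_compute_types is available in the installed CTranslate2 version. I have not confirmed that this is the cause in this particular Colab session (a runtime without a GPU attached is one likely candidate).

```python
def pick_compute_type(device, supported):
    """Return the first preferred compute type that the backend supports.

    `device` is a device string such as "cuda" or "cpu".
    `supported` is a set of compute type names reported by the backend.
    The preference lists below are an assumption, not a library default.
    """
    preferences = {
        "cuda": ["float16", "int8_float16", "int8", "float32"],
        "cpu": ["int8", "float32"],
    }
    for candidate in preferences.get(device, ["float32"]):
        if candidate in supported:
            return candidate
    # Fall back to full precision when nothing in the list matched.
    return "float32"


# Hypothetical wiring in place of the hard-coded "float16" (not run here):
#
#   import ctranslate2
#   from faster_whisper import WhisperModel
#
#   supported = ctranslate2.get_supported_compute_types(device)
#   asr_model = WhisperModel(
#       "large-v2",
#       device=device,
#       compute_type=pick_compute_type(device, supported),
#   )
```

With this approach a CPU-only runtime would load Whisper with int8 or float32 instead of raising the ValueError, at the cost of slower transcription than float16 on a GPU.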

To Reproduce

Try the Google Colab notebook:

Dataset building + XTTS finetuning and inference

Running the demo

To start the demo, run the first two cells (ignore pip install errors in the first one).

Then click the link shown after "Running on public URL:" when the demo is ready.

Downloading the results

You can run cell [3] to zip and download the default dataset path.

You can run cell [4] to zip and download the latest model you trained.

Expected behavior

A dataset with transcriptions, ready for fine-tuning the model.

Logs

No response

Environment

Google Colab

Additional context

No response