Plachtaa / VITS-fast-fine-tuning

This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion
Apache License 2.0
4.75k stars, 713 forks

Error message: This model doesn't have language tokens so it can't perform lang id #474

Closed. heiheiheibj closed this issue 1 year ago

heiheiheibj commented 1 year ago

python scripts/long_audio_transcribe.py --languages "CJ" --whisper_size large
<generator object _walk at 0x7f3b9dec0820>
['gdg_4.wav', 'gdg_1.wav', 'gdg_5.wav', 'gdg_2.wav', 'gdg_8.wav', 'gdg_9.wav', 'gdg_7.wav', 'gdg_6.wav', 'gdg_3.wav']
filelist= ['gdg_4.wav', 'gdg_1.wav', 'gdg_5.wav', 'gdg_2.wav', 'gdg_8.wav', 'gdg_9.wav', 'gdg_7.wav', 'gdg_6.wav', 'gdg_3.wav']
transcribing ./denoised_audio/gdg_4.wav...

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Traceback (most recent call last):
  File "/home/ubuntu/VITS-fast-fine-tuning/scripts/long_audio_transcribe.py", line 45, in <module>
    result = model.transcribe(parent_dir + file, word_timestamps=True, **transcribe_options)
  File "/root/anaconda3/envs/barkvoice/lib/python3.10/site-packages/whisper/transcribe.py", line 130, in transcribe
    _, probs = model.detect_language(mel_segment)
  File "/root/anaconda3/envs/barkvoice/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/anaconda3/envs/barkvoice/lib/python3.10/site-packages/whisper/decoding.py", line 40, in detect_language
    raise ValueError(
ValueError: This model doesn't have language tokens so it can't perform lang id

I have already downloaded the pretrained models:
wget https://huggingface.co/spaces/sayashi/vits-uma-genshin-honkai/resolve/main/model/D_0-p.pth -O ./pretrained_models/D_0.pth
wget https://huggingface.co/spaces/sayashi/vits-uma-genshin-honkai/resolve/main/model/G_0-p.pth -O ./pretrained_models/G_0.pth
wget https://huggingface.co/spaces/sayashi/vits-uma-genshin-honkai/resolve/main/model/config.json -O ./configs/finetune_speaker.json

This used to run fine; the problem appeared after I reinstalled Ubuntu 22.04. Thanks.

Json0926 commented 1 year ago

Did you solve it?

heiheiheibj commented 1 year ago

> Did you solve it?

No... I re-downloaded the models as well, and the result is the same.

mikeyang01 commented 1 year ago

I ran into a similar problem. It turned out that the whisper version on PyPI was too old; you need to install the latest whisper from GitHub.

heiheiheibj commented 1 year ago

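> I ran into a similar problem. It turned out that the whisper version on PyPI was too old; you need to install the latest whisper from GitHub.

Thanks! After installing whisper from GitHub, it works. There was one small hiccup after installing whisper: it conflicted with the newer versions of torchaudio and torchvision. Downgrading to the versions below fixed it. Thanks again.
torch 2.0.1
torchaudio 2.0.1
torchvision 0.15.2
triton 2.0.0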
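For anyone comparing their own environment against the versions listed above, here is a minimal sketch. It assumes the packages are installed under the distribution names torch, torchaudio, torchvision and triton; the helper names `installed_version` and `find_mismatches` are hypothetical and not part of this repo. Local build suffixes such as `+cu118` are ignored in the comparison.

```python
from importlib import metadata

# Versions reported as working in this thread.
KNOWN_GOOD = {
    "torch": "2.0.1",
    "torchaudio": "2.0.1",
    "torchvision": "0.15.2",
    "triton": "2.0.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs to (expected, found)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Drop local build tags such as "+cu118" before comparing.
        base = found.split("+")[0] if found else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(KNOWN_GOOD).items():
        print(f"{name}: expected {wanted}, found {found}")
```

If the script prints nothing, the environment matches the combination reported here. Other combinations may work as well; this one is simply the set confirmed in the thread.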

Json0926 commented 1 year ago

Nice!
pip install git+https://github.com/openai/whisper.git
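After reinstalling, a quick check along these lines can confirm which Whisper build is active. This is only a sketch. It assumes the package is registered under the distribution name `openai-whisper`, and that a loaded Whisper model object exposes an `is_multilingual` attribute, which should be true for a model that can detect languages. The helpers `describe_whisper` and `supports_language_id` are hypothetical names.

```python
from importlib import metadata


def describe_whisper(lookup=metadata.version):
    """Return a one-line description of the installed Whisper package."""
    try:
        # Assumption: the distribution is named "openai-whisper".
        version = lookup("openai-whisper")
    except metadata.PackageNotFoundError:
        return "openai-whisper is not installed"
    return f"openai-whisper {version}"


def supports_language_id(model):
    """Report whether a loaded model claims to be multilingual.

    Assumption: the model object has an `is_multilingual` attribute;
    a missing attribute is treated as False.
    """
    return bool(getattr(model, "is_multilingual", False))


if __name__ == "__main__":
    print(describe_whisper())
```

To use the second helper, load the model the same way the transcription script does and pass the result in. If it returns False for the large model, the installed whisper is probably still the outdated build.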