RVC-Boss / GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
MIT License
33.48k stars, 3.84k forks

Faster Whisper ASR model downloaded locally but failed to load #860

Open etdapt opened 6 months ago

etdapt commented 6 months ago

I have downloaded the Faster Whisper ASR models into the tools/asr/models/ folder, but they still fail to load. I only see the resource_tracker warning and the process goes no further. I have tried v2, v3, base, base.en, etc., all with the same result.

Running on local URL: http://0.0.0.0:9874
"/opt/anaconda3/envs/GPTSoVits/bin/python3" tools/asr/fasterwhisper_asr.py -i "/Users/erict/aitools/GPT-SoVITS/output/slicer_opt/eng" -o "output/asr_opt" -s base.en-local -l en -p float32
loading faster whisper model: base.en tools/asr/models/faster-whisper-base.en
  0%| | 0/15 [00:00<?, ?it/s]/opt/anaconda3/envs/GPTSoVits/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

v3ucn commented 6 months ago

By default, the script does not read from the tools/asr/models/ directory, so you need to modify the code in fasterwhisper_asr.py:

model = WhisperModel(model_path, device=device, compute_type=precision, download_root="./tools/asr/models", local_files_only=False)
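
A minimal sketch of the same idea, under stated assumptions. The log above suggests that `-s base.en-local` maps to the folder tools/asr/models/faster-whisper-base.en, so the helpers below rebuild that path, check that the folder looks like a converted model, and pass the folder itself to WhisperModel. The names resolve_local_model, missing_model_files and load_local_model are hypothetical and are not functions from the repository. The list of expected files is also an assumption about a typical CTranslate2 model folder.

```python
import os

MODELS_ROOT = "tools/asr/models"  # folder named in this thread

# Assumed minimal contents of a converted model folder; adjust if yours differs.
EXPECTED_FILES = ("model.bin", "config.json")


def resolve_local_model(size_arg, models_root=MODELS_ROOT):
    """Map an argument such as 'base.en-local' to (model name, local folder)."""
    name = size_arg[:-len("-local")] if size_arg.endswith("-local") else size_arg
    return name, f"{models_root}/faster-whisper-{name}"


def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected files that are absent from model_dir."""
    return [f for f in expected if not os.path.isfile(os.path.join(model_dir, f))]


def load_local_model(size_arg, device="cpu", compute_type="float32"):
    """Load a model strictly from disk, failing early if files are missing."""
    # Imported lazily so the helpers above work without faster-whisper installed.
    from faster_whisper import WhisperModel

    _, model_dir = resolve_local_model(size_arg)
    missing = missing_model_files(model_dir)
    if missing:
        raise FileNotFoundError(f"{model_dir} is missing: {', '.join(missing)}")
    # Passing an existing folder makes faster-whisper load it directly
    # instead of trying to download the model.
    return WhisperModel(model_dir, device=device, compute_type=compute_type,
                        local_files_only=True)
```

Calling load_local_model("base.en-local") from the repository root would then either load the local folder or report which files are missing, which helps separate a download problem from a runtime problem.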

etdapt commented 6 months ago

I realized that the problem isn't the download or availability of the model folder. Rather, it is related to fasterwhisper_asr.py line 77 throwing exceptions when looping over the segments returned from line 62. This never causes an issue if the training audio is Chinese, because line 76 skips over it.
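
For context, faster-whisper's transcribe returns a lazy generator of segments, so the decoding work happens while the loop iterates rather than at the transcribe call. That would explain why a failure surfaces inside the loop. The sketch below illustrates the behaviour with a stand-in generator: fake_transcribe is hypothetical and only imitates that laziness, and collect_text is likewise an illustrative helper. It can only catch Python exceptions; a crash inside native code would still end the process.

```python
def fake_transcribe(chunks, fail_at=None):
    """Stand-in for a lazy segment generator: work happens per iteration."""
    for i, chunk in enumerate(chunks):
        if i == fail_at:
            raise RuntimeError(f"decoding failed on chunk {i}")
        yield chunk.strip()


def collect_text(segments):
    """Consume segments one by one, keeping what was decoded before any error."""
    texts, error = [], None
    try:
        for text in segments:
            texts.append(text)
    except Exception as exc:  # only Python-level errors are catchable here
        error = exc
    return texts, error


# Creating the generator does no work yet; the failure appears during iteration.
segments = fake_transcribe([" hello ", " world ", " again "], fail_at=2)
texts, error = collect_text(segments)
print(texts, "| error:", error)
```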

v3ucn commented 6 months ago

You can test the faster_whisper problem with Japanese audio, because for Chinese it switches back to FunASR by default.

KevinZhang19870314 commented 5 months ago

Same issue here. Any updates? I am using a MacBook Air M2 with 16 GB.

KevinZhang19870314 commented 5 months ago

Here is the log:

"/opt/anaconda3/envs/GPTSoVits/bin/python" tools/asr/fasterwhisper_asr.py -i "/Users/kevinzhang/Desktop/GPT-SoVITS/output/slicer_opt" -o "output/asr_opt" -s large-v3-local -l auto -p float32
loading faster whisper model: large-v3 tools/asr/models/faster-whisper-large-v3
  0%|                                                                                                                                | 0/2 [00:00<?, ?it/s]Intel MKL WARNING: Support of Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) enabled only processors has been deprecated. Intel oneAPI Math Kernel Library 2025.0 will require Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
 50%|████████████████████████████████████████████████████████████                                                            | 1/2 [00:29<00:29, 29.79s/it]/opt/anaconda3/envs/GPTSoVits/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

PS: I have already downloaded the faster-whisper large-v3 model.

etdapt commented 5 months ago

I have been tracing through faster_whisper, and I think the problem lies in faster_whisper itself. The only option may be to drop in another ASR model.

SapphireLab commented 5 months ago

Did you install conda as an x86_64 build? Can you check? If it is x86_64, that may cause a problem in CTranslate2 that stops Faster Whisper. You can find a similar report in faster-whisper issue 305.
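
One way to run that check from inside the conda environment is sketched below. It assumes macOS, where `sysctl -n sysctl.proc_translated` prints 1 for a process running under Rosetta; on other systems, or on Intel Macs where the key does not exist, the helper returns None. The function names are hypothetical.

```python
import platform
import subprocess


def rosetta_translated():
    """Return True/False on Apple Silicon macOS, or None if the key is unavailable."""
    try:
        out = subprocess.run(
            ["sysctl", "-n", "sysctl.proc_translated"],
            capture_output=True, text=True, check=True,
        ).stdout.strip()
    except (OSError, subprocess.CalledProcessError):
        return None
    return out == "1"


def describe_python_arch(machine, translated):
    """Summarise what the interpreter architecture implies."""
    if machine == "arm64":
        return "native arm64 Python"
    if machine == "x86_64" and translated:
        return "x86_64 Python running under Rosetta on Apple Silicon"
    if machine == "x86_64":
        return "x86_64 Python (native on an Intel machine, or Rosetta status unknown)"
    return f"other architecture: {machine}"


print(describe_python_arch(platform.machine(), rosetta_translated()))
```

Run it with the same interpreter that launches fasterwhisper_asr.py, since a different Python on the same machine may report a different architecture.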

etdapt commented 5 months ago

> Did you install conda as an x86_64 build? Can you check? If it is x86_64, that may cause a problem in CTranslate2 that stops Faster Whisper. You can find a similar report in faster-whisper issue 305.

I am on an Intel-based Mac myself, so I am not sure whether this is the same as issue 305.

SapphireLab commented 5 months ago

> I am on an Intel-based Mac myself, so I am not sure whether this is the same as issue 305.

Okay, I get it. Your problem is more similar to #1032. 😥

XXXXRT666 commented 5 months ago

This might help: https://github.com/SYSTRAN/faster-whisper/issues/345

XXXXRT666 commented 4 months ago

After step-by-step troubleshooting, I think this is related to return self.model.encode(features, to_cpu=to_cpu), which sits in encode, is reached from faster_whisper.WhisperModel.transcribe, and calls into ctranslate2. However, the failure occurs at different positions across runs, which makes it difficult to pinpoint the specific error. The exit position of the program varies with the selected language, and even under identical conditions it can differ between repeated calls. Most often, though, it exits at encode.
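
Since the exit position varies and the process appears to die inside native code, one option is Python's standard faulthandler module, which writes the Python traceback of every thread when the interpreter receives a fatal signal such as SIGSEGV or SIGABRT. This is only a sketch and has not been tried in this thread. The helper name enable_crash_dumps and the log file name are hypothetical, and it helps only if the exit really is a fatal signal.

```python
import faulthandler


def enable_crash_dumps(log_path="asr_crash.log"):
    """Write Python tracebacks to log_path if the process hits a fatal signal."""
    log_file = open(log_path, "w")
    # The file must stay open for as long as faulthandler is enabled.
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling enable_crash_dumps() near the top of the script, before the model is loaded, would be the natural place. Running the script with python -X faulthandler gives the same dump on stderr without any code change.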

XXXXRT666 commented 4 months ago

I haven't encountered this error myself. I am troubleshooting on an Intel Mac that belongs to a QQ group member who has experienced it, using a modified transcribe.py with added print statements.
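
The print-statement approach can be sketched as a small decorator, assuming the goal is to see which call was entered last before the process died. The name trace_calls is hypothetical, and the encode function here is a placeholder rather than the real faster_whisper method. Flushing each line matters, because buffered output can be lost when a process exits abruptly.

```python
import functools


def trace_calls(func):
    """Print a flushed line before and after each call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"[trace] enter {func.__name__}", flush=True)
        result = func(*args, **kwargs)
        print(f"[trace] exit  {func.__name__}", flush=True)
        return result
    return wrapper


@trace_calls
def encode(features):
    """Placeholder for the step being investigated."""
    return [x * 2 for x in features]


encode([1, 2, 3])
```

An "enter" line with no matching "exit" line in the output points at the call during which the process stopped.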