SayaSS / vits-finetuning

Fine-Tuning your VITS model using a pre-trained model
MIT License

RuntimeError: cuFFT error: CUFFT_INTERNAL_ERROR #48

Open zackrack opened 6 months ago

zackrack commented 6 months ago

Hi,

When I run

python train_ms.py -c configs/config.json -m checkpoints

I get the stack trace below. I've tried to choose versions of torch, CUDA, and the other libraries that are compatible with each other. I am using Python 3.7, CUDA 11.7, torch 1.13.1, torchaudio 0.13.1, torchvision 0.14.1, and cudnn-cuda-11, all inside a Python 3.7 venv. This is on my local machine with an RTX 4080. I am not running out of GPU memory or system memory.

I believe all my wavs and list files are formatted correctly, and I have followed all other instructions in README.md.

I am running on WSL 2 Ubuntu 22.04.4 LTS.

Thank you for your help.

Traceback (most recent call last):
  File "train_ms.py", line 306, in <module>
    main()
  File "train_ms.py", line 56, in main
    mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))
  File "/home/user/vits-3.7-venv/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/user/vits-3.7-venv/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "/home/user/vits-3.7-venv/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException: 

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/user/vits-3.7-venv/lib/python3.7/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/home/user/vits-finetuning/train_ms.py", line 124, in run
    train_and_evaluate(rank, epoch, hps, [net_g, net_d], [optim_g, optim_d], [scheduler_g, scheduler_d], scaler, [train_loader, eval_loader], logger, [writer, writer_eval])
  File "/home/user/vits-finetuning/train_ms.py", line 170, in train_and_evaluate
    hps.data.mel_fmax
  File "/home/user/vits-finetuning/mel_processing.py", line 105, in mel_spectrogram_torch
    center=center, pad_mode='reflect', normalized=False, onesided=True, return_complex=False)
  File "/home/user/vits-3.7-venv/lib/python3.7/site-packages/torch/functional.py", line 633, in stft
    normalized, onesided, return_complex)
RuntimeError: cuFFT error: CUFFT_INTERNAL_ERROR
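
The last frames of the trace point at torch.stft rather than at the training loop, so one way to narrow this down is to call torch.stft on the GPU outside the repo. The sketch below is not taken from train_ms.py or mel_processing.py: the function name try_stft, the random input, and the sizes (1024-point FFT, hop of 256) are assumptions chosen for illustration. It also uses return_complex=True instead of the return_complex=False seen in the trace, so that it runs on newer torch releases as well.

```python
def try_stft(device="cuda"):
    """Run one small STFT on `device` and describe what happened."""
    try:
        import torch
    except ImportError:
        return "torch-missing"
    if device == "cuda" and not torch.cuda.is_available():
        return "cuda-unavailable"

    # One second of noise at an assumed 22050 Hz; the values do not matter,
    # only whether the FFT backend can execute the transform.
    signal = torch.randn(1, 22050, device=device)
    window = torch.hann_window(1024, device=device)
    try:
        spec = torch.stft(
            signal,
            n_fft=1024,
            hop_length=256,
            win_length=1024,
            window=window,
            center=True,
            pad_mode="reflect",
            normalized=False,
            onesided=True,
            return_complex=True,
        )
    except RuntimeError as err:
        # A cuFFT failure would surface here as a RuntimeError.
        return f"failed: {err}"
    return f"ok: {tuple(spec.shape)}"


if __name__ == "__main__":
    print(try_stft("cuda"))
```

If this fails with the same cuFFT message, the problem is likely in the torch/CUDA/GPU combination rather than in the wavs, the list files, or the config.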

littlestronomer commented 2 months ago

If I remember correctly, I only modified requirements.txt and the error went away. Maybe you can try updating the versions.

One note, though: it kept reporting that TensorRT was not found, even though I had imported it.
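
Before changing versions, it may help to check whether the installed torch build targets a CUDA release that knows about the GPU. The RTX 4080 reports compute capability 8.9, and CUDA 11.8 is the first toolkit release that supports it, so a torch build for CUDA 11.7 predates the card. Whether that is the cause of this particular error is not confirmed in this thread. The helper names below (version_tuple, cuda_build_supports, report) are hypothetical, and the lookup table covers only a few architectures.

```python
# First CUDA toolkit release supporting each compute capability.
# Partial table, for illustration only.
MIN_CUDA = {
    (8, 0): (11, 0),  # Ampere (A100)
    (8, 6): (11, 1),  # Ampere (RTX 30 series)
    (8, 9): (11, 8),  # Ada Lovelace (RTX 40 series)
    (9, 0): (11, 8),  # Hopper
}


def version_tuple(text):
    """Turn a version string such as '11.7' into (11, 7)."""
    return tuple(int(part) for part in text.split(".")[:2])


def cuda_build_supports(capability, cuda_version):
    """Return True/False if the table covers `capability`, else None."""
    minimum = MIN_CUDA.get(tuple(capability))
    if minimum is None:
        return None
    return version_tuple(cuda_version) >= minimum


def report():
    """Summarize the installed torch build and the first visible GPU."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    lines = [f"torch {torch.__version__}, built for CUDA {torch.version.cuda}"]
    if torch.version.cuda and torch.cuda.is_available():
        capability = torch.cuda.get_device_capability(0)
        name = torch.cuda.get_device_name(0)
        verdict = cuda_build_supports(capability, torch.version.cuda)
        lines.append(f"{name}: compute capability {capability}, supported by this build: {verdict}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(report())
```

With the versions from the original post, cuda_build_supports((8, 9), "11.7") returns False, which would suggest trying a torch build for CUDA 11.8 or later.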