RVC-Project / Retrieval-based-Voice-Conversion-WebUI

Easily train a good VC model with voice data <= 10 mins!
MIT License

Console shows "Training is done. The program is closed.", but the weights folder does not appear #2079

Closed maudslice closed 5 months ago

maudslice commented 5 months ago

I tried one-click training, and the following error message appeared after training finished:

INFO:yurika-test01:Training is done. The program is closed.
INFO:yurika-test01:saving final ckpt:Success.
Process Process-2:
Traceback (most recent call last):
  File "/home/chenzhr/anaconda3/envs/rvc/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/chenzhr/anaconda3/envs/rvc/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/chenzhr/Retrieval-based-Voice-Conversion-WebUI/infer/modules/train/train.py", line 282, in run
    train_and_evaluate(
  File "/home/chenzhr/Retrieval-based-Voice-Conversion-WebUI/infer/modules/train/train.py", line 481, in train_and_evaluate
    scaler.scale(loss_disc).backward()
  File "/home/chenzhr/Retrieval-based-Voice-Conversion-WebUI/.venv/lib/python3.8/site-packages/torch/_tensor.py", line 525, in backward
    torch.autograd.backward(
  File "/home/chenzhr/Retrieval-based-Voice-Conversion-WebUI/.venv/lib/python3.8/site-packages/torch/autograd/__init__.py", line 267, in backward
    _engine_run_backward(
  File "/home/chenzhr/Retrieval-based-Voice-Conversion-WebUI/.venv/lib/python3.8/site-packages/torch/autograd/graph.py", line 744, in _engine_run_backward
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: [../third_party/gloo/gloo/transport/tcp/pair.cc:534] Connection closed by peer [127.0.1.1]:4506
/home/chenzhr/anaconda3/envs/rvc/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 20 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

I don't know whether this error message is normal. I checked the FAQ page and tried the "train feature index" button again, but the error still occurred. The error message displayed in the UI is as follows:

maudslice commented 5 months ago

Continuing:

(40728, 768),1044
training
adding
Index built successfully: added_IVF1044_Flat_nprobe_1_yurika-test01_v2.index
Failed to link the index to the external folder: assets/indices

In addition, there were other error messages during the one-click training process, and I am not sure whether they are related to the failure. I am posting all of them below:

Retrieval-based-Voice-Conversion-WebUI/.venv/lib/python3.8/site-packages/torch/nn/modules/conv.py:306: UserWarning: Plan failed with a cudnnException: CUDNN_BACKEND_EXECUTION_PLAN_DESCRIPTOR: cudnnFinalize Descriptor Failed cudnn_status: CUDNN_STATUS_NOT_SUPPORTED (Triggered internally at ../aten/src/ATen/native/cudnn/Conv_v8.cpp:919.)
  return F.conv1d(input, weight, bias, self.stride,

OS: Ubuntu 20
Python: 3.8

Thank you very much to anyone who can help me!
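
For anyone who hits the same two symptoms (no visible weights even though the log reports "saving final ckpt:Success", and the index-linking failure above), here is a minimal Python sketch for checking both by hand. It is not taken from the project's code. The helper names `list_weights` and `expose_index` are hypothetical, and the paths `assets/weights` and `logs/yurika-test01` are assumptions about the default layout that this thread does not confirm; only `assets/indices` and the index file name come from the log. The sketch also assumes the link failed for a mundane reason such as a missing target folder or a stale link, which may not be the actual cause.

```python
import os
import shutil
from pathlib import Path


def list_weights(weights_dir, exp_name):
    """Return sorted .pth files in weights_dir whose names start with exp_name."""
    root = Path(weights_dir)
    if not root.is_dir():
        # The folder itself is missing, so there is nothing to list.
        return []
    return sorted(p for p in root.glob("*.pth") if p.name.startswith(exp_name))


def expose_index(index_path, target_dir):
    """Make index_path available inside target_dir: symlink first, copy as fallback."""
    src = Path(index_path).resolve()
    dst_dir = Path(target_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)  # create the target folder if absent
    dst = dst_dir / src.name
    if dst.is_symlink() or dst.exists():
        dst.unlink()  # remove a stale link or file left by an earlier attempt
    try:
        os.symlink(src, dst)
    except OSError:
        # Symlinks may be unsupported or not permitted; fall back to a plain copy.
        shutil.copy2(src, dst)
    return dst


if __name__ == "__main__":
    # Assumed locations; adjust them to your own checkout.
    exp_name = "yurika-test01"
    print(list_weights("assets/weights", exp_name))

    index = Path("logs") / exp_name / "added_IVF1044_Flat_nprobe_1_yurika-test01_v2.index"
    if index.exists():
        print(expose_index(index, "assets/indices"))
```

Run it from the repository root. An empty list from `list_weights` only means that no matching `.pth` file was found in the assumed folder; it does not by itself show that saving failed.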