babysor / MockingBird

🚀 AI voice cloning: Clone a voice in 5 seconds to generate arbitrary speech in real-time

When will training my own HiFi-GAN vocoder finish? #871

Open hujb2000 opened 1 year ago

hujb2000 commented 1 year ago

The log is as follows:

C:\ProgramData\Anaconda3\envs\mockingbird\python.exe E:\workspace\MockingBird\control\cli\vocoder_train.py my_run e:\datasets hifigan -m E:\workspace\MockingBird\data\ckpt\vocoder\saved_models
Arguments:
    run_id: my_run
    vocoder_type: hifigan
    syn_dir: e:\datasets\SV2TTS\synthesizer
    voc_dir: e:\datasets\SV2TTS\vocoder
    models_dir: E:\workspace\MockingBird\data\ckpt\vocoder\saved_models
    ground_truth: False
    save_every: 1000
    backup_every: 25000
    force_restart: False
    config: models/vocoder/hifigan/config16k.json

Generator(
  (conv_pre): Conv1d(80, 512, kernel_size=(7,), stride=(1,), padding=(3,))
  (ups): ModuleList(
    (0): ConvTranspose1d(512, 256, kernel_size=(10,), stride=(5,), padding=(3,), output_padding=(1,))
    (1): ConvTranspose1d(256, 128, kernel_size=(10,), stride=(5,), padding=(3,), output_padding=(1,))
    (2): ConvTranspose1d(128, 64, kernel_size=(8,), stride=(4,), padding=(2,))
    (3): ConvTranspose1d(64, 32, kernel_size=(4,), stride=(2,), padding=(1,))
  )
  (resblocks): ModuleList(
    (0): ResBlock1(
      (convs1): ModuleList(
        (0): Conv1d(256, 256, kernel_size=(3,), stride=(1,), padding=(1,))
        (1): Conv1d(256, 256, kernel_size=(3,), stride=(1,), padding=(3,), dilation=(3,))
        (2): Conv1d(256, 256, kernel_size=(3,), stride=(1,), padding=(5,), dilation=(5,))
      )
      (convs2): ModuleList(
        (0-2): 3 x Conv1d(256, 256, kernel_size=(3,), stride=(1,), padding=(1,))
      )
    )
    (1): ResBlock1(
      (convs1): ModuleList(
        (0): Conv1d(256, 256, kernel_size=(7,), stride=(1,), padding=(3,))
        (1): Conv1d(256, 256, kernel_size=(7,), stride=(1,), padding=(9,), dilation=(3,))
        (2): Conv1d(256, 256, kernel_size=(7,), stride=(1,), padding=(15,), dilation=(5,))
      )
      (convs2): ModuleList(
        (0-2): 3 x Conv1d(256, 256, kernel_size=(7,), stride=(1,), padding=(3,))
      )
    )
    (2): ResBlock1(
      (convs1): ModuleList(
        (0): Conv1d(256, 256, kernel_size=(11,), stride=(1,), padding=(5,))
        (1): Conv1d(256, 256, kernel_size=(11,), stride=(1,), padding=(15,), dilation=(3,))
        (2): Conv1d(256, 256, kernel_size=(11,), stride=(1,), padding=(25,), dilation=(5,))
      )
      (convs2): ModuleList(
        (0-2): 3 x Conv1d(256, 256, kernel_size=(11,), stride=(1,), padding=(5,))
      )
    )
    (3): ResBlock1(
      (convs1): ModuleList(
        (0): Conv1d(128, 128, kernel_size=(3,), stride=(1,), padding=(1,))
        (1): Conv1d(128, 128, kernel_size=(3,), stride=(1,), padding=(3,), dilation=(3,))
        (2): Conv1d(128, 128, kernel_size=(3,), stride=(1,), padding=(5,), dilation=(5,))
      )
      (convs2): ModuleList(
        (0-2): 3 x Conv1d(128, 128, kernel_size=(3,), stride=(1,), padding=(1,))
      )
    )
    (4): ResBlock1(
      (convs1): ModuleList(
        (0): Conv1d(128, 128, kernel_size=(7,), stride=(1,), padding=(3,))
        (1): Conv1d(128, 128, kernel_size=(7,), stride=(1,), padding=(9,), dilation=(3,))
        (2): Conv1d(128, 128, kernel_size=(7,), stride=(1,), padding=(15,), dilation=(5,))
      )
      (convs2): ModuleList(
        (0-2): 3 x Conv1d(128, 128, kernel_size=(7,), stride=(1,), padding=(3,))
      )
    )
    (5): ResBlock1(
      (convs1): ModuleList(
        (0): Conv1d(128, 128, kernel_size=(11,), stride=(1,), padding=(5,))
        (1): Conv1d(128, 128, kernel_size=(11,), stride=(1,), padding=(15,), dilation=(3,))
        (2): Conv1d(128, 128, kernel_size=(11,), stride=(1,), padding=(25,), dilation=(5,))
      )
      (convs2): ModuleList(
        (0-2): 3 x Conv1d(128, 128, kernel_size=(11,), stride=(1,), padding=(5,))
      )
    )
    (6): ResBlock1(
      (convs1): ModuleList(
        (0): Conv1d(64, 64, kernel_size=(3,), stride=(1,), padding=(1,))
        (1): Conv1d(64, 64, kernel_size=(3,), stride=(1,), padding=(3,), dilation=(3,))
        (2): Conv1d(64, 64, kernel_size=(3,), stride=(1,), padding=(5,), dilation=(5,))
      )
      (convs2): ModuleList(
        (0-2): 3 x Conv1d(64, 64, kernel_size=(3,), stride=(1,), padding=(1,))
      )
    )
    (7): ResBlock1(
      (convs1): ModuleList(
        (0): Conv1d(64, 64, kernel_size=(7,), stride=(1,), padding=(3,))
        (1): Conv1d(64, 64, kernel_size=(7,), stride=(1,), padding=(9,), dilation=(3,))
        (2): Conv1d(64, 64, kernel_size=(7,), stride=(1,), padding=(15,), dilation=(5,))
      )
      (convs2): ModuleList(
        (0-2): 3 x Conv1d(64, 64, kernel_size=(7,), stride=(1,), padding=(3,))
      )
    )
    (8): ResBlock1(
      (convs1): ModuleList(
        (0): Conv1d(64, 64, kernel_size=(11,), stride=(1,), padding=(5,))
        (1): Conv1d(64, 64, kernel_size=(11,), stride=(1,), padding=(15,), dilation=(3,))
        (2): Conv1d(64, 64, kernel_size=(11,), stride=(1,), padding=(25,), dilation=(5,))
      )
      (convs2): ModuleList(
        (0-2): 3 x Conv1d(64, 64, kernel_size=(11,), stride=(1,), padding=(5,))
      )
    )
    (9): ResBlock1(
      (convs1): ModuleList(
        (0): Conv1d(32, 32, kernel_size=(3,), stride=(1,), padding=(1,))
        (1): Conv1d(32, 32, kernel_size=(3,), stride=(1,), padding=(3,), dilation=(3,))
        (2): Conv1d(32, 32, kernel_size=(3,), stride=(1,), padding=(5,), dilation=(5,))
      )
      (convs2): ModuleList(
        (0-2): 3 x Conv1d(32, 32, kernel_size=(3,), stride=(1,), padding=(1,))
      )
    )
    (10): ResBlock1(
      (convs1): ModuleList(
        (0): Conv1d(32, 32, kernel_size=(7,), stride=(1,), padding=(3,))
        (1): Conv1d(32, 32, kernel_size=(7,), stride=(1,), padding=(9,), dilation=(3,))
        (2): Conv1d(32, 32, kernel_size=(7,), stride=(1,), padding=(15,), dilation=(5,))
      )
      (convs2): ModuleList(
        (0-2): 3 x Conv1d(32, 32, kernel_size=(7,), stride=(1,), padding=(3,))
      )
    )
    (11): ResBlock1(
      (convs1): ModuleList(
        (0): Conv1d(32, 32, kernel_size=(11,), stride=(1,), padding=(5,))
        (1): Conv1d(32, 32, kernel_size=(11,), stride=(1,), padding=(15,), dilation=(3,))
        (2): Conv1d(32, 32, kernel_size=(11,), stride=(1,), padding=(25,), dilation=(5,))
      )
      (convs2): ModuleList(
        (0-2): 3 x Conv1d(32, 32, kernel_size=(11,), stride=(1,), padding=(5,))
      )
    )
  )
  (conv_post): Conv1d(32, 1, kernel_size=(7,), stride=(1,), padding=(3,))
)
checkpoints directory : E:\workspace\MockingBird\data\ckpt\vocoder\saved_models\my_run_hifigan
Epoch: 1
C:\Users\Administrator\AppData\Roaming\Python\Python39\site-packages\torch\functional.py:641: UserWarning: stft with return_complex=False is deprecated. In a future pytorch release, stft will return complex tensors for all inputs, and return_complex=False will raise an error.
Note: you can still call torch.view_as_real on the complex output to recover the old return format. (Triggered internally at C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\aten\src\ATen\native\SpectralOps.cpp:867.)
  return _VF.stft(input, n_fft, hop_length, win_length, window, # type: ignore[attr-defined]
Steps : 0, Gen Loss Total : 108.894, Mel-Spec. Error : 2.420, s/b : 4.006
Saving checkpoint to E:\workspace\MockingBird\data\ckpt\vocoder\saved_models\my_run_hifigan/g_hifigan.pt
Complete.
Saving checkpoint to E:\workspace\MockingBird\data\ckpt\vocoder\saved_models\my_run_hifigan/do_hifigan.pt
Complete.
min value is tensor(-1.0063)
max value is tensor(1.0012)
min value is tensor(-1.0209)
max value is tensor(1.0101)
min value is tensor(-1.0140)
max value is tensor(1.0183)
max value is tensor(1.0032)
min value is tensor(-1.0050)
min value is tensor(-1.0193)
min value is tensor(-1.0559)
min value is tensor(-1.0963)
max value is tensor(1.0976)
Steps : 5, Gen Loss Total : 89.431, Mel-Spec. Error : 1.926, s/b : 1.124
Steps : 10, Gen Loss Total : 81.086, Mel-Spec. Error : 1.735, s/b : 1.120
Steps : 15, Gen Loss Total : 78.217, Mel-Spec. Error : 1.641, s/b : 1.121
Steps : 20, Gen Loss Total : 88.838, Mel-Spec. Error : 1.844, s/b : 1.166
Steps : 25, Gen Loss Total : 91.468, Mel-Spec. Error : 1.795, s/b : 1.123
min value is tensor(-1.0038)
Steps : 30, Gen Loss Total : 81.358, Mel-Spec. Error : 1.619, s/b : 1.123
Steps : 35, Gen Loss Total : 81.096, Mel-Spec. Error : 1.561, s/b : 1.166
Steps : 40, Gen Loss Total : 85.270, Mel-Spec. Error : 1.656, s/b : 1.156
Steps : 45, Gen Loss Total : 85.788, Mel-Spec. Error : 1.671, s/b : 1.159
Steps : 50, Gen Loss Total : 81.539, Mel-Spec. Error : 1.538, s/b : 1.117
min value is tensor(-1.0265)
max value is tensor(1.0530)
Steps : 55, Gen Loss Total : 78.400, Mel-Spec. Error : 1.483, s/b : 1.122
min value is tensor(-1.0625)
max value is tensor(1.0293)
Steps : 1000, Gen Loss Total : 76.684, Mel-Spec. Error : 1.461, s/b : 1.154
Saving checkpoint to E:\workspace\MockingBird\data\ckpt\vocoder\saved_models\my_run_hifigan/g_hifigan.pt
Complete.
Saving checkpoint to E:\workspace\MockingBird\data\ckpt\vocoder\saved_models\my_run_hifigan/do_hifigan.pt
Complete.
Steps : 1380, Gen Loss Total : 77.188, Mel-Spec. Error : 1.468, s/b : 1.160
Steps : 1385, Gen Loss Total : 75.907, Mel-Spec. Error : 1.452, s/b : 1.161
Steps : 1390, Gen Loss Total : 70.489, Mel-Spec. Error : 1.303, s/b : 1.159
Steps : 1395, Gen Loss Total : 70.901, Mel-Spec. Error : 1.378, s/b : 1.162
Steps : 1400, Gen Loss Total : 69.194, Mel-Spec. Error : 1.249, s/b : 1.175

(I removed some of the Steps log lines in between.)
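The progress lines in the log above have a regular shape, so they can be parsed to follow the Mel-Spec. Error trend and to estimate wall-clock time from the s/b (seconds per batch) column. The following is a minimal sketch, assuming the line format stays exactly as printed here; the regex, the `parse_progress` helper, and the three sample lines copied from the log are illustrative and not part of MockingBird. The log shown does not state a stopping step, so this only measures trend and throughput and does not say when training ends.

```python
import re

# Matches progress lines of the form printed in the log, e.g.
# "Steps : 1400, Gen Loss Total : 69.194, Mel-Spec. Error : 1.249, s/b : 1.175"
PROGRESS = re.compile(
    r"Steps\s*:\s*(\d+),\s*Gen Loss Total\s*:\s*([\d.]+),\s*"
    r"Mel-Spec\. Error\s*:\s*([\d.]+),\s*s/b\s*:\s*([\d.]+)"
)


def parse_progress(log_text):
    """Return a list of (step, gen_loss, mel_error, sec_per_batch) tuples."""
    return [
        (int(step), float(gen), float(mel), float(spb))
        for step, gen, mel, spb in PROGRESS.findall(log_text)
    ]


# Three lines copied from the log; a real run would read the full console output.
sample = """
Steps : 0, Gen Loss Total : 108.894, Mel-Spec. Error : 2.420, s/b : 4.006
Steps : 1000, Gen Loss Total : 76.684, Mel-Spec. Error : 1.461, s/b : 1.154
Steps : 1400, Gen Loss Total : 69.194, Mel-Spec. Error : 1.249, s/b : 1.175
"""

rows = parse_progress(sample)
first, last = rows[0], rows[-1]

# How far the mel-spectrogram error has dropped between the first and last line.
mel_drop = first[2] - last[2]

# Rough throughput estimate based on the most recent seconds-per-batch value.
hours_per_10k = 10_000 * last[3] / 3600

print(f"Mel-Spec. Error fell by {mel_drop:.3f} over {last[0]} steps")
print(f"Roughly {hours_per_10k:.2f} hours per 10,000 steps at {last[3]} s/b")
```

Comparing the Mel-Spec. Error across saved checkpoints, together with listening to synthesized samples, is one common way to decide when to stop a run like this by hand.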

HaSaKiYasuooo commented 1 year ago

Hi, how can this problem be solved?

HaSaKiYasuooo commented 1 year ago

Is the -m argument required to specify the save path in order to get model files? Right now my vocoder directory only contains log files.

HaSaKiYasuooo commented 1 year ago

The error is: mel() takes 0 positional arguments but 5 were given. See https://stackoverflow.com/questions/75796284/typeerror-mel-takes-0-positional-arguments-but-5-were-given. The fix is to modify utils/audio_utils.py.
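For context, this TypeError is what Python raises when a function whose parameters are all keyword-only is called with positional arguments. In librosa 0.10 and later, `librosa.filters.mel` takes its parameters as keywords, so an older positional call fails in exactly this way. The sketch below uses a stand-in `mel` function (hypothetical, so that it runs without librosa installed), and the numeric values are illustrative assumptions rather than the project's actual settings; the exact line to change in utils/audio_utils.py is not shown in this thread.

```python
# Stand-in with the same keyword-only style of signature that
# librosa.filters.mel uses in librosa >= 0.10. This is NOT the real
# function; it only echoes its arguments so the example is self-contained.
def mel(*, sr, n_fft, n_mels=128, fmin=0.0, fmax=None):
    return {"sr": sr, "n_fft": n_fft, "n_mels": n_mels, "fmin": fmin, "fmax": fmax}


# Illustrative values only (assumptions, not read from the project's config).
sr, n_fft, n_mels, fmin, fmax = 16000, 1024, 80, 0, 8000

# Old positional call style: rejected because every parameter is keyword-only.
try:
    mel(sr, n_fft, n_mels, fmin, fmax)
    positional_error = None
except TypeError as exc:
    positional_error = str(exc)

print(positional_error)

# Keyword call style: accepted.
basis_args = mel(sr=sr, n_fft=n_fft, n_mels=n_mels, fmin=fmin, fmax=fmax)
print(basis_args)
```

With the real library, the equivalent change would be to pass `sr=`, `n_fft=`, `n_mels=`, `fmin=`, and `fmax=` by name in the call to `librosa.filters.mel`; pinning librosa to a release before 0.10 is another way to avoid the error.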