PlayVoice / vits_chinese

Best practice TTS based on BERT and VITS with some Natural Speech Features Of Microsoft; Support ONNX streaming out!
https://huggingface.co/spaces/maxmax20160403/vits_chinese
MIT License

Hello, how many iterations did you train for? #87

Closed: liumingda closed this issue 1 year ago

liumingda commented 1 year ago

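1. Hello, I only changed 16000 to 48000 in the config file and left everything else unchanged. I have trained for 730,000 iterations so far, but the inference results are not as good as the examples you provided. How many iterations was the model behind your examples trained for?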

2. Another question: I added my own training set, which requires changing _tones = ["1", "2", "3", "4", "5"] to _tones = ["1", "2", "3", "4", "5", "6"] in text.symbols. With that change, training fails with this error:

    exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)
    RuntimeError: The size of tensor a (219) must match the size of tensor b (257) at non-singleton dimension 0

Here exp_avg_shape is torch.Size([219, 192]) and grad_shape is torch.Size([257, 192]). I don't understand why the size of exp_avg was not updated to match. Thank you!

MaxMax2016 commented 1 year ago

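1. It was a while ago and I don't remember exactly; I think it was batch_size 24 and 500K iterations. You still need to look at kl_loss and mel_loss to determine whether the training time is insufficient. Also, after increasing the sampling rate, segment_size should be increased appropriately as well.

2. Changing text.symbols amounts to changing the model, so the model has to be trained from scratch. I could not locate the code that corresponds to exp_avg_shape; do you have more detailed information?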
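As an illustration of the segment_size advice above, here is a minimal sketch that scales a segment length so it covers the same duration at the new sampling rate. Proportional scaling is an assumption (the reply only says to increase it appropriately), the helper name `scale_segment_size` is hypothetical, and the numeric values are placeholders rather than this repository's defaults. The result is rounded down to a multiple of hop_length because the training code divides segment_size by hop_length.

```python
def scale_segment_size(segment_size, old_sr, new_sr, hop_length):
    """Scale a segment length (in samples) to cover the same duration at a
    new sampling rate, rounded down to a multiple of hop_length."""
    scaled = segment_size * new_sr // old_sr
    # Keep at least one hop so the result is never zero.
    return max(hop_length, scaled // hop_length * hop_length)


# Hypothetical values for illustration only; read the real ones from your config.
old_segment_size = 8000
hop_length = 256
new_segment_size = scale_segment_size(old_segment_size, 16000, 48000, hop_length)
print(new_segment_size)
```

Whether hop_length and the other STFT settings should also change with the sampling rate is a separate decision that this sketch does not make.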

liumingda commented 1 year ago

Thank you. The code corresponding to exp_avg_shape is in the adamw function in torch/optim/_functional.py. Could you take a look?

MaxMax2016 commented 1 year ago

Please paste the complete error message.

liumingda commented 1 year ago

Traceback (most recent call last):
  File "train.py", line 438, in <module>
    main()
  File "train.py", line 43, in main
    mp.spawn(
  File "/opt/conda/envs/diffsinger/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/opt/conda/envs/diffsinger/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():
  File "/opt/conda/envs/diffsinger/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 150, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException:

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/opt/conda/envs/diffsinger/lib/python3.8/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/home/notebook/code/personal/S9052934/vits/tts/vits_chinese-master/train.py", line 159, in run
    train_and_evaluate(
  File "/home/notebook/code/personal/S9052934/vits/tts/vits_chinese-master/train.py", line 277, in train_and_evaluate
    scaler.step(optim_g)
  File "/opt/conda/envs/diffsinger/lib/python3.8/site-packages/torch/cuda/amp/grad_scaler.py", line 310, in step
    return optimizer.step(*args, **kwargs)
  File "/opt/conda/envs/diffsinger/lib/python3.8/site-packages/torch/optim/lr_scheduler.py", line 65, in wrapper
    return wrapped(*args, **kwargs)
  File "/opt/conda/envs/diffsinger/lib/python3.8/site-packages/torch/optim/optimizer.py", line 88, in wrapper
    return func(*args, **kwargs)
  File "/opt/conda/envs/diffsinger/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/envs/diffsinger/lib/python3.8/site-packages/torch/optim/adamw.py", line 137, in step
    F.adamw(params_with_grad,
  File "/opt/conda/envs/diffsinger/lib/python3.8/site-packages/torch/optim/_functional.py", line 131, in adamw
    exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)
RuntimeError: The size of tensor a (219) must match the size of tensor b (257) at non-singleton dimension 0

That is the complete error message. Thank you for your help!

MaxMax2016 commented 1 year ago

If the network structure has changed, the model has to be trained from scratch, right?

liumingda commented 1 year ago

Oh, but I didn't change the network structure.

MaxMax2016 commented 1 year ago

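# The generator receives len(symbols) as its first argument.
net_g = utils.load_class(hps.train.train_class)(
    len(symbols),
    hps.data.filter_length // 2 + 1,
    hps.train.segment_size // hps.data.hop_length,
    **hps.model,
).cuda(rank)

# That value is passed to the text encoder as n_vocab.
self.enc_p = TextEncoder(
    n_vocab,
    inter_channels,
    hidden_channels,
    filter_channels,
    n_heads,
    n_layers,
    kernel_size,
    p_dropout,
)

# n_vocab sets the number of rows in the embedding table.
self.emb = nn.Embedding(n_vocab, hidden_channels)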
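The snippets above show that len(symbols) fixes the embedding shape, so a checkpoint saved before the symbol change no longer fits. One plausible reading of the error is that optimizer state restored from the old checkpoint still has the old shape, since AdamW keeps an exp_avg buffer with the same shape as each parameter. The sketch below is a plain-Python check that could run before resuming; the helper name `find_shape_mismatches` is hypothetical, and the parameter name "enc_p.emb.weight" is an assumption inferred from the snippets, not confirmed by the thread.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (saved_shape, expected_shape)} for tensors whose
    shape in the checkpoint differs from what the current model expects."""
    mismatches = {}
    for name, expected in model_shapes.items():
        saved = checkpoint_shapes.get(name)
        if saved is not None and tuple(saved) != tuple(expected):
            mismatches[name] = (tuple(saved), tuple(expected))
    return mismatches


# Shapes taken from the error message in this thread; the parameter
# name is an assumption made for illustration.
saved = {"enc_p.emb.weight": (219, 192)}
current = {"enc_p.emb.weight": (257, 192)}
print(find_shape_mismatches(saved, current))
```

With PyTorch, the shape dictionaries could be built with something like `{k: tuple(v.shape) for k, v in model.state_dict().items()}`; how the saved checkpoint is laid out depends on the training script. A non-empty result suggests starting a fresh run rather than resuming.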

liumingda commented 1 year ago

Oh, thank you, I'll look into it!

godspirit00 commented 1 year ago

You still need to look at kl_loss and mel_loss to determine whether the training time is insufficient

@MaxMax2016 May I ask how to tell whether the model has been trained long enough, or has already been trained too long?

MaxMax2016 commented 1 year ago

Once the loss curve has flattened out, the model has been trained enough.
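The "flat loss curve" rule above can be approximated numerically. The following is a minimal sketch, not the author's method: it compares the mean of the most recent window of logged loss values with the mean of the window before it and reports a plateau when the relative change is small. The function name `has_plateaued`, the window size, and the tolerance are all hypothetical choices.

```python
def has_plateaued(losses, window=100, rel_tol=0.01):
    """Return True when the mean of the last `window` losses differs from
    the mean of the preceding `window` losses by less than `rel_tol`."""
    if len(losses) < 2 * window:
        return False  # not enough history to compare two windows
    recent = sum(losses[-window:]) / window
    previous = sum(losses[-2 * window:-window]) / window
    if previous == 0:
        return recent == 0
    return abs(previous - recent) / abs(previous) < rel_tol


# Synthetic curves for illustration only, not real training logs.
falling = [100 - i for i in range(100)]
flat = [2.0] * 50
print(has_plateaued(falling, window=10), has_plateaued(flat, window=10))
```

In practice the same check could be applied separately to the logged kl_loss and mel_loss values. GAN-style losses are noisy, so a large window and a visual look at the curves remain advisable.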