r9y9 / wavenet_vocoder

WaveNet vocoder
https://r9y9.github.io/wavenet_vocoder/

Error when using pretrained model with mel-specs from "deepvoice3-pytorch" #204

Open ymzlygw opened 3 years ago

ymzlygw commented 3 years ago

Hey, when I try to synthesize a wav from mel files, this error occurs:

python synthesis.py --conditional=./output_mel/0_mel.npy ./wavenet_premodel/20180510_mixture_lj_checkpoint_step000320000_ema.pth generated
/root/anaconda3/envs/keras/lib/python3.6/site-packages/sklearn/utils/deprecation.py:143: FutureWarning: The sklearn.preprocessing.data module is deprecated in version 0.22 and will be removed in version 0.24. The corresponding classes / functions should instead be imported from sklearn.preprocessing. Anything that cannot be imported from sklearn.preprocessing is now part of the private API.
  warnings.warn(message, FutureWarning)
Using TensorFlow backend.
Command line args:
 {'--conditional': './output_mel/0_mel.npy', '--file-name-suffix': '', '--help': False, '--hparams': '', '--initial-value': None, '--length': '32000', '--max-abs-value': '-1', '--output-html': False, '--preset': None, '--speaker-id': None, '--symmetric-mels': False, '': './wavenet_premodel/20180510_mixture_lj_checkpoint_step000320000_ema.pth', '': 'generated'}
Load checkpoint from ./wavenet_premodel/20180510_mixture_lj_checkpoint_step000320000_ema.pth
  0%| | 0/20480 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "synthesis.py", line 200, in <module>
    waveform = wavegen(model, length, c=c, g=speaker_id, initial_value=initial_value, fast=True)
  File "synthesis.py", line 127, in wavegen
    log_scale_min=hparams.log_scale_min)
  File "/root/AI/wavenet_vocoder/wavenet_vocoder/wavenet.py", line 335, in incremental_forward
    x, h = f.incremental_forward(x, ct, gt)
  File "/root/AI/wavenet_vocoder/wavenet_vocoder/modules.py", line 135, in incremental_forward
    return self._forward(x, c, g, True)
  File "/root/AI/wavenet_vocoder/wavenet_vocoder/modules.py", line 165, in _forward
    c = _conv1x1_forward(self.conv1x1c, c, is_incremental)
  File "/root/AI/wavenet_vocoder/wavenet_vocoder/modules.py", line 55, in _conv1x1_forward
    x = conv.incremental_forward(x)
  File "/root/AI/wavenet_vocoder/wavenet_vocoder/conv.py", line 45, in incremental_forward
    output = F.linear(input.view(bsz, -1), weight, self.bias)
  File "/root/anaconda3/envs/keras/lib/python3.6/site-packages/torch/nn/functional.py", line 1370, in linear
    ret = torch.addmm(bias, input, weight.t())
RuntimeError: size mismatch, m1: [1 x 48], m2: [80 x 512] at /tmp/pip-req-build-808afw3c/aten/src/THC/generic/THCTensorMathBlas.cu:290

The mel (.npy) file was generated by the "deepvoice3-pytorch" project. I extracted the mel-spectrogram there (without using lws to synthesize the wav in deepvoice3-pytorch) and want to synthesize the audio with wavenet_vocoder instead. But this error occurs. Can you help me with this?

ymzlygw commented 3 years ago

The following info may be helpful.

ar = np.load("0_mel.npy")
ar.shape ==> (80, 48)
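The numbers above point to an axis-order problem: the failing matrix multiply receives 48 input features (m1: [1 x 48]) where the layer weight expects 80 (m2: [80 x 512]), and the saved array is (80, 48). One plausible reading is that the file is stored as (num_mels, frames) while the conditioning input is consumed as (frames, num_mels). The following is only a sketch under that assumption; `to_time_major` and `NUM_MELS` are hypothetical names, not part of either project, and the zero array stands in for the real file.

```python
import numpy as np

# Assumption: the pretrained model was trained with 80 mel channels,
# inferred from the first dimension of m2 in the error message.
NUM_MELS = 80


def to_time_major(mel, num_mels=NUM_MELS):
    """Return mel as (frames, num_mels), transposing a (num_mels, frames) array.

    Hypothetical helper for illustration only.
    """
    if mel.ndim != 2:
        raise ValueError("expected a 2-D array, got shape %r" % (mel.shape,))
    if mel.shape[1] == num_mels:
        # Already (frames, num_mels); a square array is left untouched.
        return mel
    if mel.shape[0] == num_mels:
        # Stored as (num_mels, frames): swap the two axes.
        return mel.T
    raise ValueError("no axis of length %d in shape %r" % (num_mels, mel.shape))


# Stand-in for np.load("0_mel.npy"), using the shape reported above.
mel = np.zeros((80, 48), dtype=np.float32)
fixed = to_time_major(mel)
print(fixed.shape)  # (48, 80)
```

If this is the cause, saving the transposed array with `np.save` and passing that file to `--conditional` would be the thing to try. Even with matching shapes, the two projects may normalize mel values differently, so the audio quality would still need checking.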