lmnt-com / diffwave

DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
Apache License 2.0

Error occurred when I try to reload the pretrained model for inference #45

Closed Honee-W closed 1 year ago

Honee-W commented 1 year ago

I downloaded the checkpoint offered in the README and used the inference API with the params unchanged, but an error occurred. Here's the code and the error info:

Code:

```python
import numpy as np
import soundfile as sf
import torch

from inference import predict as diffwave_predict

model_dir = './diffwave-ljspeech-22kHz-1000578.pt'

# Load the precomputed mel spectrogram and add a batch dimension if needed.
spectrogram = torch.from_numpy(np.load('./upsample_test.wav.spec.npy'))
if len(spectrogram.shape) == 2:
    spectrogram = spectrogram.unsqueeze(0)

audio, sample_rate = diffwave_predict(spectrogram, model_dir, fast_sampling=True)
sf.write('./output.wav', audio, sample_rate)
```

Error info:

```
Traceback (most recent call last):
  File "test.py", line 6, in <module>
    audio, sample_rate = diffwave_predict(spectrogram, model_dir, fast_sampling=True)
  File "/home/work_nfs6/zqwang/workspace/voicefilter/model/diffusion_model/inference.py", line 40, in predict
    model.load_state_dict(checkpoint['model'])
  File "/home/environment/zqwang/anaconda3/envs/py38pt17/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DiffWave:
	Unexpected key(s) in state_dict:
	"spectrogram_upsampler.conv1.weight", "spectrogram_upsampler.conv1.bias",
	"spectrogram_upsampler.conv2.weight", "spectrogram_upsampler.conv2.bias",
	"residual_layers.0.conditioner_projection.weight", "residual_layers.0.conditioner_projection.bias",
	"residual_layers.1.conditioner_projection.weight", "residual_layers.1.conditioner_projection.bias",
	"residual_layers.2.conditioner_projection.weight", "residual_layers.2.conditioner_projection.bias",
	"residual_layers.3.conditioner_projection.weight", "residual_layers.3.conditioner_projection.bias",
	"residual_layers.4.conditioner_projection.weight", "residual_layers.4.conditioner_projection.bias",
	"residual_layers.5.conditioner_projection.weight", "residual_layers.5.conditioner_projection.bias",
	"residual_layers.6.conditioner_projection.weight", "residual_layers.6.conditioner_projection.bias",
	"residual_layers.7.conditioner_projection.weight", "residual_layers.7.conditioner_projection.bias",
	"residual_layers.8.conditioner_projection.weight", "residual_layers.8.conditioner_projection.bias",
	"residual_layers.9.conditioner_projection.weight", "residual_layers.9.conditioner_projection.bias",
	"residual_layers.10.conditioner_projection.weight", "residual_layers.10.conditioner_projection.bias",
	"residual_layers.11.conditioner_projection.weight", "residual_layers.11.conditioner_projection.bias",
	"residual_layers.12.conditioner_projection.weight", "residual_layers.12.conditioner_projection.bias",
	"residual_layers.13.conditioner_projection.weight", "residual_layers.13.conditioner_projection.bias",
	"residual_layers.14.conditioner_projection.weight", "residual_layers.14.conditioner_projection.bias",
	"residual_layers.15.conditioner_projection.weight", "residual_layers.15.conditioner_projection.bias",
	"residual_layers.16.conditioner_projection.weight", "residual_layers.16.conditioner_projection.bias",
	"residual_layers.17.conditioner_projection.weight", "residual_layers.17.conditioner_projection.bias",
	"residual_layers.18.conditioner_projection.weight", "residual_layers.18.conditioner_projection.bias",
	"residual_layers.19.conditioner_projection.weight", "residual_layers.19.conditioner_projection.bias",
	"residual_layers.20.conditioner_projection.weight", "residual_layers.20.conditioner_projection.bias",
	"residual_layers.21.conditioner_projection.weight", "residual_layers.21.conditioner_projection.bias",
	"residual_layers.22.conditioner_projection.weight", "residual_layers.22.conditioner_projection.bias",
	"residual_layers.23.conditioner_projection.weight", "residual_layers.23.conditioner_projection.bias",
	"residual_layers.24.conditioner_projection.weight", "residual_layers.24.conditioner_projection.bias",
	"residual_layers.25.conditioner_projection.weight", "residual_layers.25.conditioner_projection.bias",
	"residual_layers.26.conditioner_projection.weight", "residual_layers.26.conditioner_projection.bias",
	"residual_layers.27.conditioner_projection.weight", "residual_layers.27.conditioner_projection.bias",
	"residual_layers.28.conditioner_projection.weight", "residual_layers.28.conditioner_projection.bias",
	"residual_layers.29.conditioner_projection.weight", "residual_layers.29.conditioner_projection.bias".
```

Thanks in advance! Looking forward to your reply!
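A quick way to see where the mismatch comes from is to diff the checkpoint's keys against the keys of a freshly constructed model. A minimal sketch, assuming the DiffWave class and params object shipped in this repo (diffwave.model.DiffWave, diffwave.params.params):

```python
# Sketch: compare checkpoint keys with the keys of the model as currently configured.
import torch
from diffwave.model import DiffWave
from diffwave.params import params

checkpoint = torch.load('./diffwave-ljspeech-22kHz-1000578.pt', map_location='cpu')
model = DiffWave(params)

ckpt_keys = set(checkpoint['model'].keys())
model_keys = set(model.state_dict().keys())

# Keys present in the checkpoint but missing from the constructed model,
# e.g. spectrogram_upsampler.* and *.conditioner_projection.* when the
# model was built without conditioning layers.
print(sorted(ckpt_keys - model_keys))
```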

Honee-W commented 1 year ago

It seems the provided checkpoint is a conditional model. I solved the problem by setting unconditional in params.py to False. I'm using the pretrained model for speech separation; sorry for being such an amateur.
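For anyone hitting the same error, a minimal sketch of loading the checkpoint with the conditional configuration; it assumes params.py exposes an unconditional field and that DiffWave is constructed from that params object, as in this repo:

```python
# Sketch: build the conditional model so the pretrained LJSpeech checkpoint's
# conditioning weights (spectrogram_upsampler, conditioner_projection) match.
import torch
from diffwave.model import DiffWave
from diffwave.params import params

params.unconditional = False  # equivalent to editing params.py as described above

checkpoint = torch.load('./diffwave-ljspeech-22kHz-1000578.pt', map_location='cpu')
model = DiffWave(params)
model.load_state_dict(checkpoint['model'])  # loads without unexpected-key errors
model.eval()
```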