auspicious3000 / SpeechSplit

Unsupervised Speech Decomposition Via Triple Information Bottleneck
http://arxiv.org/abs/2004.11284
MIT License

RuntimeError: Error(s) in loading state_dict for WaveNet #26

Closed vlad-i closed 3 years ago

vlad-i commented 3 years ago

Hello,

Thanks for uploading the code! I'm having trouble running the demo code and get this error:

RuntimeError: Error(s) in loading state_dict for WaveNet:
    Missing key(s) in state_dict: "upsample_net.conv_in.weight", "upsample_net.upsample.up_layers.1.weight_g", "upsample_net.upsample.up_layers.1.weight_v", "upsample_net.upsample.up_layers.3.weight_g", "upsample_net.upsample.up_layers.3.weight_v", "upsample_net.upsample.up_layers.5.weight_g", "upsample_net.upsample.up_layers.5.weight_v", "upsample_net.upsample.up_layers.7.weight_g", "upsample_net.upsample.up_layers.7.weight_v". 
    Unexpected key(s) in state_dict: "upsample_conv.0.bias", "upsample_conv.0.weight_g", "upsample_conv.0.weight_v", "upsample_conv.2.bias", "upsample_conv.2.weight_g", "upsample_conv.2.weight_v", "upsample_conv.4.bias", "upsample_conv.4.weight_g", "upsample_conv.4.weight_v", "upsample_conv.6.bias", "upsample_conv.6.weight_g", "upsample_conv.6.weight_v", "conv_layers.0.conv1x1c.bias", "conv_layers.1.conv1x1c.bias", "conv_layers.2.conv1x1c.bias", "conv_layers.3.conv1x1c.bias", "conv_layers.4.conv1x1c.bias", "conv_layers.5.conv1x1c.bias", "conv_layers.6.conv1x1c.bias", "conv_layers.7.conv1x1c.bias", "conv_layers.8.conv1x1c.bias", "conv_layers.9.conv1x1c.bias", "conv_layers.10.conv1x1c.bias", "conv_layers.11.conv1x1c.bias", "conv_layers.12.conv1x1c.bias", "conv_layers.13.conv1x1c.bias", "conv_layers.14.conv1x1c.bias", "conv_layers.15.conv1x1c.bias", "conv_layers.16.conv1x1c.bias", "conv_layers.17.conv1x1c.bias", "conv_layers.18.conv1x1c.bias", "conv_layers.19.conv1x1c.bias", "conv_layers.20.conv1x1c.bias", "conv_layers.21.conv1x1c.bias", "conv_layers.22.conv1x1c.bias", "conv_layers.23.conv1x1c.bias". 
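To see the pattern in an error like this more easily, the two key lists can be diffed and grouped by their top-level module name. The following is a minimal sketch, not code from this repo: the helper names `diff_state_dict_keys` and `count_by_prefix` are hypothetical, the key lists are a small sample copied from the error above, and `first_conv.weight` is an invented stand-in for a key both sides share. With a real model, the lists would presumably come from `model.state_dict().keys()` and from the keys of the loaded checkpoint's state dict.

```python
from collections import Counter


def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, mirroring the two lists
    that load_state_dict prints in its error message."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # model expects, checkpoint lacks
    unexpected = sorted(checkpoint_keys - model_keys)   # checkpoint has, model lacks
    return missing, unexpected


def count_by_prefix(keys):
    """Count keys by top-level module name (the text before the first dot)."""
    return Counter(key.split(".", 1)[0] for key in keys)


# Small sample of keys from the error above; "first_conv.weight" is a
# hypothetical key that both sides share, added only for illustration.
model_keys = [
    "upsample_net.conv_in.weight",
    "upsample_net.upsample.up_layers.1.weight_g",
    "first_conv.weight",
]
checkpoint_keys = [
    "upsample_conv.0.bias",
    "upsample_conv.0.weight_g",
    "conv_layers.0.conv1x1c.bias",
    "first_conv.weight",
]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print("missing by module:   ", dict(count_by_prefix(missing)))
print("unexpected by module:", dict(count_by_prefix(unexpected)))
```

When whole modules appear on only one side (here `upsample_net` versus `upsample_conv`), that suggests the checkpoint and the installed model code define the network differently, rather than the checkpoint file being corrupted.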

I also had size mismatches at first, which went away after I edited these lines in the wavenet_vocoder repo:

residual_channels=512,
gate_channels=512,  # split into 2 groups internally for gated activation
skip_out_channels=256,
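Size mismatches of this kind can be located by comparing parameter shapes name by name before changing any hyperparameters. This is a rough sketch under stated assumptions: `find_shape_mismatches` is a hypothetical helper, and the parameter names and shapes below are invented for illustration and are not taken from the actual checkpoint. With PyTorch, each dictionary could presumably be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
def find_shape_mismatches(model_shapes, checkpoint_shapes):
    """Return {name: (model_shape, checkpoint_shape)} for every parameter
    present on both sides whose shapes differ."""
    mismatches = {}
    for name, model_shape in model_shapes.items():
        ckpt_shape = checkpoint_shapes.get(name)
        if ckpt_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(model_shape), tuple(ckpt_shape))
    return mismatches


# Hypothetical names and shapes, for illustration only.
model_shapes = {
    "conv_a.weight": (128, 128, 3),
    "conv_b.weight": (256, 128, 1),
}
checkpoint_shapes = {
    "conv_a.weight": (512, 512, 3),
    "conv_b.weight": (256, 128, 1),
}

for name, (expected, found) in find_shape_mismatches(model_shapes, checkpoint_shapes).items():
    print(f"{name}: model expects {expected}, checkpoint has {found}")
```

The shapes stored in the checkpoint indicate which channel counts the model constructor needs in order to load it.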

Maybe it's something obvious to you. Thank you for publishing your code, and of course for your time. Much obliged.

FurkanGozukara commented 3 years ago

@vlad-i Could you check my problem? It seems like you have progressed further than I have.

https://github.com/auspicious3000/SpeechSplit/issues/28

vlad-i commented 3 years ago

@FurkanGozukara I'm happy you're also looking at this repo, though I'm afraid you've progressed further than I have; I've only dabbled with the inference code using the provided pretrained weights.

I was about to try training as well.

vlad-i commented 3 years ago

Using this PR instead of the master code fixed this for me: https://github.com/auspicious3000/SpeechSplit/pull/23. Closing this issue for now.