mozilla / TTS

:robot: :speech_balloon: Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts)
Mozilla Public License 2.0

Cannot start server due to size mismatches #164

Closed dr-slurp closed 5 years ago

dr-slurp commented 5 years ago

Hey all,

I'm trying to use the Best: iter-185K model, and I have checked out the corresponding commit history for that model (git checkout db7f3d3) as was prescribed here: https://github.com/mozilla/TTS/blob/master/server/README.md

I was able to set everything up with no errors, and put the proper paths to the models in the conf.json file in the server directory.

When I try to start the server

python server/server.py -c server/conf.json

I get tons of size mismatch issues:

`> Loading model ...
 | > model config: /Users/joshuaeisenberg/tts_test/TTS/models/best_model/config.json
 | > model file: /Users/joshuaeisenberg/tts_test/TTS/models/best_model/best_model.pth.tar
Setting up Audio Processor...
 | > fft size: 2048, hop length: 275, win length: 1102
 | > Audio Processor attributes.
 | > bits:None
 | > sample_rate:22050
 | > num_mels:80
 | > min_level_db:-100
 | > frame_shift_ms:12.5
 | > frame_length_ms:50
 | > ref_level_db:20
 | > num_freq:1025
 | > power:1.5
 | > preemphasis:0.98
 | > griffin_lim_iters:60
 | > signal_norm:True
 | > symmetric_norm:False
 | > mel_fmin:0
 | > mel_fmax:None
 | > max_norm:1.0
 | > clip_norm:True
 | > do_trim_silence:True
 | > n_fft:2048
 | > hop_length:275
 | > win_length:1102
 | > Number of characters : 256
Traceback (most recent call last):
  File "server/server.py", line 16, in <module>
    config.model_config, config.use_cuda)
  File "/Users/joshuaeisenberg/tts_test/TTS/server/synthesizer.py", line 34, in load_model
    self.model.load_state_dict(cp['model'])
  File "/miniconda3/lib/python3.7/site-packages/torch-1.0.1.post2-py3.7-macosx-10.7-x86_64.egg/torch/nn/modules/module.py", line 769, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for Tacotron:
    size mismatch for embedding.weight: copying a param with shape torch.Size([61, 256]) from checkpoint, the shape in current model is torch.Size([256, 1025]).
    size mismatch for encoder.prenet.layers.0.weight: copying a param with shape torch.Size([256, 256]) from checkpoint, the shape in current model is torch.Size([256, 1025]).
    size mismatch for decoder.prenet.layers.0.weight: copying a param with shape torch.Size([256, 400]) from checkpoint, the shape in current model is torch.Size([256, 10]).
    size mismatch for decoder.proj_to_mel.weight: copying a param with shape torch.Size([160, 256]) from checkpoint, the shape in current model is torch.Size([10, 256]).
    size mismatch for decoder.proj_to_mel.bias: copying a param with shape torch.Size([160]) from checkpoint, the shape in current model is torch.Size([10]).
    size mismatch for decoder.memory_init.weight: copying a param with shape torch.Size([1, 400]) from checkpoint, the shape in current model is torch.Size([1, 10]).
    size mismatch for decoder.stopnet.linear.weight: copying a param with shape torch.Size([1, 416]) from checkpoint, the shape in current model is torch.Size([1, 266]).
    size mismatch for postnet.cbhg.conv1d_banks.0.conv1d.weight: copying a param with shape torch.Size([128, 80, 1]) from checkpoint, the shape in current model is torch.Size([128, 2, 1]).
    size mismatch for postnet.cbhg.conv1d_banks.1.conv1d.weight: copying a param with shape torch.Size([128, 80, 2]) from checkpoint, the shape in current model is torch.Size([128, 2, 2]).
    size mismatch for postnet.cbhg.conv1d_banks.2.conv1d.weight: copying a param with shape torch.Size([128, 80, 3]) from checkpoint, the shape in current model is torch.Size([128, 2, 3]).
    size mismatch for postnet.cbhg.conv1d_banks.3.conv1d.weight: copying a param with shape torch.Size([128, 80, 4]) from checkpoint, the shape in current model is torch.Size([128, 2, 4]).
    size mismatch for postnet.cbhg.conv1d_banks.4.conv1d.weight: copying a param with shape torch.Size([128, 80, 5]) from checkpoint, the shape in current model is torch.Size([128, 2, 5]).
    size mismatch for postnet.cbhg.conv1d_banks.5.conv1d.weight: copying a param with shape torch.Size([128, 80, 6]) from checkpoint, the shape in current model is torch.Size([128, 2, 6]).
    size mismatch for postnet.cbhg.conv1d_banks.6.conv1d.weight: copying a param with shape torch.Size([128, 80, 7]) from checkpoint, the shape in current model is torch.Size([128, 2, 7]).
    size mismatch for postnet.cbhg.conv1d_banks.7.conv1d.weight: copying a param with shape torch.Size([128, 80, 8]) from checkpoint, the shape in current model is torch.Size([128, 2, 8]).
    size mismatch for postnet.cbhg.conv1d_projections.1.conv1d.weight: copying a param with shape torch.Size([80, 256, 3]) from checkpoint, the shape in current model is torch.Size([2, 256, 3]).
    size mismatch for postnet.cbhg.conv1d_projections.1.bn.weight: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([2]).
    size mismatch for postnet.cbhg.conv1d_projections.1.bn.bias: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([2]).
    size mismatch for postnet.cbhg.conv1d_projections.1.bn.running_mean: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([2]).
    size mismatch for postnet.cbhg.conv1d_projections.1.bn.running_var: copying a param with shape torch.Size([80]) from checkpoint, the shape in current model is torch.Size([2]).
    size mismatch for postnet.cbhg.pre_highway.weight: copying a param with shape torch.Size([128, 80]) from checkpoint, the shape in current model is torch.Size([128, 2]).
    size mismatch for last_linear.0.weight: copying a param with shape torch.Size([1025, 256]) from checkpoint, the shape in current model is torch.Size([80, 256]).
    size mismatch for last_linear.0.bias: copying a param with shape torch.Size([1025]) from checkpoint, the shape in current model is torch.Size([80]).`
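For anyone trying to read an error like the one above, a minimal sketch in plain Python may help. It compares two name-to-shape tables and prints one line per disagreement. The function name `find_shape_mismatches` and both dictionaries are illustrative, not part of TTS or PyTorch; the three example shapes are copied from the log above.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[str]:
    """Return one readable line per parameter whose shape differs.

    Hypothetical helper for illustration; not part of TTS or PyTorch.
    """
    report = []
    for name in sorted(checkpoint_shapes):
        saved = checkpoint_shapes[name]
        current = model_shapes.get(name)
        if current is None:
            report.append(f"{name}: in checkpoint {saved}, missing from model")
        elif saved != current:
            report.append(f"{name}: checkpoint {saved} vs model {current}")
    return report


# Three of the shapes reported in the traceback above.
checkpoint_shapes = {
    "embedding.weight": (61, 256),
    "decoder.proj_to_mel.weight": (160, 256),
    "last_linear.0.bias": (1025,),
}
model_shapes = {
    "embedding.weight": (256, 1025),
    "decoder.proj_to_mel.weight": (10, 256),
    "last_linear.0.bias": (80,),
}

for line in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(line)
```

With PyTorch installed, the two tables could presumably be built from the real objects, for example `{k: tuple(v.shape) for k, v in cp['model'].items()}` for the checkpoint and the same comprehension over `model.state_dict()` for the freshly constructed model. When nearly every layer disagrees, as here, the code and the checkpoint most likely come from different revisions.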

This seems to be the same problem as https://github.com/mozilla/TTS/issues/154 and https://github.com/mozilla/TTS/issues/150. I can't find any solutions on those threads.

Are there any models that don't have these mismatch issues?

Thanks for any help. I'm appreciative especially because this is a great free open source resource. Just wanted to see if anyone had any advice.

erogol commented 5 years ago

I guess you missed this https://github.com/mozilla/TTS/issues/154#issuecomment-483165079

Otherwise, it is always the same. Use the right commit version for the released model.
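As a rough illustration of "use the right commit version", the check below compares the checked-out revision against what a released model expects before the server is started. This is only a sketch: `revision_matches` and `current_revision` are hypothetical helpers, not part of TTS, and the expected values (`db7f3d3`, `fix_db7f3d3`) are the ones mentioned in this thread. The full hash in the example is a made-up placeholder.

```python
import subprocess
from typing import Tuple


def revision_matches(head_sha: str, branch: str, expected: str) -> bool:
    """True if the branch name equals `expected` or the commit hash starts with it.

    Hypothetical helper for illustration; not part of TTS.
    """
    expected = expected.strip()
    if not expected:
        return False
    if branch.strip() == expected:
        return True
    return head_sha.strip().lower().startswith(expected.lower())


def current_revision(repo_dir: str = ".") -> Tuple[str, str]:
    """Return (commit hash, branch name) of the checkout at `repo_dir` using git."""
    def git(*args: str) -> str:
        done = subprocess.run(
            ["git", *args], cwd=repo_dir,
            capture_output=True, text=True, check=True,
        )
        return done.stdout.strip()

    return git("rev-parse", "HEAD"), git("rev-parse", "--abbrev-ref", "HEAD")


# Placeholder hash: only the first seven characters come from the thread.
placeholder_sha = "db7f3d3" + "0" * 33

print(revision_matches(placeholder_sha, "HEAD", "db7f3d3"))        # commit checked out directly
print(revision_matches("f" * 40, "fix_db7f3d3", "fix_db7f3d3"))    # on the fix branch
print(revision_matches("f" * 40, "master", "db7f3d3"))             # wrong revision
```

In a real checkout, `revision_matches(*current_revision("path/to/TTS"), "fix_db7f3d3")` would do the same comparison against the live repository, with the path being whatever directory the repo was cloned into.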

And thanks for the kind words :)

dr-slurp commented 5 years ago

Okay, so if I want to use iter-185k, I first clone the current repo:

git clone https://github.com/mozilla/TTS.git

But how do I get the branch fix_db7f3d3? I wasn't able to check it out with

git checkout fix_db7f3d3

Is there a more canonical way to get the fix?

Thanks :)

dr-slurp commented 5 years ago

For anyone wondering how to clone this branch:

git clone --single-branch --branch fix_db7f3d3 https://github.com/mozilla/TTS.git

It works, btw. @erogol, thank you so much. Now I'm having so much fun with it.