babysor / MockingBird

🚀 AI voice cloning: clone a voice in 5 seconds to generate arbitrary speech in real-time

Error when using the mandarin_200k.pt model #886

Open africa1207 opened 1 year ago

africa1207 commented 1 year ago

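Summary (one sentence): Loading the mandarin_200k.pt model raises an error, while pretrained-11-7-21_75k.pt works normally. Installing the dependency monotonic-align==0.0.3 failed, so I removed the ==0.0.3 pin and the install then succeeded.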
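For readers who hit the same install failure, the unpinning step described above can be sketched as a small text edit. This is a minimal sketch, not the project's tooling: the `unpin` helper is hypothetical, and the sample requirements text (including the `numpy` line) is invented for illustration.

```python
import re


def unpin(requirements_text: str, package: str) -> str:
    """Drop an exact '==version' pin for one package; other lines are untouched."""
    pattern = re.compile(
        rf"^({re.escape(package)})\s*==\s*[^\s#;]+",
        re.IGNORECASE | re.MULTILINE,
    )
    return pattern.sub(r"\1", requirements_text)


# Hypothetical requirements content, used only to exercise the helper.
sample = "numpy==1.19.3\nmonotonic-align==0.0.3\n"
print(unpin(sample, "monotonic-align"))
```

Note that removing a pin lets pip pick whatever version it can install, which may differ from the version the project was tested with.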

Env & To Reproduce: Environment: WSL2 with Ubuntu 22.04; tried both Python 3.9.0 and Python 3.10. Code version: the 7 March 2023 version. Model: mandarin_200k.pt

Screenshots (if any): image

RuntimeError: Error(s) in loading state_dict for Tacotron:
        size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([70, 512]) from checkpoint, the shape in current model is torch.Size([75, 512]).
        size mismatch for encoder_proj.weight: copying a param with shape torch.Size([128, 512]) from checkpoint, the shape in current model is torch.Size([128, 1024]).
        size mismatch for decoder.attn_rnn.weight_ih: copying a param with shape torch.Size([384, 768]) from checkpoint, the shape in current model is torch.Size([384, 1280]).
        size mismatch for decoder.rnn_input.weight: copying a param with shape torch.Size([1024, 640]) from checkpoint, the shape in current model is torch.Size([1024, 1152]).
        size mismatch for decoder.stop_proj.weight: copying a param with shape torch.Size([1, 1536]) from checkpoint, the shape in current model is torch.Size([1, 2048]).
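An error like the one above is easier to compare across checkpoints once it is split into one record per parameter. The following is a rough sketch using only the standard library; the `parse_mismatches` helper and the regular expression are illustrative assumptions based on the message format shown in this issue, not part of MockingBird or PyTorch.

```python
import re

# Matches one "size mismatch for ..." clause as it appears in the error text.
MISMATCH = re.compile(
    r"size mismatch for (?P<name>[\w.]+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[\d, ]*)\]\) from checkpoint, "
    r"the shape in current model is torch\.Size\(\[(?P<model>[\d, ]*)\]\)"
)


def parse_mismatches(message):
    """Return (parameter name, checkpoint shape, model shape) per mismatch."""
    def to_shape(text):
        return tuple(int(n) for n in text.split(",") if n.strip())

    return [
        (m["name"], to_shape(m["ckpt"]), to_shape(m["model"]))
        for m in MISMATCH.finditer(message)
    ]


# First two clauses of the error reported above.
message = (
    "size mismatch for encoder.embedding.weight: copying a param with shape "
    "torch.Size([70, 512]) from checkpoint, the shape in current model is "
    "torch.Size([75, 512]). "
    "size mismatch for encoder_proj.weight: copying a param with shape "
    "torch.Size([128, 512]) from checkpoint, the shape in current model is "
    "torch.Size([128, 1024])."
)

for name, ckpt, model in parse_mismatches(message):
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```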

Using version v0.0.1 with the mandarin_200k.pt model gives the same kind of error:

Synthesizer using device: cpu
using synthesizer model: synthesizer/saved_models/mandarin_200k.pt
Trainable Parameters: 31.951M
[2023-04-21 16:19:44,685] ERROR in app: Exception on /api/synthesize [POST]
Traceback (most recent call last):
  File "/home/liuxixigua/.local/lib/python3.9/site-packages/flask/app.py", line 2528, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/liuxixigua/.local/lib/python3.9/site-packages/flask/app.py", line 1825, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/liuxixigua/.local/lib/python3.9/site-packages/flask_restx/api.py", line 674, in error_router
    return original_handler(e)
  File "/home/liuxixigua/.local/lib/python3.9/site-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/liuxixigua/.local/lib/python3.9/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/home/liuxixigua/MockingBird/web/__init__.py", line 108, in synthesize
    specs = current_synt.synthesize_spectrograms(texts, embeds)
  File "/home/liuxixigua/MockingBird/synthesizer/inference.py", line 87, in synthesize_spectrograms
    self.load()
  File "/home/liuxixigua/MockingBird/synthesizer/inference.py", line 65, in load
    self._model.load(self.model_fpath)
  File "/home/liuxixigua/MockingBird/synthesizer/models/tacotron.py", line 523, in load
    self.load_state_dict(checkpoint["model_state"], strict=False)
  File "/home/liuxixigua/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Tacotron:
        size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([70, 512]) from checkpoint, the shape in current model is torch.Size([75, 512]).
127.0.0.1 - - [2023-04-21 16:19:44] "POST /api/synthesize HTTP/1.1" 500 401 3.199693

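Using version v0.0.1 with a second checkpoint also fails (the log names pretrained-11-7-21_75k.pt, although the original heading again said mandarin_200k.pt):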

Synthesizer using device: cpu
using synthesizer model: synthesizer/saved_models/pretrained-11-7-21_75k.pt
Trainable Parameters: 31.951M
[2023-04-21 16:24:38,117] ERROR in app: Exception on /api/synthesize [POST]
Traceback (most recent call last):
  File "/home/liuxixigua/.local/lib/python3.9/site-packages/flask/app.py", line 2528, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/liuxixigua/.local/lib/python3.9/site-packages/flask/app.py", line 1825, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/liuxixigua/.local/lib/python3.9/site-packages/flask_restx/api.py", line 674, in error_router
    return original_handler(e)
  File "/home/liuxixigua/.local/lib/python3.9/site-packages/flask/app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/liuxixigua/.local/lib/python3.9/site-packages/flask/app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/home/liuxixigua/MockingBird/web/__init__.py", line 108, in synthesize
    specs = current_synt.synthesize_spectrograms(texts, embeds)
  File "/home/liuxixigua/MockingBird/synthesizer/inference.py", line 87, in synthesize_spectrograms
    self.load()
  File "/home/liuxixigua/MockingBird/synthesizer/inference.py", line 65, in load
    self._model.load(self.model_fpath)
  File "/home/liuxixigua/MockingBird/synthesizer/models/tacotron.py", line 523, in load
    self.load_state_dict(checkpoint["model_state"], strict=False)
  File "/home/liuxixigua/.local/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Tacotron:
        size mismatch for encoder_proj.weight: copying a param with shape torch.Size([128, 1024]) from checkpoint, the shape in current model is torch.Size([128, 512]).
        size mismatch for gst.stl.attention.W_query.weight: copying a param with shape torch.Size([512, 512]) from checkpoint, the shape in current model is torch.Size([512, 256]).
        size mismatch for decoder.attn_rnn.weight_ih: copying a param with shape torch.Size([384, 1280]) from checkpoint, the shape in current model is torch.Size([384, 768]).
        size mismatch for decoder.rnn_input.weight: copying a param with shape torch.Size([1024, 1152]) from checkpoint, the shape in current model is torch.Size([1024, 640]).
        size mismatch for decoder.stop_proj.weight: copying a param with shape torch.Size([1, 2048]) from checkpoint, the shape in current model is torch.Size([1, 1536]).
127.0.0.1 - - [2023-04-21 16:24:38] "POST /api/synthesize HTTP/1.1" 500 401 3.134052
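
Both tracebacks pass `strict=False` to `load_state_dict`, yet loading still fails. In PyTorch, `strict=False` tolerates missing and unexpected keys but not parameters whose shapes differ. A minimal sketch of how the conflicting parameters could be listed before loading is shown below. It works on plain name-to-shape mappings so that it runs without PyTorch; the `shape_conflicts` helper is hypothetical, and the sample shapes are copied from the second log above.

```python
def shape_conflicts(checkpoint_shapes, model_shapes):
    """Map each shared parameter name to (checkpoint shape, model shape) if they differ."""
    shared = checkpoint_shapes.keys() & model_shapes.keys()
    return {
        name: (tuple(checkpoint_shapes[name]), tuple(model_shapes[name]))
        for name in shared
        if tuple(checkpoint_shapes[name]) != tuple(model_shapes[name])
    }


# With PyTorch installed, such a mapping could be built from a state dict as
# {k: tuple(v.shape) for k, v in state_dict.items()} (assumption: tensors only).
checkpoint = {
    "encoder_proj.weight": (128, 1024),
    "decoder.stop_proj.weight": (1, 2048),
}
model = {
    "encoder_proj.weight": (128, 512),
    "decoder.stop_proj.weight": (1, 1536),
}

for name, (ckpt, current) in sorted(shape_conflicts(checkpoint, model).items()):
    print(f"{name}: checkpoint {ckpt} vs model {current}")
```

A non-empty result indicates that the checkpoint was produced by a different model configuration than the one the code builds, which matches the maintainer's advice to switch code versions.
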
babysor commented 1 year ago

You need to switch the code to version 0.0.3.

arthurwu4work commented 1 year ago

Hi @babysor,

Where can I find version 0.03? I only see 0.01 in the tags.

Thanks

babysor commented 1 year ago

> Hi @babysor,
>
> Where can I find version 0.03? I only see 0.01 in the tags.
>
> Thanks

0.01 works too.

Zhang-2000 commented 1 year ago

> Hi @babysor, where can I find version 0.03? I only see 0.01 in the tags. Thanks

> 0.01 works too.

With v0.0.1 the encoder size does not match: size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([70, 512]) from checkpoint, the shape in current model is torch.Size([75, 512])