svc-develop-team / so-vits-svc

SoftVC VITS Singing Voice Conversion
GNU Affero General Public License v3.0

Add supplementary notes on training the shallow diffusion model with whisper-ppg-large as the encoder #335

Open hxdnshx opened 1 year ago

hxdnshx commented 1 year ago

When training starts directly from the pretrained model downloaded from Diffusion-SVC, the following error occurs:

 [*] restoring model from logs/44k/diffusion\model_0.pt
Traceback (most recent call last):
  File "train_diff.py", line 55, in <module>
    initial_global_step, model, optimizer = utils.load_model(args.env.expdir, model, optimizer, device=args.device)
  File "H:\so-vits-svc\diffusion\logger\utils.py", line 124, in load_model
    model.load_state_dict(ckpt['model'], strict=False)
  File "H:\so-vits-svc\venv\lib\site-packages\torch\nn\modules\module.py", line 1671, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Unit2Mel:
        size mismatch for unit_embed.weight: copying a param with shape torch.Size([256, 1024]) from checkpoint, the shape in current model is torch.Size([256, 1280]).

KakaruHayate commented 1 year ago

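The pretrained model you are using is the one for whisper-ppg-medium, which corresponds to this project's whisper-ppg encoder. There is currently no pretrained model for whisper-ppg-large.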

hxdnshx commented 1 year ago

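> The pretrained model you are using is the one for whisper-ppg-medium, which corresponds to this project's whisper-ppg encoder. There is currently no pretrained model for whisper-ppg-large.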

Yes, which is why this PR adds instructions for switching to training from scratch.
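
The size mismatch reported in the traceback can be caught before training starts by comparing parameter shapes in the checkpoint with those of the freshly built model. The following is a minimal sketch, not code from this repository: `find_shape_mismatches` is a hypothetical helper, and it works on plain name-to-shape mappings so it runs without PyTorch. With PyTorch, such a mapping could presumably be built with something like `{k: tuple(v.shape) for k, v in ckpt['model'].items()}` for the checkpoint and the same expression over `model.state_dict()` for the model.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    ckpt_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for every parameter
    present in both mappings whose shapes differ.

    Hypothetical helper for illustration; it is not part of so-vits-svc.
    Parameters that exist on only one side are ignored here.
    """
    return [
        (name, ckpt_shapes[name], shape)
        for name, shape in model_shapes.items()
        if name in ckpt_shapes and ckpt_shapes[name] != shape
    ]


# Shapes copied from the error message above: the downloaded checkpoint
# was trained on 1024-dimensional features, while the model configured
# for whisper-ppg-large expects 1280-dimensional features.
ckpt_shapes = {"unit_embed.weight": (256, 1024)}
model_shapes = {"unit_embed.weight": (256, 1280)}

for name, saved, expected in find_shape_mismatches(ckpt_shapes, model_shapes):
    print(f"{name}: checkpoint has {saved}, model expects {expected}")
```

If this check reports a mismatch on `unit_embed.weight`, the checkpoint was trained for a different encoder and cannot be used as a starting point for the current configuration.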