svc-develop-team / so-vits-svc

SoftVC VITS Singing Voice Conversion
GNU Affero General Public License v3.0
25.26k stars, 4.74k forks

[Help]: How to use my own pretrained model #358

Closed gandolfxu closed 1 year ago

gandolfxu commented 1 year ago

Please check the confirmation boxes below.

System platform version

Ubuntu 18.04

GPU model

V100 32G

Python version

3.9.13

PyTorch version

1.12.1

sovits branch

4.0 (default)

Dataset source (used to judge dataset quality)

Internal TTS data

Step where the problem occurred, or the command that was run

python train.py -c configs/config.json -m 16k

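Problem description

Version 4.1-Stable

  1. Trained a base model on a 103-speaker dataset.
  2. Renamed G*.pth and D*.pth to G_0.pth and D_0.pth and placed them in the logs/16k directory.
  3. Fine-tuned again on a single-speaker dataset, which failed with the error shown in the log below.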
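
Step 2 above can be sketched as follows. This is a minimal sketch, not part of so-vits-svc: `latest_checkpoint` and `install_base_checkpoints` are hypothetical helpers, and the code assumes the base model's checkpoints are named `G_<step>.pth` and `D_<step>.pth` and live in a directory other than the target one. It only copies and renames the files; it does not change what is stored inside them.

```python
import re
import shutil
from pathlib import Path
from typing import List, Optional


def latest_checkpoint(src_dir: Path, prefix: str) -> Optional[Path]:
    """Return the <prefix>_<step>.pth file with the highest step, or None."""
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+)\.pth$")
    best, best_step = None, -1
    for path in src_dir.iterdir():
        match = pattern.match(path.name)
        if match and int(match.group(1)) > best_step:
            best, best_step = path, int(match.group(1))
    return best


def install_base_checkpoints(src_dir, model_dir) -> List[Path]:
    """Copy the newest G/D checkpoints to model_dir as G_0.pth and D_0.pth.

    Hypothetical helper: src_dir and model_dir are assumed to be different
    directories (for example a base-model folder and logs/16k).
    """
    src, dst = Path(src_dir), Path(model_dir)
    dst.mkdir(parents=True, exist_ok=True)
    installed = []
    for prefix in ("G", "D"):
        ckpt = latest_checkpoint(src, prefix)
        if ckpt is None:
            raise FileNotFoundError(f"no {prefix}_<step>.pth found in {src}")
        target = dst / f"{prefix}_0.pth"
        shutil.copy2(ckpt, target)  # copy, so the base model stays untouched
        installed.append(target)
    return installed
```

For example, `install_base_checkpoints("base_model", "logs/16k")` would copy the highest-numbered generator and discriminator files from a (hypothetical) `base_model` folder into `logs/16k`.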

Log

INFO:16k:{'train': {'log_interval': 200, 'eval_interval': 1000, 'seed': 1234, 'epochs': 10000, 'learning_rate': 0.0001, 'betas': [0.8, 0.99], 'eps': 1e-09, 'batch_size': 32, 'fp16_run': False, 'half_type': 'fp16', 'lr_decay': 0.999875, 'segment_size': 4480, 'init_lr_ratio': 1, 'warmup_epochs': 0, 'c_mel': 45, 'c_kl': 1.0, 'use_sr': False, 'max_speclen': 512, 'port': '8001', 'keep_ckpts': 3, 'all_in_mem': False, 'vol_aug': False}, 'data': {'training_files': 'filelists/train.txt', 'validation_files': 'filelists/val.txt', 'max_wav_value': 32768.0, 'sampling_rate': 16000, 'filter_length': 1024, 'hop_length': 320, 'win_length': 1024, 'n_mel_channels': 80, 'mel_fmin': 0.0, 'mel_fmax': 8000, 'unit_interpolate_mode': 'nearest'}, 'model': {'inter_channels': 192, 'hidden_channels': 192, 'filter_channels': 768, 'n_heads': 2, 'n_layers': 6, 'kernel_size': 3, 'p_dropout': 0.1, 'resblock': '1', 'resblock_kernel_sizes': [3, 7, 11], 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'upsample_rates': [10, 8, 2, 2], 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [20, 16, 4, 4], 'n_layers_q': 3, 'n_flow_layer': 4, 'use_spectral_norm': False, 'gin_channels': 768, 'ssl_dim': 768, 'n_speakers': 1, 'vocoder_name': 'nsf-hifigan', 'speech_encoder': 'vec768l12', 'speaker_embedding': False, 'vol_embedding': False, 'use_depthwise_conv': False, 'flow_share_parameter': False, 'use_automatic_f0_prediction': True}, 'spk': {'nxns': 0}, 'model_dir': './logs/16k'}
libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
libibverbs: Warning: couldn't open config directory '/etc/libibverbs.d'.
./logs/16k/G_0.pth
./logs/16k/G_0.pth
error, emb_g.weight is not in the checkpoint
INFO:16k:emb_g.weight is not in the checkpoint
error, emb_g.weight is not in the checkpoint
load 
INFO:16k:Loaded checkpoint './logs/16k/G_0.pth' (iteration 220)
./logs/16k/D_0.pth
load 
./logs/16k/D_0.pth
load 
INFO:16k:Loaded checkpoint './logs/16k/D_0.pth' (iteration 220)
load 
./logs/16k/D_0.pth
./logs/16k/D_0.pth
/home/notebook/code/personal/so-vits-svc/.venv/lib/python3.9/site-packages/torch/autograd/__init__.py:173: UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed.  This is not an error, but may impair performance.
grad.sizes() = [64, 1, 4], strides() = [4, 1, 1]
bucket_view.sizes() = [64, 1, 4], strides() = [4, 4, 1] (Triggered internally at  ../torch/csrc/distributed/c10d/reducer.cpp:326.)
  Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
/home/notebook/code/personal/so-vits-svc/.venv/lib/python3.9/site-packages/torch/autograd/__init__.py:173: UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed.  This is not an error, but may impair performance.
grad.sizes() = [64, 1, 4], strides() = [4, 1, 1]
bucket_view.sizes() = [64, 1, 4], strides() = [4, 4, 1] (Triggered internally at  ../torch/csrc/distributed/c10d/reducer.cpp:326.)
  Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
Traceback (most recent call last):
  File "/home/notebook/code/personal/so-vits-svc/train.py", line 329, in <module>
    main()
  File "/home/notebook/code/personal/so-vits-svc/train.py", line 44, in main
    mp.spawn(run, nprocs=n_gpus, args=(n_gpus, hps,))
  File "/home/notebook/code/personal/so-vits-svc/.venv/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 240, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/notebook/code/personal/so-vits-svc/.venv/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 198, in start_processes
    while not context.join():
  File "/home/notebook/code/personal/so-vits-svc/.venv/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 160, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException: 

-- Process 1 terminated with the following error:
Traceback (most recent call last):
  File "/home/notebook/code/personal/so-vits-svc/.venv/lib/python3.9/site-packages/torch/multiprocessing/spawn.py", line 69, in _wrap
    fn(i, *args)
  File "/home/notebook/code/personal/so-vits-svc/train.py", line 128, in run
    train_and_evaluate(rank, epoch, hps, [net_g, net_d], [optim_g, optim_d], [scheduler_g, scheduler_d], scaler,
  File "/home/notebook/code/personal/so-vits-svc/train.py", line 212, in train_and_evaluate
    scaler.step(optim_g)
  File "/home/notebook/code/personal/so-vits-svc/.venv/lib/python3.9/site-packages/torch/cuda/amp/grad_scaler.py", line 310, in step
    return optimizer.step(*args, **kwargs)
  File "/home/notebook/code/personal/so-vits-svc/.venv/lib/python3.9/site-packages/torch/optim/lr_scheduler.py", line 65, in wrapper
    return wrapped(*args, **kwargs)
  File "/home/notebook/code/personal/so-vits-svc/.venv/lib/python3.9/site-packages/torch/optim/optimizer.py", line 113, in wrapper
    return func(*args, **kwargs)
  File "/home/notebook/code/personal/so-vits-svc/.venv/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/notebook/code/personal/so-vits-svc/.venv/lib/python3.9/site-packages/torch/optim/adamw.py", line 161, in step
    adamw(params_with_grad,
  File "/home/notebook/code/personal/so-vits-svc/.venv/lib/python3.9/site-packages/torch/optim/adamw.py", line 218, in adamw
    func(params,
  File "/home/notebook/code/personal/so-vits-svc/.venv/lib/python3.9/site-packages/torch/optim/adamw.py", line 311, in _single_tensor_adamw
    param.addcdiv_(exp_avg, denom, value=-step_size)
RuntimeError: output with shape [1, 768] doesn't match the broadcast shape [103, 768]

Take screenshots of the so-vits-svc and logs/44k folders and paste them here

image

Additional notes

No response

ylzz1997 commented 1 year ago

To build your own pretrained model, you need to strip spk_emb (the speaker embedding) from the checkpoint. Reference code:

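import torch

G = "G_228000.pth"  # generator checkpoint of the base model
D = "D_228000.pth"  # discriminator checkpoint of the base model

# Generator: reset the training state and drop the speaker embedding.
a = torch.load(G, map_location="cpu")  # load on CPU so no GPU is needed
a["iteration"] = 0               # restart the step counter
a["optimizer"] = None            # discard the saved optimizer state
a["learning_rate"] = 0.0001
del a["model"]["emb_g.weight"]   # speaker embedding of the base model
torch.save(a, f"clean_{G}")

# Discriminator: only the training state needs to be reset.
a = torch.load(D, map_location="cpu")
a["iteration"] = 0
a["optimizer"] = None
a["learning_rate"] = 0.0001
torch.save(a, f"clean_{D}")
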
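
The traceback in the question fits this fix. The failing update is on a tensor of shape [1, 768] (one speaker, matching 'n_speakers': 1 and 'gin_channels': 768 in the logged config) against a broadcast shape of [103, 768] (the 103-speaker base model), and it is raised inside the AdamW step. That is consistent with leftover 103-speaker optimizer state, which is presumably why the snippet clears a["optimizer"] as well as deleting emb_g.weight. Before the cleaned files are renamed to G_0.pth and D_0.pth, the loaded dictionary can be sanity-checked. The following is a minimal sketch: `find_cleaning_problems` is a hypothetical helper, not part of so-vits-svc, and it only inspects the keys used in the snippet above.

```python
from typing import Any, Dict, List

SPEAKER_EMBEDDING_KEY = "emb_g.weight"  # key name taken from the snippet above


def find_cleaning_problems(ckpt: Dict[str, Any], is_generator: bool) -> List[str]:
    """Return a list of problems; an empty list means the checkpoint looks clean.

    Hypothetical helper: `ckpt` is the dictionary returned by torch.load.
    """
    problems = []
    if ckpt.get("iteration") != 0:
        problems.append("iteration is not reset to 0")
    if ckpt.get("optimizer") is not None:
        problems.append("optimizer state is still present")
    # Only the generator is expected to carry the speaker embedding.
    if is_generator and SPEAKER_EMBEDDING_KEY in ckpt.get("model", {}):
        problems.append(f"{SPEAKER_EMBEDDING_KEY} is still in the model weights")
    return problems
```

With the files from the snippet, `find_cleaning_problems(torch.load("clean_G_228000.pth", map_location="cpu"), is_generator=True)` should return an empty list.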
HuanLinOTO commented 1 year ago

Closed as completed.