RVC-Boss / GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
MIT License
35.69k stars 4.07k forks

After updating the code today, step 3 of the one-click three-step run fails with a model loading error #1400

Closed HsiangLeekwok closed 3 months ago

HsiangLeekwok commented 3 months ago

It worked fine before the update...

"D:\ProgramData\anaconda3\envs\sovits\python.exe" GPT_SoVITS/prepare_datasets/3-get-semantic.py
D:\ProgramData\anaconda3\envs\sovits\lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Traceback (most recent call last):
  File "E:\backup\github\GPT-SoVITS\GPT_SoVITS\prepare_datasets\3-get-semantic.py", line 65, in <module>
    vq_model.load_state_dict(
  File "D:\ProgramData\anaconda3\envs\sovits\lib\site-packages\torch\nn\modules\module.py", line 2152, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for SynthesizerTrn:
        size mismatch for enc_p.text_embedding.weight: copying a param with shape torch.Size([322, 192]) from checkpoint, the shape in current model is torch.Size([732, 192]).
        size mismatch for ref_enc.spectral.0.fc.weight: copying a param with shape torch.Size([128, 1025]) from checkpoint, the shape in current model is torch.Size([128, 704]).
Traceback (most recent call last):
  File "E:\backup\github\GPT-SoVITS\webui.py", line 680, in open1abc
    with open(semantic_path, "r",encoding="utf8") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'logs/peter/6-name2semantic-0.tsv'
ERROR: The process "17916" not found.

HsiangLeekwok commented 3 months ago

Found the problem: every class in models now defaults to version="v2".

https://github.com/RVC-Boss/GPT-SoVITS/blob/4e34814c701b81aef8a3931b3ab5921494be7ed0/GPT_SoVITS/prepare_datasets/3-get-semantic.py#L49

Adding the version="v1" argument to the call here makes it work again:

    vq_model = SynthesizerTrn(
        hps.data.filter_length // 2 + 1,
        hps.train.segment_size // hps.data.hop_length,
        n_speakers=hps.data.n_speakers,
        version="v1", # here
        **hps.model
    )
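
The two tracebacks in this thread suggest that the row count of `enc_p.text_embedding.weight` distinguishes the two model versions: 322 rows for a v1 checkpoint and 732 rows for v2. As a rough sketch, assuming that mapping holds, the version could be guessed from the checkpoint instead of being hard-coded. `guess_version` and the lookup table are hypothetical names for illustration and are not part of GPT-SoVITS; with PyTorch, the shape mapping could be built with `{k: tuple(v.shape) for k, v in state_dict.items()}`.

```python
# Hypothetical helper, not part of GPT-SoVITS: guess the model version from
# the number of rows in the text embedding table stored in a checkpoint.
# The 322 -> v1 and 732 -> v2 mapping is read off the error messages in this
# thread and is an assumption, not a documented contract.
EMBEDDING_KEY = "enc_p.text_embedding.weight"
EMBEDDING_ROWS_TO_VERSION = {322: "v1", 732: "v2"}


def guess_version(shapes):
    """Return "v1" or "v2" given a mapping of parameter name -> shape tuple."""
    if EMBEDDING_KEY not in shapes:
        raise KeyError(f"{EMBEDDING_KEY} not found in checkpoint")
    rows = shapes[EMBEDDING_KEY][0]
    if rows not in EMBEDDING_ROWS_TO_VERSION:
        raise ValueError(f"unrecognized embedding size: {rows}")
    return EMBEDDING_ROWS_TO_VERSION[rows]


# Shapes taken from the checkpoint side of the first traceback in this issue.
checkpoint_shapes = {
    "enc_p.text_embedding.weight": (322, 192),
    "ref_enc.spectral.0.fc.weight": (128, 1025),
}

version = guess_version(checkpoint_shapes)
print(version)  # v1
```

The returned string could then be passed as the `version` argument in the call above, so that a v1 checkpoint and a v2 checkpoint each get a matching model.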
XXXXRT666 commented 3 months ago

This will be fixed in the next commit.

HsiangLeekwok commented 3 months ago

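> This will be fixed in the next commit.

OK, thanks for the hard work. Later, during training, it also reports "not in self.phoneme_data", which makes all subsequent training fail.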

KamioRinn commented 3 months ago

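> OK, thanks for the hard work. Later, during training, it also reports "not in self.phoneme_data", which makes all subsequent training fail.

Same cause. It will be fixed in the next merge.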

RVC-Boss commented 3 months ago

Fixed, give it a try.

HsiangLeekwok commented 3 months ago

> Fixed, give it a try.

It works now. Good!!

(❤️ ω ❤️)

bigbigtooth commented 3 months ago

I changed version to v1 in the file, but the problem is still there. Is there a solution? I'm on a MacBook Pro with an M3 Max.

WhaleFell commented 3 months ago

The fast_inference_ branch has this problem too.

JosenJin commented 1 month ago

This problem still occurs when running api_v2.py. How can I fix it?

Traceback (most recent call last):
  File "D:\ai\josencomfyui\GPT-SoVITS\api_v2.py", line 144, in <module>
    tts_pipeline = TTS(tts_config)
                   ^^^^^^^^^^^^^^^
  File "D:\ai\josencomfyui\GPT-SoVITS\GPT_SoVITS\TTS_infer_pack\TTS.py", line 252, in __init__
    self._init_models()
  File "D:\ai\josencomfyui\GPT-SoVITS\GPT_SoVITS\TTS_infer_pack\TTS.py", line 278, in _init_models
    self.init_vits_weights(self.configs.vits_weights_path)
  File "D:\ai\josencomfyui\GPT-SoVITS\GPT_SoVITS\TTS_infer_pack\TTS.py", line 337, in init_vits_weights
    vits_model.load_state_dict(dict_s2["weight"], strict=False)
  File "D:\ai\anaconda3\envs\comfyui\Lib\site-packages\torch\nn\modules\module.py", line 2215, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for SynthesizerTrn:
        size mismatch for enc_p.text_embedding.weight: copying a param with shape torch.Size([732, 192]) from checkpoint, the shape in current model is torch.Size([322, 192]).
        size mismatch for ref_enc.spectral.0.fc.weight: copying a param with shape torch.Size([128, 704]) from checkpoint, the shape in current model is torch.Size([128, 1025]).
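
Note that this call already passes `strict=False`. In PyTorch that flag only relaxes missing and unexpected keys; a parameter that exists on both sides with different shapes still raises, which is what the traceback shows. A minimal sketch of a pre-load check that lists such mismatches is given below. `find_shape_mismatches` is a hypothetical helper for illustration, not part of GPT-SoVITS or PyTorch, and it works on plain shape tuples so it runs without a checkpoint; the `example.matching.weight` entry is made up to show a parameter that passes the check.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return (name, checkpoint_shape, model_shape) for every parameter that
    exists in both mappings but has a different shape."""
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        # Parameters missing from the model are skipped: strict=False
        # already tolerates those, so they are not the failure seen here.
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches.append((name, tuple(ckpt_shape), tuple(model_shape)))
    return mismatches


# Shapes copied from the api_v2.py traceback in this issue, plus one
# made-up matching entry (hypothetical name) for contrast.
checkpoint_shapes = {
    "enc_p.text_embedding.weight": (732, 192),
    "ref_enc.spectral.0.fc.weight": (128, 704),
    "example.matching.weight": (4, 4),
}
model_shapes = {
    "enc_p.text_embedding.weight": (322, 192),
    "ref_enc.spectral.0.fc.weight": (128, 1025),
    "example.matching.weight": (4, 4),
}

for name, ckpt_shape, model_shape in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

Here the checkpoint carries the larger 732-row embedding while the model was built with 322 rows, the reverse of the first traceback, which is consistent with a v2 checkpoint being loaded by code that builds a v1 model.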

XXXXRT666 commented 1 month ago

> This problem still occurs when running api_v2.py. How can I fix it?

Use the V2 code.