ZaVang / GPT-SoVits

MIT License
18 stars · 1 fork

Audio file generation problem #10

Closed SGneil closed 1 week ago

SGneil commented 1 week ago

https://github.com/user-attachments/assets/d7523ba4-b5b4-435c-88b8-f4ca9353391e This audio clip is the first file generated with TTS after training the model: it has a duration of 0 and no sound.

https://github.com/user-attachments/assets/5ecde8fa-f4da-41e8-b0b6-91333df4a312 This is the second file generated with TTS, and it is fine. But when I switch to another character, the first generation again produces a silent file.

Looking into it further, I found that the ckpt model trained with this project is fine; the problem is the pth model. When I replaced the pth model produced by this project with one trained using the original author's project, https://github.com/RVC-Boss/GPT-SoVITS, the problem disappeared. I'd like to ask what differs between the pth models trained by this project and by the original author's project, and why this happens.
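Incidentally, the two pth files could be compared without loading them at all. A modern PyTorch checkpoint is typically a zip archive whose object graph is pickled in a `data.pkl` entry, and the pickle opcodes record every class the checkpoint references by module path, so a scan would show whether one file depends on custom modules the other doesn't. A rough sketch (the scanner is deliberately simplified, and the file name is illustrative):

```python
import collections
import pickle
import pickletools

def referenced_globals(payload: bytes):
    """Collect (module, name) pairs that a pickle payload references.

    Simplified: assumes STACK_GLOBAL's two string arguments were pushed
    immediately before it, which holds for typical flat payloads.
    """
    refs, strings = [], []
    for opcode, arg, _pos in pickletools.genops(payload):
        if opcode.name == "GLOBAL":                      # old protocols: "module name"
            module, name = arg.split(" ", 1)
            refs.append((module, name))
        elif opcode.name in ("SHORT_BINUNICODE", "BINUNICODE", "UNICODE"):
            strings.append(arg)
        elif opcode.name == "STACK_GLOBAL":              # protocol 4: two pushed strings
            refs.append((strings[-2], strings[-1]))
    return refs

# Self-contained demo: a payload that stores a class by reference.
demo = pickle.dumps(collections.OrderedDict(a=1), protocol=4)
print(referenced_globals(demo))   # [('collections', 'OrderedDict')]

# Against a real checkpoint one would scan the embedded data.pkl, e.g.:
# import zipfile
# with zipfile.ZipFile("some_model.pth") as zf:
#     name = next(n for n in zf.namelist() if n.endswith("data.pkl"))
#     print(referenced_globals(zf.read(name)))
```

A checkpoint that only contains tensors and plain dicts shows only standard-library and torch paths; a custom path in the output would point at a class the loading code must also provide.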

ZaVang commented 1 week ago

> https://github.com/user-attachments/assets/d7523ba4-b5b4-435c-88b8-f4ca9353391e This audio clip is the first file generated with TTS after training the model: it has a duration of 0 and no sound.
>
> https://github.com/user-attachments/assets/5ecde8fa-f4da-41e8-b0b6-91333df4a312 This is the second file generated with TTS, and it is fine. But when I switch to another character, the first generation again produces a silent file.
>
> Looking into it further, I found that the ckpt model trained with this project is fine; the problem is the pth model. When I replaced the pth model produced by this project with one trained using the original author's project, https://github.com/RVC-Boss/GPT-SoVITS, the problem disappeared. I'd like to ask what differs between the pth models trained by this project and by the original author's project, and why this happens.

It looks like a problem with the trained SoVITS model, but I have never run into this myself. In principle the training code here is identical to the original author's. Could you check what the command line prints during the first generation? From the description alone I can't tell where the problem is.

SGneil commented 1 week ago

Sorry, I realize I didn't describe this clearly, so let me go into detail. I removed the inference part of your project and kept only the training part; for inference I use this project: https://www.yuque.com/xter/zibxlp/nqi871glgxfy717e. I have uploaded the training and inference code I use to https://github.com/SGneil/my-sovits, split into separate training and inference projects.

My training steps: I consolidated your training code into a single run.py file for one-click training. I then copy the trained models into the other project, run pure_api.py to start the inference service, and use test.py to test it.

Running `python pure_api.py` prints:

```
GSV_Synthesizer config: {'device': 'auto', 'is_half': 'auto', 'models_path': 'trained', 'cnhubert_base_path': 'GPT_SoVITS/pretrained_models/chinese-hubert-base', 'bert_base_path': 'GPT_SoVITS/pretrained_models/chinese-roberta-wwm-ext-large', 'save_prompt_cache': True, 'prompt_cache_dir': 'cache/prompt_cache', 'debug_mode': True}
Loading BERT weights from GPT_SoVITS/pretrained_models/chinese-roberta-wwm-ext-large
/opt/conda/envs/GPTSoVits/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: clean_up_tokenization_spaces was not set. It will be set to True by default. This behavior will be depracted in transformers v4.45, and will be then set to False by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
Loading CNHuBERT weights from GPT_SoVITS/pretrained_models/chinese-hubert-base
Some weights of the model checkpoint at GPT_SoVITS/pretrained_models/chinese-hubert-base were not used when initializing HubertModel: ['encoder.pos_conv_embed.conv.weight_g', 'encoder.pos_conv_embed.conv.weight_v']
```

Only after I replace the pth model with one trained by the original author does the problem go away.

If any part of this is still unclear, please let me know. Thanks.

ZaVang commented 1 week ago

> Sorry, I realize I didn't describe this clearly, so let me go into detail. I removed the inference part of your project and kept only the training part; for inference I use this project: https://www.yuque.com/xter/zibxlp/nqi871glgxfy717e. I have uploaded the training and inference code I use to https://github.com/SGneil/my-sovits, split into separate training and inference projects.
>
> My training steps: I consolidated your training code into a single run.py file for one-click training. I then copy the trained models into the other project, run pure_api.py to start the inference service, and use test.py to test it.
>
> Running `python pure_api.py` prints:
>
> ```
> GSV_Synthesizer config: {'device': 'auto', 'is_half': 'auto', 'models_path': 'trained', 'cnhubert_base_path': 'GPT_SoVITS/pretrained_models/chinese-hubert-base', 'bert_base_path': 'GPT_SoVITS/pretrained_models/chinese-roberta-wwm-ext-large', 'save_prompt_cache': True, 'prompt_cache_dir': 'cache/prompt_cache', 'debug_mode': True}
> Loading BERT weights from GPT_SoVITS/pretrained_models/chinese-roberta-wwm-ext-large
> /opt/conda/envs/GPTSoVits/lib/python3.10/site-packages/transformers/tokenization_utils_base.py:1601: FutureWarning: clean_up_tokenization_spaces was not set. It will be set to True by default. This behavior will be depracted in transformers v4.45, and will be then set to False by default. For more details check this issue: huggingface/transformers#31884
>   warnings.warn(
> Loading CNHuBERT weights from GPT_SoVITS/pretrained_models/chinese-hubert-base
> Some weights of the model checkpoint at GPT_SoVITS/pretrained_models/chinese-hubert-base were not used when initializing HubertModel: ['encoder.pos_conv_embed.conv.weight_g', 'encoder.pos_conv_embed.conv.weight_v']
> - This IS expected if you are initializing HubertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
> - This IS NOT expected if you are initializing HubertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
> Some weights of HubertModel were not initialized from the model checkpoint at GPT_SoVITS/pretrained_models/chinese-hubert-base and are newly initialized: ['encoder.pos_conv_embed.conv.parametrizations.weight.original0', 'encoder.pos_conv_embed.conv.parametrizations.weight.original1']
> You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
> Loading Text2Semantic weights from trained/Seki/Seki.ckpt
> /home/neil47111202/digital_twin/API_Server/GPT-SoVITS-Inference/GPT_SoVITS/TTS_infer_pack/TTS.py:296: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
>   dict_s1 = torch.load(weights_path, map_location=self.configs.device)
> Loading VITS weights from trained/Seki/Seki.pth
> /home/neil47111202/digital_twin/API_Server/GPT-SoVITS-Inference/GPT_SoVITS/TTS_infer_pack/TTS.py:263: FutureWarning: You are using torch.load with weights_only=False (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for weights_only will be flipped to True. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via torch.serialization.add_safe_globals. We recommend you start setting weights_only=True for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
>   dict_s2 = torch.load(weights_path, map_location=self.configs.device)
> Traceback (most recent call last):
>   File "/home/neil47111202/digital_twin/API_Server/GPT-SoVITS-Inference/pure_api.py", line 90, in <module>
>     tts_synthesizer = TTS_Synthesizer(debug_mode=True)
>   File "/home/neil47111202/digital_twin/API_Server/GPT-SoVITS-Inference/Synthesizers/gsv_fast/GSV_Synthesizer.py", line 71, in __init__
>     self.load_character(self.default_character)
>   File "/home/neil47111202/digital_twin/API_Server/GPT-SoVITS-Inference/Synthesizers/gsv_fast/GSV_Synthesizer.py", line 174, in load_character
>     self.tts_pipline.init_vits_weights(sovits_path)
>   File "/home/neil47111202/digital_twin/API_Server/GPT-SoVITS-Inference/GPT_SoVITS/TTS_infer_pack/TTS.py", line 263, in init_vits_weights
>     dict_s2 = torch.load(weights_path, map_location=self.configs.device)
>   File "/opt/conda/envs/GPTSoVits/lib/python3.10/site-packages/torch/serialization.py", line 1097, in load
>     return _load(
>   File "/opt/conda/envs/GPTSoVits/lib/python3.10/site-packages/torch/serialization.py", line 1525, in _load
>     result = unpickler.load()
>   File "/opt/conda/envs/GPTSoVits/lib/python3.10/site-packages/torch/serialization.py", line 1515, in find_class
>     return super().find_class(mod_name, name)
> ModuleNotFoundError: No module named 'utils.config'; 'utils' is not a package
> ```
>
> Only after I replace the pth model with one trained by the original author does the problem go away.
>
> If any part of this is still unclear, please let me know. Thanks.

I suspect it's because my code's module layout differs from the inference code you are using: I moved Hparams into utils.config, while the original code keeps it directly under utils, and that mismatch is what causes this error. You could try moving Hparams into the corresponding file.
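The mechanism here can be reproduced without torch: `torch.load` with `weights_only=False` unpickles arbitrary Python objects stored in the checkpoint, and pickle records each class by the module path it was defined in. If that module path no longer exists on the loading side, unpickling fails exactly as in the traceback above. A minimal sketch with synthetic modules (the module and class names mirror the error, but the objects are stand-ins, not the real GPT-SoVITS classes):

```python
import pickle
import sys
import types

# Training-side layout: a hyperparameter class living in "utils.config".
utils_mod = types.ModuleType("utils")        # a plain module, NOT a package
config_mod = types.ModuleType("utils.config")

class HParams:
    def __init__(self, lr):
        self.lr = lr

HParams.__module__ = "utils.config"          # pickle will record this path
config_mod.HParams = HParams
sys.modules["utils"] = utils_mod
sys.modules["utils.config"] = config_mod

blob = pickle.dumps(HParams(lr=0.01))        # the "checkpoint" payload

# Inference side: "utils.config" no longer exists, so unpickling fails
# with the same shape of error as in the traceback above.
del sys.modules["utils.config"]
err = None
try:
    pickle.loads(blob)
except ModuleNotFoundError as e:
    err = e
print(err)   # No module named 'utils.config'; 'utils' is not a package

# Workaround: alias the old module path back before loading, so the
# lookup of utils.config.HParams succeeds again.
sys.modules["utils.config"] = config_mod
restored = pickle.loads(blob)
print(restored.lr)   # 0.01
```

This is why moving Hparams to the module path the checkpoint expects fixes the load; registering an alias in `sys.modules` before calling `torch.load` would be an alternative when the source layout can't be changed.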

SGneil commented 1 week ago

Thank you, you were right. Following your method and adding the utils file fixed it. Thanks so much!