selfboot closed this issue 9 months ago
I had a look at the code:
def __init__(
    self,
    phoneme_path: str,
    semantic_path: str,
    max_sample: int = None,
    max_sec: int = 100,
    pad_val: int = 1024,
    # min value of phoneme/sec
    min_ps_ratio: int = 3,
    # max value of phoneme/sec
    max_ps_ratio: int = 25,
) -> None:
    super().__init__()
    self.semantic_data = pd.read_csv(
        semantic_path, delimiter="\t", encoding="utf-8"
    )
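For context, the traceback further down shows the datamodule constructing this dataset in data_module.py. A rough sketch of that call, using the file names from this experiment (the import path and keyword usage are assumptions, not the repo's exact code):

# Hypothetical construction, mirroring data_module.py's setup();
# the paths are the ones produced under logs/first/.
from AR.data.dataset import Text2SemanticDataset  # assumes GPT_SoVITS is on sys.path

dataset = Text2SemanticDataset(
    phoneme_path="logs/first/2-name2text.txt",
    semantic_path="logs/first/6-name2semantic.tsv",
)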
The data read here is empty; in fact the earlier log already printed semantic_data_len: 0.
semantic_path here comes from the YAML config file TEMP/tmp_s1.yaml and should point to logs/first/6-name2semantic.tsv.
I checked, and the file is indeed empty:
$ cat logs/first/6-name2semantic.tsv
item_name semantic_audio
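A quick way to confirm from Python that the file holds only the header row, matching the semantic_data_len: 0 log line (a hypothetical check, not repo code):

import pandas as pd

df = pd.read_csv("logs/first/6-name2semantic.tsv", delimiter="\t", encoding="utf-8")
print(len(df))  # prints 0: only the item_name/semantic_audio header is present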
I cleaned out the training task's preprocessing results and regenerated the dataset formatting files, and this time the files looked correct.
(GPTSoVits) ➜ first git:(main) ✗ ls -alh
total 36K
drwxr-xr-x 6 zhao users 4.0K Jan 19 10:40 .
drwxr-xr-x 3 zhao users 4.0K Jan 19 10:40 ..
-rw-r--r-- 1 zhao users 2.3K Jan 19 10:40 2-name2text.txt
drwxr-xr-x 2 zhao users 4.0K Jan 19 10:40 3-bert
drwxr-xr-x 2 zhao users 4.0K Jan 19 10:40 4-cnhubert
drwxr-xr-x 2 zhao users 4.0K Jan 19 10:40 5-wav32k
-rw-r--r-- 1 zhao users 4.1K Jan 19 10:40 6-name2semantic.tsv
drwxr-xr-x 5 zhao users 4.0K Jan 19 10:40 logs_s1
Check that 6-name2semantic.tsv is not empty; as long as the data in it is correct, GPT fine-tuning can proceed.
I ran into the same problem when training GPT, but my semantic_data_len is not 0. It also seems to depend on the dataset: some datasets trigger it and some don't.
"/opt/anaconda3/envs/GPTSoVITS/bin/python" GPT_SoVITS/s1_train.py --config_file "TEMP/tmp_s1.yaml"
Seed set to 1234
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
<All keys matched successfully>
ckpt_path: None
[rank: 0] Seed set to 1234
Initializing distributed: GLOBAL_RANK: 0, MEMBER: 1/1
----------------------------------------------------------------------------------------------------
distributed_backend=nccl
All distributed processes registered. Starting with 1 processes
----------------------------------------------------------------------------------------------------
semantic_data_len: 238
phoneme_data_len: 239
item_name semantic_audio
0 vocal_milki-snowwhite.m4a_10.wav_15464960_1561... 913 171 657 773 281 766 30 639 882 973 758 991...
1 vocal_milki-snowwhite.m4a_10.wav_16661120_1682... 913 140 714 496 180 129 471 550 783 45 384 66 ...
2 vocal_milki-snowwhite.m4a_10.wav_41444480_4157... 520 280 280 105 271 41 65 509 773 178 247 251 ...
3 vocal_milki-snowwhite.m4a_10.wav_21096640_2123... 520 105 280 280 486 486 486 486 536 609 17 8 8...
4 vocal_milki-snowwhite.m4a_10.wav_26919040_2700... 8 995 595 12 344 187 187 187 46 964 11 777 602...
.. ... ...
233 vocal_milki-snowwhite.m4a_10.wav_28424960_2846... 208 17 515 59 591 994 318 490 312 569 55 595 1...
234 vocal_milki-snowwhite.m4a_10.wav_13336000_1349... 520 105 280 280 280 280 280 280 105 54 17 5 87...
235 vocal_milki-snowwhite.m4a_10.wav_12366080_1248... 54 360 140 570 14 605 74 733 796 550 467 364 8...
236 vocal_milki-snowwhite.m4a_10.wav_32966720_3316... 520 105 271 17 32 187 4 857 751 12 633 749 376...
237 vocal_milki-snowwhite.m4a_10.wav_22626240_2276... 520 280 280 536 1012 524 1002 718 621 211 553 ...
[238 rows x 2 columns]
Traceback (most recent call last):
  File "/home/xxx/code/github.com/temp/GPT-SoVITS/GPT_SoVITS/s1_train.py", line 171, in <module>
    main(args)
  File "/home/xxx/code/github.com/temp/GPT-SoVITS/GPT_SoVITS/s1_train.py", line 147, in main
    trainer.fit(model, data_module, ckpt_path=ckpt_path)
  File "/opt/anaconda3/envs/GPTSoVITS/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 544, in fit
    call._call_and_handle_interrupt(
  File "/opt/anaconda3/envs/GPTSoVITS/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 43, in _call_and_handle_interrupt
    return trainer.strategy.launcher.launch(trainer_fn, *args, trainer=trainer, **kwargs)
  File "/opt/anaconda3/envs/GPTSoVITS/lib/python3.9/site-packages/pytorch_lightning/strategies/launchers/subprocess_script.py", line 102, in launch
    return function(*args, **kwargs)
  File "/opt/anaconda3/envs/GPTSoVITS/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 580, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/opt/anaconda3/envs/GPTSoVITS/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 950, in _run
    call._call_setup_hook(self)  # allow user to setup lightning_module in accelerator environment
  File "/opt/anaconda3/envs/GPTSoVITS/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 92, in _call_setup_hook
    _call_lightning_datamodule_hook(trainer, "setup", stage=fn)
  File "/opt/anaconda3/envs/GPTSoVITS/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 179, in _call_lightning_datamodule_hook
    return fn(*args, **kwargs)
  File "/home/xxx/code/github.com/temp/GPT-SoVITS/GPT_SoVITS/AR/data/data_module.py", line 29, in setup
    self._train_dataset = Text2SemanticDataset(
  File "/home/xxx/code/github.com/temp/GPT-SoVITS/GPT_SoVITS/AR/data/dataset.py", line 107, in __init__
    self.init_batch()
  File "/home/xxx/code/github.com/temp/GPT-SoVITS/GPT_SoVITS/AR/data/dataset.py", line 187, in init_batch
    for _ in range(max(2, int(min_num / leng))):
ZeroDivisionError: division by zero
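The error comes from a repeat count computed as min_num / leng, where leng is the number of samples that survived filtering; if everything was filtered out, leng is 0. A minimal sketch of that logic (names are read off the traceback and are assumptions, not the repo's exact code):

def repeat_count(min_num: int, num_samples: int) -> int:
    # Mirrors the failing line in dataset.py's init_batch (and the
    # analogous one in module/data_utils.py, seen later in this thread):
    # if every sample was filtered out upstream, num_samples is 0 and
    # the division raises ZeroDivisionError.
    if num_samples == 0:
        raise ValueError(
            "no usable samples after filtering; check 6-name2semantic.tsv "
            "and the .list annotations"
        )
    return max(2, int(min_num / num_samples))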
Same situation here: 6-name2semantic.tsv has content, yet it still errors. It may be related to the pandas version (mine is 2.1.4).
After this read, self.semantic_data.iloc[i, 0] actually returns a value, because the audio file name ends up as the index column. The read in question (dataset.py, around line 60):

self.semantic_data = pd.read_csv(
    semantic_path, delimiter="\t", encoding="utf-8"
)
Fix: call reset_index() on self.semantic_data once after the read.
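A minimal sketch of that fix, assuming pandas really did absorb the item-name column into the index (pandas 2.1.4 here; behavior on other versions unconfirmed):

import pandas as pd

semantic_data = pd.read_csv(
    "logs/first/6-name2semantic.tsv", delimiter="\t", encoding="utf-8"
)
# If the item names were parsed as the index, promote them back to a
# regular column so that iloc[i, 0] is the audio file name again.
semantic_data = semantic_data.reset_index()
print(semantic_data.iloc[0, 0])  # expected: an item name, not semantic tokens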
PS: if anyone who does not hit this error could share their pandas version and 6-name2semantic.tsv, I'd like to confirm the fix is general and then open an MR.
I ran into this problem while training an English model. Changing |ZH| to |en| in the text annotation file and re-running the training-set formatting fixed it; I don't know whether this applies more widely.
I had the same problem as you; it worked after I fixed the annotations.
Indeed, that is correct. One additional tip: after making the necessary annotation changes, do not click the "一键三连" (one-click formatting) button in the formatting interface directly. Instead, run the three steps separately; otherwise the reformatting will not be thorough and errors may still occur.
I have this problem too, except both my VITS training and my GPT training error out.
"runtime\python" GPT_SoVITS/s2_train.py --config "TEMP/tmp_s2.json"
INFO:ll:{'train': {'log_interval': 100, 'eval_interval': 500, 'seed': 1234, 'epochs': 12, 'learning_rate': 0.0001, 'betas': [0.8, 0.99], 'eps': 1e-09, 'batch_size': 8, 'fp16_run': True, 'lr_decay': 0.999875, 'segment_size': 20480, 'init_lr_ratio': 1, 'warmup_epochs': 0, 'c_mel': 45, 'c_kl': 1.0, 'text_low_lr_rate': 0.4, 'pretrained_s2G': 'GPT_SoVITS/pretrained_models/s2G488k.pth', 'pretrained_s2D': 'GPT_SoVITS/pretrained_models/s2D488k.pth', 'if_save_latest': True, 'if_save_every_weights': True, 'save_every_epoch': 4, 'gpu_numbers': '0'}, 'data': {'max_wav_value': 32768.0, 'sampling_rate': 32000, 'filter_length': 2048, 'hop_length': 640, 'win_length': 2048, 'n_mel_channels': 128, 'mel_fmin': 0.0, 'mel_fmax': None, 'add_blank': True, 'n_speakers': 300, 'cleaned_text': True, 'exp_dir': 'logs/ll'}, 'model': {'inter_channels': 192, 'hidden_channels': 192, 'filter_channels': 768, 'n_heads': 2, 'n_layers': 6, 'kernel_size': 3, 'p_dropout': 0.1, 'resblock': '1', 'resblock_kernel_sizes': [3, 7, 11], 'resblock_dilation_sizes': [[1, 3, 5], [1, 3, 5], [1, 3, 5]], 'upsample_rates': [10, 8, 2, 2, 2], 'upsample_initial_channel': 512, 'upsample_kernel_sizes': [16, 16, 8, 2, 2], 'n_layers_q': 3, 'use_spectral_norm': False, 'gin_channels': 512, 'semantic_frame_rate': '25hz', 'freeze_quantizer': True}, 's2_ckpt_dir': 'logs/ll', 'content_module': 'cnhubert', 'save_weight_dir': 'SoVITS_weights', 'name': 'll', 'pretrain': None, 'resume_step': None}
INFO:torch.distributed.distributed_c10d:Added key: store_based_barrier_key:1 to store for rank: 0
INFO:torch.distributed.distributed_c10d:Rank 0: Completed store-based barrier for key:store_based_barrier_key:1 with 1 nodes.
logs/ll/2-name2text.txt
Traceback (most recent call last):
  File "F:\GPT-SoVITS\GPT-SoVITS\GPT_SoVITS\s2_train.py", line 402, in <module>
-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "F:\GPT-SoVITS\GPT-SoVITS\runtime\lib\site-packages\torch\multiprocessing\spawn.py", line 69, in _wrap
    fn(i, *args)
  File "F:\GPT-SoVITS\GPT-SoVITS\GPT_SoVITS\s2_train.py", line 69, in run
    train_dataset = TextAudioSpeakerLoader(hps.data)  ########
  File "F:\GPT-SoVITS\GPT-SoVITS\GPT_SoVITS\module\data_utils.py", line 55, in __init__
    for _ in range(max(2, int(min_num / leng))):
ZeroDivisionError: division by zero
> I ran into this problem while training an English model. Changing |ZH| to |en| in the text annotation file and re-running the training-set formatting fixed it.
> I had the same problem as you; it worked after I fixed the annotations.
Could you explain in more detail? Thanks.
> I ran into this problem while training an English model. Changing |ZH| to |en| in the text annotation file and re-running the training-set formatting fixed it.
Same for me; it worked after I changed the annotations.
Do you mean the annotations inside the .list file?
Replace |ZH| with |en| in every line of the .list annotations. That said, newer versions seem to be case-insensitive about the language tag, so I don't know whether this fixes all problems of this kind.
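If you need to apply that change in bulk, here is a hypothetical one-off script (it assumes the usual wav_path|speaker|language|text line format; the path is an example, and back the file up first):

from pathlib import Path

src = Path("output/asr_opt/slicer_opt.list")  # hypothetical location of your .list
lines = src.read_text(encoding="utf-8").splitlines()
fixed = []
for line in lines:
    parts = line.split("|")
    if len(parts) == 4 and parts[2].upper() == "ZH":
        parts[2] = "en"  # the fix reported above
    fixed.append("|".join(parts))
src.write_text("\n".join(fixed) + "\n", encoding="utf-8")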
Yes, exactly! That was the problem! Thanks to everyone, it's solved now.
Were you training Chinese or English? I changed my Chinese tags to lowercase and still get the division-by-zero error.
Check your .list file carefully; during training, most errors come from problems in it.
Could you attach a screenshot of a .list file that trains successfully? Just comparing mine against the example on the project page, I can't really see what's wrong.
@Liu-yixi this one may help:
> Indeed, that is correct. One additional tip: after making the necessary annotation changes, do not click the "一键三连" (one-click formatting) button in the formatting interface directly. Instead, run the three steps separately; otherwise the reformatting will not be thorough and errors may still occur.
It seems this issue is resolved; pay attention to the training-set .list file format.
My problem was with a Chinese training set. In the end I stopped using 一键三连 (one-click formatting) and instead clicked each of the steps individually, several times, and somehow it started working.
GPT fine-tuning errors out for me, although 1Ba-SoVITS training works fine. The GPT training error:
I have a feeling the previous step, training-set formatting with SSL extraction enabled, was already slightly off; I don't know whether that's related.