PlayVoice / whisper-vits-svc

Core Engine of Singing Voice Conversion & Singing Voice Clone
https://huggingface.co/spaces/maxmax20160403/sovits5.0
MIT License

Training on the new diff branch produces heavy electronic noise in the TensorBoard audio. Is this normal? #111

Open KIKscanf opened 10 months ago

KIKscanf commented 10 months ago

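The mel spectrogram does not look like a human voice spectrum. Could the training set be too small? (4.5 min, but the audio was unpacked from a game, so it has game-asset quality.)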

individualImage

KIKscanf commented 10 months ago

1000 steps, bs=36, lr=2e-4

MaxMax2016 commented 10 months ago

That is not normal. Did you first train the bigvgan-mix-v2 model, and then train the diff branch on top of that trained model?

KIKscanf commented 10 months ago

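Yes. Whether I use the bigvgan-mix-v2 model or the base model pulled from GVC, the result is the same. The loss decreases steadily, to around 0.4, yet training from 1000 to 8000 steps brought no improvement at all.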

MaxMax2016 commented 10 months ago

Is your dataset 4 minutes long? Is the data sensitive? Could you send it to me so I can try it?

KIKscanf commented 10 months ago

Yes, sure. Link: https://pan.baidu.com/s/1GVZ-lgmDOYYMvFM8gmnskA?pwd=KIK1 Extraction code: KIK1

MaxMax2016 commented 10 months ago

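Here are the logs and model from training GradSVC for 150 epochs. GradSVC's timbre similarity is not good enough yet, but the kind of mel spectrogram shown above did not appear. Link: https://pan.baidu.com/s/1eOBXMOt0Bh9DB-HcwZPvEw?pwd=t8nf Extraction code: t8nf

I will try 5.0 again tomorrow.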

KIKscanf commented 10 months ago

GVC works fine, but the 5.0 one may have a problem.

MaxMax2016 commented 10 months ago

Let me send you my diff plugin model to test first. I am still reproducing this myself. Link: https://pan.baidu.com/s/1z5IDJ6Sm7oepv5yg-KZK5w?pwd=4r77 Extraction code: 4r77

While reproducing it, I forgot to set the main model, so I have to start over. When the main model is set correctly, the trainer prints something like this:

python svc_trainer.py --config configs/base.yaml --name plug
Batch size per GPU : 8
----------10----------
2023-09-06 06:31:23,136 - INFO - Start from 32k pretrain model: sovits5.0_1100.pt
post.estimator.spk_mlp.0.weight is not in the checkpoint
post.estimator.spk_mlp.0.bias is not in the checkpoint
post.estimator.spk_mlp.2.weight is not in the checkpoint
post.estimator.spk_mlp.2.bias is not in the checkpoint
post.estimator.mlp.0.weight is not in the checkpoint
post.estimator.mlp.0.bias is not in the checkpoint
post.estimator.mlp.2.weight is not in the checkpoint
post.estimator.mlp.2.bias is not in the checkpoint
post.estimator.downs.0.0.mlp.1.weight is not in the checkpoint
post.estimator.downs.0.0.mlp.1.bias is not in the checkpoint
post.estimator.downs.0.0.block1.block.0.weight is not in the checkpoint
post.estimator.downs.0.0.block1.block.0.bias is not in the checkpoint

In the project code, post is plug. It was renamed later: post -> plug.
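The log above is consistent with a loader that copies every weight it finds in the pretrained main model and leaves the new plug (post) weights at their initial values. Below is a minimal sketch of that pattern, using plain dicts in place of tensors. The helper `merge_pretrained` and the key `main.layer.weight` are hypothetical, and this is not the project's actual loading code.

```python
def merge_pretrained(model_state, checkpoint_state):
    """Return (merged, missing).

    merged takes the checkpoint value wherever the key exists and keeps the
    model's own initial value otherwise; missing lists the keys not found.
    """
    merged = {}
    missing = []
    for key, init_value in model_state.items():
        if key in checkpoint_state:
            merged[key] = checkpoint_state[key]
        else:
            # Same wording as the lines in the training log.
            print(f"{key} is not in the checkpoint")
            missing.append(key)
            merged[key] = init_value
    return merged, missing


# Toy stand-ins for real tensors: strings instead of weights.
model_state = {
    "main.layer.weight": "init",            # hypothetical main-model key
    "post.estimator.mlp.0.weight": "init",  # key names taken from the log
    "post.estimator.mlp.0.bias": "init",
}
checkpoint_state = {"main.layer.weight": "pretrained"}

merged, missing = merge_pretrained(model_state, checkpoint_state)
```

In this sketch, only the post.* keys end up in `missing`, which matches the healthy log. If main-model keys were reported as missing too, or the "Start from 32k pretrain model" line never appeared, that would suggest the main model was not loaded.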

MaxMax2016 commented 10 months ago

If the main model is not set, you get the picture above:

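The mel spectrogram does not look like a human voice spectrum. Could the training set be too small? (4.5 min, but the audio was unpacked from a game, so it has game-asset quality.)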

individualImage

This is what a normal mel spectrogram looks like: plug_mel

KIKscanf commented 10 months ago

I will test it.

Taiwan1912 commented 10 months ago

Untitled. I followed the steps for 5.0 and it works without any problems. 【中山美穂 WANDS《世界中の誰よりきっと》Cover by 岩崎宏美 |Sovits5.0 Bigvgan-mix-v2】 https://www.bilibili.com/video/BV1234y1T7dH/?share_source=copy_web&vd_source=1a855607b0e7432ab1f93855e5b45f7d

KIKscanf commented 10 months ago

Got it. The base model path was wrong.
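Since the root cause was a wrong base model path, a small pre-flight check before launching training may catch this earlier. The following is only a sketch: `check_base_model` is a hypothetical helper, not part of the repository, and the path you pass in is whatever your own config points at.

```python
from pathlib import Path


def check_base_model(path):
    """Raise early if the base model checkpoint is missing or empty."""
    p = Path(path)
    if not p.is_file():
        raise FileNotFoundError(f"base model not found: {p.resolve()}")
    if p.stat().st_size == 0:
        raise ValueError(f"base model file is empty: {p.resolve()}")
    return p
```

Calling it with the configured checkpoint path before starting the trainer turns a silent bad-path run into an immediate error.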