Quality got much worse after the Feb 2-4 changes

AnonymousmousCoder commented 9 months ago

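I sync the latest commits every day and selectively apply the changes.

When I tested today, the same base model and the same reference audio gave much, much worse results. A few days ago everything was still fine, and the output closely matched the reference audio in style and emotion. Today's inference output comes out one character at a time, like a foreigner speaking Chinese.

I think this change in quality is caused by text/chinese.py or by the max/min newly added to inference_webui.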

AnonymousmousCoder commented 9 months ago

I tested on both Windows and Linux, and the quality really is much worse.

AnonymousmousCoder commented 9 months ago

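I first rolled back to commit 6410628a5fd17380b5c3bc91750508daccc1184c and the result was still bad, with an odd, sarcastic-sounding intonation. Then I rolled back to the slightly earlier commit ab9849344cb79b8a9842d18426b3321f4b9a07b1, and the quality was good.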

RVC-Boss commented 9 months ago

I'll look into it.

RVC-Boss commented 9 months ago

@AnonymousmousCoder Could you narrow the range down further by bisecting?
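
For readers unfamiliar with the idea, here is a minimal sketch of bisecting a commit range. It assumes the commits are ordered oldest to newest, the first is known good, the last is known bad, and quality does not recover once it has regressed. The function name `first_bad_commit`, the `older`/`newer` placeholders, and the `known_bad` set (a stand-in for actually listening to the output) are hypothetical.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest bad commit in an oldest-to-newest list."""
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


# Hypothetical history; "older" and "newer" are placeholders, not real hashes.
history = ["older", "dba1a74", "9286a27", "3ebff70", "newer"]
known_bad = {"3ebff70", "newer"}  # stand-in for a manual listening test
culprit = first_bad_commit(history, lambda c: c in known_bad)
print(culprit)
```

Each step halves the remaining range, so even a few dozen commits need only a handful of listening tests.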

RVC-Boss commented 9 months ago

Or send me the model, the reference audio, and the two texts so I can test it myself.

AnonymousmousCoder commented 9 months ago

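I narrowed down where the quality changes:

...: bad
tested 3ebff70b71580ee1f97b3238c9442cbc5aef47c7: bad
tested 9286a27ad3608cf81ef122c3b06a681765e7490e: good
tested dba1a74ccb0cf19a1b4eb93faf11d4ec2b1fc5d7: good
...: good

The reference audio text (generated with the Microsoft TTS voice Yunze) is: "一道闪电划破天际,把整个圣地照耀如白昼一样,经久不息" (Yunze, middle-aged male, disgruntled style -0.8, speaking rate -0.85)

The input text is: "他不是这里的土著，而是在几个月前穿越过来的。当时，他看似稳如老狗，实际上慌得一批。因为这里，居然是洪荒世界；而且根据日子推算，不久后就是封神榜之战！"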

RVC-Boss commented 9 months ago

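https://github.com/RVC-Boss/GPT-SoVITS/commit/3ebff70b71580ee1f97b3238c9442cbc5aef47c7 I looked at this commit. It changed the language options from Chinese-English mixed (labeled "Chinese") + Japanese-English mixed (labeled "Japanese") + English to Chinese (the old pure Chinese) + Japanese (the old pure Japanese) + Chinese-English mixed + Japanese-English mixed + multilingual automatic segmentation. @AnonymousmousCoder Which target language did you select in each of your two tests, on 3ebff70 and on 9286a27?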

RVC-Boss commented 9 months ago

Is the model the default zero-shot one?

andylin12 commented 9 months ago

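The implementation below has a problem: the language argument that is passed in has no effect, and text_language is used everywhere instead. (The text it references should also be passed in as a parameter.)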

def get_bert_final(phones, word2ph, norm_text,language,device):
    # Problem: every branch tests text_language, not the language parameter,
    # so the value passed by the caller is ignored.
    if text_language == "en":
        bert = get_bert_inf(phones, word2ph, norm_text, text_language)
    elif text_language in {"zh", "ja","auto"}:
        # Problem: text is not a parameter of this function either.
        bert = nonen_get_bert_inf(text, text_language)
    elif text_language == "all_zh":
        bert = get_bert_feature(norm_text, word2ph).to(device)
    else:
        bert = torch.zeros((1024, len(phones))).to(device)
    return bert

And here the prompt_language that is passed in has no effect. The problem is not visible when the prompt_language and text_language options are the same:

bert1=get_bert_final(phones1, word2ph1, norm_text1,prompt_language,device).to(dtype)

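But if the poster above set both to Chinese, this should not be the cause.

I am on the latest commit, and Chinese-to-Chinese works fine. However, I use my own script, which is based on the older logic with the latest changes ported over; the one place where my port differs is the code above.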
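
A minimal sketch of the kind of fix described above, assuming the dispatch only needs to branch on its own `language` argument and receive the raw `text` explicitly. The three extractor functions are hypothetical stand-ins that return a label instead of a tensor, so the example runs without torch or the real models; they are not the repository's implementations, and the changed signature is an assumption.

```python
# Hypothetical stand-ins for the real feature extractors. Each returns a
# label so the chosen branch is visible without loading any model.
def get_bert_inf(phones, word2ph, norm_text, language):
    return f"get_bert_inf[{language}]"


def nonen_get_bert_inf(text, language):
    return f"nonen_get_bert_inf[{language}]"


def get_bert_feature(norm_text, word2ph):
    return "get_bert_feature"


def get_bert_final(phones, word2ph, text, norm_text, language):
    # Branch on the parameter only; nothing is read from module globals.
    if language == "en":
        return get_bert_inf(phones, word2ph, norm_text, language)
    if language in {"zh", "ja", "auto"}:
        return nonen_get_bert_inf(text, language)
    if language == "all_zh":
        return get_bert_feature(norm_text, word2ph)
    return f"zeros[1024 x {len(phones)}]"  # placeholder for the zero tensor


# Prompt and target use different languages: each call follows its own argument.
prompt_branch = get_bert_final([1, 2], [2], "hello", "hello", "en")
target_branch = get_bert_final([1, 2, 3], [1, 2], "你好", "你好", "all_zh")
print(prompt_branch, target_branch)
```

With this shape, a prompt in one language and a target in another each take their own branch, which is the case that went wrong when both calls read the same outer variable.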

AnonymousmousCoder commented 9 months ago

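Yes, the model is the official default one. Both the input and output language options are set to Chinese, unchanged. I am using the inference webui.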

windkind88 commented 9 months ago

So 9286a27 is currently your latest stable version?

AnonymousmousCoder commented 9 months ago

Yes.

Stanley-baby commented 9 months ago

So what should we do now? Which model should I use?

RVC-Boss commented 9 months ago

@AnonymousmousCoder Is it working normally now? I fixed inference_webui.py. Edit: it looks like that was not the problem. I will test with Yunze tomorrow.

AnonymousmousCoder commented 9 months ago

In my own test cases, the best version is ab9849344cb79b8a9842d18426b3321f4b9a07b1. Testing with Yunze, the cloned voice is practically identical to the reference.

Stanley-baby commented 9 months ago

Where can I download this model?

RVC-Boss commented 9 months ago

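@AnonymousmousCoder Fixed, just update inference_webui.py! (To reproduce the bug, select Chinese-English mixed for both the reference and the target.)

After investigating: several newly added functions take text_language as a parameter, but their bodies actually used (a wrong variable name) language, which is not among their parameters. That caused the mix-up (surprisingly, it ran instead of raising an error), and the BERT features were all zeros.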
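
On why this ran instead of raising an error: a name that is not local to a function is looked up in the module's globals when the function is called. A misspelled parameter therefore silently picks up a global of that name if one happens to exist. The sketch below is illustrative only; `pick_branch`, `pick_branch_strict`, `lang_code`, and the value `"Chinese"` are hypothetical and not taken from inference_webui.py.

```python
language = "Chinese"  # hypothetical module-level value, e.g. a UI label


def pick_branch(text_language):
    # Typo: the body reads the global `language`, not the parameter.
    return "bert" if language == "zh" else "zeros"


result = pick_branch("zh")
print(result)  # the argument "zh" is ignored, so the zero branch is taken


def pick_branch_strict(text_language):
    # Same typo, but no global named `lang_code` exists.
    return "bert" if lang_code == "zh" else "zeros"


try:
    pick_branch_strict("zh")
    raised = False
except NameError:
    raised = True  # the mistake only surfaces when no such global exists
print(raised)
```

The first function never fails, it just takes the wrong branch on every call, which matches the all-zero symptom described above.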

aiminsu commented 9 months ago

@RVC-Boss
def get_bert_inf(phones, word2ph, norm_text, language):
    language=language.replace("all_","")
    if language == "zh":
        bert = get_bert_feature(norm_text, word2ph).to(device)#.to(dtype)
    else:
        # Non-Chinese text gets an all-zero placeholder instead of BERT features.
        bert = torch.zeros(
            (1024, len(phones)),
            dtype=torch.float16 if is_half == True else torch.float32,
        ).to(device)
    return bert

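In get_bert_inf, when the language is not Chinese, why can the BERT features be filled with zeros instead of being extracted with a model for that language?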

RVC-Boss / GPT-SoVITS

Quality got much worse after the Feb 2-4 changes #391