RVC-Boss / GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
MIT License

Chinese input triggers RuntimeError: torch.cat(): expected a non-empty list of Tensors #854

Closed: ILostMyPig closed this issue 4 months ago

ILostMyPig commented 4 months ago

Text to synthesize: 遮住。 Command-line output:

Number of parameter: 77.49M
DEBUG:root:Using proactor: IocpProactor
DEBUG:root:Using proactor: IocpProactor
Running on local URL:  http://0.0.0.0:9872
实际输入的参考文本: 不必了,就算龙女大人的医术通神,对长生种的宿命,恐怕也是无可奈何吧。
实际输入的目标文本: 。遮住。
实际输入的目标文本(切句后): 。遮住。
Building prefix dict from the default dictionary ...
DEBUG:jieba_fast:Building prefix dict from the default dictionary ...
Loading model from cache I:\GPT-SoVITS-beta0306fix2\TEMP\jieba.cache
DEBUG:jieba_fast:Loading model from cache I:\GPT-SoVITS-beta0306fix2\TEMP\jieba.cache
Loading model cost 0.378 seconds.
DEBUG:jieba_fast:Loading model cost 0.378 seconds.
Prefix dict has been built succesfully.
DEBUG:jieba_fast:Prefix dict has been built succesfully.
前端处理后的参考文本:不必了,就算龙女大人的医术通神,对长生种的宿命,恐怕也是无可奈何吧.
实际输入的目标文本(每句): 。遮住。
前端处理后的文本(每句):
Traceback (most recent call last):
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\gradio\routes.py", line 321, in run_predict
    output = await app.blocks.process_api(
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\gradio\blocks.py", line 1006, in process_api
    result = await self.call_function(fn_index, inputs, iterator, request)
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\gradio\blocks.py", line 859, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\gradio\utils.py", line 408, in async_iteration
    return next(iterator)
  File "I:\GPT-SoVITS-beta0306fix2\GPT_SoVITS\inference_webui.py", line 440, in get_tts_wav
    bert2 = get_bert_final(phones2, word2ph2, norm_text2, text_language, device).to(dtype)
  File "I:\GPT-SoVITS-beta0306fix2\GPT_SoVITS\inference_webui.py", line 348, in get_bert_final
    bert = get_bert_feature(text, word2ph).to(device)
  File "I:\GPT-SoVITS-beta0306fix2\GPT_SoVITS\inference_webui.py", line 100, in get_bert_feature
    phone_level_feature = torch.cat(phone_level_feature, dim=0)
RuntimeError: torch.cat(): expected a non-empty list of Tensors
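
For context, the exception itself is ordinary PyTorch behavior: torch.cat raises exactly this message when it is handed an empty list, which is what happens in get_bert_feature when no phone-level features are collected for the sentence (note that 前端处理后的文本(每句) is empty in the log above). A minimal sketch, assuming nothing beyond stock PyTorch:

import torch

phone_level_feature = []  # what get_bert_feature ends up holding when the cleaned text is empty
try:
    torch.cat(phone_level_feature, dim=0)
except RuntimeError as e:
    # "torch.cat(): expected a non-empty list of Tensors" (wording may vary by PyTorch version)
    print(e)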
ILostMyPig commented 4 months ago

Found the cause. In the clean_text_inf(text, language) function in inference_webui.py, the return value of LangSegment.getTexts(text) needs further handling. When synthesizing the word "遮住" with the language set to Chinese, the lang returned by LangSegment.getTexts(text) should be zh, but it actually comes back as ja. My guess is that because the character "遮" also exists in Japanese, the text gets classified as Japanese. Replacing clean_text_inf(text, language) with the code below resolves the problem:

def clean_text_inf(text, language):
    formattext = ""
    # Remember whether the UI explicitly asked for pure Chinese ("all_zh")
    language_tmp_zh = False
    if language == "all_zh":
        language_tmp_zh = True
    language = language.replace("all_","")
    for tmp in LangSegment.getTexts(text):
        # When pure Chinese was requested, override LangSegment's (sometimes wrong) detection
        if language_tmp_zh:
            tmp["lang"] = "zh"
        if language == "ja":
            # For Japanese, also accept segments detected as Chinese (shared kanji)
            if tmp["lang"] == language or tmp["lang"] == "zh":
                formattext += tmp["text"] + " "
            continue
        if tmp["lang"] == language:
            formattext += tmp["text"] + " "
    # Collapse the double spaces introduced by the concatenation above
    while "  " in formattext:
        formattext = formattext.replace("  ", " ")
    phones, word2ph, norm_text = clean_text(formattext, language)
    phones = cleaned_text_to_sequence(phones)
    return phones, word2ph, norm_text
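
To confirm the misdetection described above, LangSegment can be probed directly; the getTexts call and the "lang"/"text" keys are the same ones clean_text_inf already uses, though the exact result depends on the installed LangSegment version. A minimal sketch:

import LangSegment

for seg in LangSegment.getTexts("遮住"):
    # Reportedly prints lang "ja" here even though the text is Chinese,
    # which is what makes clean_text_inf drop the whole sentence.
    print(seg["lang"], seg["text"])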
ILostMyPig commented 4 months ago

The same error came up again when synthesizing other text, so my patch is evidently not comprehensive. I also ran into something odd. Text to synthesize: “呕……呕! Command-line output:

Number of parameter: 77.49M
DEBUG:root:Using proactor: IocpProactor
DEBUG:root:Using proactor: IocpProactor
Running on local URL:  http://0.0.0.0:9872
实际输入的参考文本: 头说得没错,对付不讲道理的家伙,刀尖子就是道理。
实际输入的目标文本: 。“呕……呕!
实际输入的目标文本(切句后): 。
“呕……呕!
Building prefix dict from the default dictionary ...
DEBUG:jieba_fast:Building prefix dict from the default dictionary ...
Dumping model to file cache C:\Users\zzs\AppData\Local\Temp\jieba.cache
DEBUG:jieba_fast:Dumping model to file cache C:\Users\zzs\AppData\Local\Temp\jieba.cache
Loading model cost 0.420 seconds.
DEBUG:jieba_fast:Loading model cost 0.420 seconds.
Prefix dict has been built succesfully.
DEBUG:jieba_fast:Prefix dict has been built succesfully.
前端处理后的参考文本:头说得没错,对付不讲道理的家伙,刀尖子就是道理.
实际输入的目标文本(每句): 。“呕……呕!
前端处理后的文本(每句):
Traceback (most recent call last):
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\gradio\routes.py", line 321, in run_predict
    output = await app.blocks.process_api(
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\gradio\blocks.py", line 1006, in process_api
    result = await self.call_function(fn_index, inputs, iterator, request)
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\gradio\blocks.py", line 859, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\gradio\utils.py", line 408, in async_iteration
    return next(iterator)
  File "I:\GPT-SoVITS-beta0306fix2\GPT_SoVITS\inference_webui.py", line 445, in get_tts_wav
    bert2 = get_bert_final(phones2, word2ph2, norm_text2, text_language, device).to(dtype)
  File "I:\GPT-SoVITS-beta0306fix2\GPT_SoVITS\inference_webui.py", line 353, in get_bert_final
    bert = get_bert_feature(text, word2ph).to(device)
  File "I:\GPT-SoVITS-beta0306fix2\GPT_SoVITS\inference_webui.py", line 100, in get_bert_feature
    phone_level_feature = torch.cat(phone_level_feature, dim=0)
RuntimeError: torch.cat(): expected a non-empty list of Tensors

The odd part: the text 呕。 synthesizes successfully on its own. Logically, then, synthesizing "呕。" twice should at least give me a rough approximation of the "呕……呕。" audio I originally wanted, but that is not what happens. I then changed the slicing method to split by punctuation. Text to synthesize: 呕。呕。 That fails as well, and judging from the command line the text was never actually split? Command-line output:


实际输入的参考文本: 头说得没错,对付不讲道理的家伙,刀尖子就是道理。
实际输入的目标文本: 。呕。呕。
实际输入的目标文本(切句后): 。呕。呕。
前端处理后的参考文本:头说得没错,对付不讲道理的家伙,刀尖子就是道理.
实际输入的目标文本(每句): 。呕。呕。
前端处理后的文本(每句):
Traceback (most recent call last):
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\gradio\routes.py", line 321, in run_predict
    output = await app.blocks.process_api(
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\gradio\blocks.py", line 1006, in process_api
    result = await self.call_function(fn_index, inputs, iterator, request)
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\gradio\blocks.py", line 859, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\anyio\to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\anyio\_backends\_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\anyio\_backends\_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "I:\GPT-SoVITS-beta0306fix2\runtime\lib\site-packages\gradio\utils.py", line 408, in async_iteration
    return next(iterator)
  File "I:\GPT-SoVITS-beta0306fix2\GPT_SoVITS\inference_webui.py", line 441, in get_tts_wav
    bert2 = get_bert_final(phones2, word2ph2, norm_text2, text_language, device).to(dtype)
  File "I:\GPT-SoVITS-beta0306fix2\GPT_SoVITS\inference_webui.py", line 349, in get_bert_final
    bert = get_bert_feature(text, word2ph).to(device)
  File "I:\GPT-SoVITS-beta0306fix2\GPT_SoVITS\inference_webui.py", line 100, in get_bert_feature
    phone_level_feature = torch.cat(phone_level_feature, dim=0)
RuntimeError: torch.cat(): expected a non-empty list of Tensors
ILostMyPig commented 4 months ago

Found the cause, and it is the language detection again. This time it is even stranger: in the _parse_language(words, segment) function in LangSegment.py, language gets classified as he. I am not going to chase down why it ends up as he; it is just too absurd. Since I only need Chinese anyway, I brute-forced the code:

# Third-to-last line of _parse_language(words, segment):
LangSegment._addwords(words,language,text,score)
# Add one line in front of it, so it becomes:
language = "zh"  # force the segment language to Chinese
LangSegment._addwords(words,language,text,score)
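
A less drastic variant of the same idea (a sketch only, not the project's actual fix) would be to fall back to the language selected in the WebUI whenever LangSegment returns a code the text cleaner cannot handle, instead of forcing everything to Chinese; the helper name and the supported set below are assumptions for illustration:

SUPPORTED_LANGS = {"zh", "ja", "en"}  # languages the cleaner is assumed to handle

def normalize_lang(detected_lang, ui_language):
    # Hypothetical helper: trust LangSegment only when its result is usable,
    # otherwise fall back to the language the user picked (e.g. "he" -> "zh").
    return detected_lang if detected_lang in SUPPORTED_LANGS else ui_language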

Anyway, my problem is worked around for now. I will leave this issue open for the developers to close; I hope the language-detection problem gets a proper fix.

KamioRinn commented 4 months ago

This problem was already fixed quite a while ago; please update to the latest code and dependencies.

ILostMyPig commented 4 months ago

This problem was already fixed quite a while ago; please update to the latest code and dependencies.

I am on the 0306fix2 build, so this.........

KamioRinn commented 4 months ago

This problem was already fixed quite a while ago; please update to the latest code and dependencies.

I am on the 0306fix2 build, so this.........

https://github.com/RVC-Boss/GPT-SoVITS/pull/556 The code you quoted has already been changed in this PR.

Neither branch contains that code anymore.

ILostMyPig commented 4 months ago

This problem was already fixed quite a while ago; please update to the latest code and dependencies.

I am on the 0306fix2 build, so this.........

556 The code you quoted has already been changed in this PR.

Neither branch contains that code anymore.

Got it, thank you!