Artrajz / vits-simple-api

A simple VITS HTTP API, developed by extending Moegoe with additional features.
GNU Affero General Public License v3.0
777 stars, 116 forks

Update GPT-SoVITS support #146

Closed Artrajz closed 7 months ago

Artrajz commented 7 months ago

Lemondogdog commented 6 months ago

Does the newly added GPT-SoVITS speech synthesis also support GPU acceleration?

Artrajz commented 6 months ago

Yes, it is supported.

Lemondogdog commented 6 months ago

Could you check whether I have written the model load paths in the correct format? [Screenshot 2024-02-28 194232]

With this configuration, double-clicking start.bat produces the following error. What is the cause?

INFO:root:Loading config...
Traceback (most recent call last):
  File "G:\vits-simple-api-windows-gpu-v0.6.9\config.py", line 445, in load_config
    loaded_config = yaml.safe_load(f)
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\yaml\__init__.py", line 125, in safe_load
    return load(stream, SafeLoader)
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\yaml\__init__.py", line 81, in load
    return loader.get_single_data()
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\yaml\constructor.py", line 49, in get_single_data
    node = self.get_single_node()
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\yaml\composer.py", line 36, in get_single_node
    document = self.compose_document()
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\yaml\composer.py", line 55, in compose_document
    node = self.compose_node(None, None)
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\yaml\composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\yaml\composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\yaml\composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\yaml\composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\yaml\composer.py", line 64, in compose_node
    if self.check_event(AliasEvent):
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\yaml\parser.py", line 98, in check_event
    self.current_event = self.state()
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\yaml\parser.py", line 449, in parse_block_mapping_value
    if not self.check_token(KeyToken, ValueToken, BlockEndToken):
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\yaml\scanner.py", line 116, in check_token
    self.fetch_more_tokens()
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\yaml\scanner.py", line 215, in fetch_more_tokens
    return self.fetch_block_entry()
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\yaml\scanner.py", line 491, in fetch_block_entry
    raise ScannerError(None, None,
yaml.scanner.ScannerError: sequence entries are not allowed here
  in "G:\vits-simple-api-windows-gpu-v0.6.9\config.yaml", line 25, column 11
ERROR:root:None
Building prefix dict from the default dictionary ...
DEBUG:jieba:Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\67523\AppData\Local\Temp\jieba.cache
DEBUG:jieba:Loading model from cache C:\Users\67523\AppData\Local\Temp\jieba.cache
Loading model cost 0.775 seconds.
DEBUG:jieba:Loading model cost 0.775 seconds.
Prefix dict has been built successfully.
DEBUG:jieba:Prefix dict has been built successfully.
[nltk_data] Error loading averaged_perceptron_tagger: <urlopen error
[nltk_data]     [Errno 11004] getaddrinfo failed>
[nltk_data] Error loading cmudict: <urlopen error [Errno 11004]
[nltk_data]     getaddrinfo failed>
Traceback (most recent call last):
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\nltk\corpus\util.py", line 84, in load
    root = nltk.data.find(f"{self.subdir}/{zip_name}")
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\nltk\data.py", line 583, in find
    raise LookupError(resource_not_found)
LookupError:


Resource cmudict not found. Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('cmudict')

For more information see: https://www.nltk.org/data.html

Attempted to load corpora/cmudict.zip/cmudict/

Searched in:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "G:\vits-simple-api-windows-gpu-v0.6.9\app.py", line 10, in <module>
    from tts_app.frontend.views import frontend
  File "G:\vits-simple-api-windows-gpu-v0.6.9\tts_app\frontend\views.py", line 3, in <module>
    from tts_app.model_manager import model_manager
  File "G:\vits-simple-api-windows-gpu-v0.6.9\tts_app\model_manager.py", line 1, in <module>
    from manager.ModelManager import ModelManager
  File "G:\vits-simple-api-windows-gpu-v0.6.9\manager\ModelManager.py", line 15, in <module>
    from bert_vits2 import Bert_VITS2
  File "G:\vits-simple-api-windows-gpu-v0.6.9\bert_vits2\__init__.py", line 1, in <module>
    from bert_vits2.bert_vits2 import Bert_VITS2
  File "G:\vits-simple-api-windows-gpu-v0.6.9\bert_vits2\bert_vits2.py", line 13, in <module>
    from bert_vits2.text.cleaner import clean_text
  File "G:\vits-simple-api-windows-gpu-v0.6.9\bert_vits2\text\cleaner.py", line 1, in <module>
    from bert_vits2.text import chinese, japanese, english, cleaned_text_to_sequence, japanese_v111, chinese_v100, \
  File "G:\vits-simple-api-windows-gpu-v0.6.9\bert_vits2\text\english.py", line 11, in <module>
    _g2p = G2p()
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\g2p_en\g2p.py", line 71, in __init__
    self.cmu = cmudict.dict()
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\nltk\corpus\util.py", line 121, in __getattr__
    self.load()
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\nltk\corpus\util.py", line 86, in load
    raise e
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\nltk\corpus\util.py", line 81, in load
    root = nltk.data.find(f"{self.subdir}/{self.name}")
  File "G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\site-packages\nltk\data.py", line 583, in find
    raise LookupError(resource_not_found)
LookupError:


Resource cmudict not found. Please use the NLTK Downloader to obtain the resource:

  >>> import nltk
  >>> nltk.download('cmudict')

For more information see: https://www.nltk.org/data.html

Attempted to load corpora/cmudict

Searched in:

Press any key to continue . . .

Artrajz commented 6 months ago

The expected format is:

tts_config:
  auto_load: false
  models:
  - config_path: model1/config.json
    model_path: model1/G_1000.pth
  - config_path: model2/config.json
    model_path: model2/G_1000.pth
  # For GPT-SoVITS, use:
  - sovits_path: gpt_sovits1/model1_e8_s11536.pth
    gpt_path: gpt_sovits1/model1-e15.ckpt
  - sovits_path: gpt_sovits2/model2_e8_s11536.pth
    gpt_path: gpt_sovits2/model2-e15.ckpt
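The two entry shapes in the example above can be checked mechanically once the YAML has been parsed. The following is only a minimal sketch, not code from vits-simple-api: the function name `classify_model_entry` and the labels it returns are made up for illustration, and it assumes each item of `tts_config.models` carries exactly one of the two key pairs shown.

```python
from typing import Dict

# Key pairs copied from the example config; the labels returned below are
# hypothetical names used only in this sketch.
CONFIG_MODEL_KEYS = {"config_path", "model_path"}
GPT_SOVITS_KEYS = {"sovits_path", "gpt_path"}


def classify_model_entry(entry: Dict[str, str]) -> str:
    """Return a label for one item of tts_config.models, or raise ValueError."""
    keys = set(entry)
    if keys == CONFIG_MODEL_KEYS:
        return "config+model"
    if keys == GPT_SOVITS_KEYS:
        return "gpt_sovits"
    raise ValueError(f"unexpected keys in model entry: {sorted(keys)}")


# The parsed form of two entries from the example above.
models = [
    {"config_path": "model1/config.json", "model_path": "model1/G_1000.pth"},
    {"sovits_path": "gpt_sovits1/model1_e8_s11536.pth",
     "gpt_path": "gpt_sovits1/model1-e15.ckpt"},
]
labels = [classify_model_entry(m) for m in models]
```

An entry that mixes keys from both shapes, for example `config_path` together with `gpt_path`, is rejected by this sketch; whether the real loader behaves the same way is not stated in the thread.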

The error says that nltk_data.zip is missing.

Download it and extract it to any one of the locations listed in the log:

Searched in:
- 'C:\Users\67523/nltk_data'
- 'G:\vits-simple-api-windows-gpu-v0.6.9\py310\nltk_data'
- 'G:\vits-simple-api-windows-gpu-v0.6.9\py310\share\nltk_data'
- 'G:\vits-simple-api-windows-gpu-v0.6.9\py310\lib\nltk_data'
- 'C:\Users\67523\AppData\Roaming\nltk_data'
- 'C:\nltk_data'
- 'D:\nltk_data'
- 'E:\nltk_data'
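To confirm that the extracted data ended up where it will be found, the listed directories can be probed for the resource named in the log (`corpora/cmudict`). This is a rough standard-library sketch of that check and is not how NLTK itself performs the lookup; the helper name `find_resource` is hypothetical.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_resource(search_dirs: Iterable[str],
                  resource: str = "corpora/cmudict") -> Optional[Path]:
    """Return the first search directory that holds the resource, else None.

    Both an unpacked folder (corpora/cmudict) and a zip archive next to it
    (corpora/cmudict.zip) count as a hit, matching the two names in the log.
    """
    for directory in search_dirs:
        base = Path(directory)
        if (base / resource).exists() or (base / (resource + ".zip")).exists():
            return base
    return None
```

Calling it with the paths from the "Searched in:" list returns the directory that contains the data, or `None` if the archive was extracted somewhere else or one folder level too deep.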

Lemondogdog commented 6 months ago

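If I don't fill in the path manually and just put the sovits model in a folder under data\models, will it also be loaded automatically? And sovits needs a reference audio as well; does that go in the same place?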

Artrajz commented 6 months ago

"If I don't fill in the path manually and just put the sovits model in a folder under data\models, will it also be loaded automatically?"

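GPT-SoVITS models placed under data\models are loaded automatically. With auto_load: true, putting the .pth and .ckpt model files in the same folder is enough for them to be loaded. If you really do mean sovits models, the API still does not support that type of model.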
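The pairing rule described here (a `.pth` and a `.ckpt` in the same folder) can be pictured as a directory scan. The sketch below is an illustration under that assumption only; it is not the project's loader, the name `find_model_pairs` is hypothetical, and how the real code chooses among several files in one folder is not stated in the thread.

```python
from pathlib import Path
from typing import Dict, List


def find_model_pairs(models_dir: str) -> List[Dict[str, str]]:
    """List subfolders of models_dir that hold both a .pth and a .ckpt file."""
    pairs = []
    for folder in sorted(p for p in Path(models_dir).iterdir() if p.is_dir()):
        pth_files = sorted(folder.glob("*.pth"))
        ckpt_files = sorted(folder.glob("*.ckpt"))
        if pth_files and ckpt_files:
            # Take the first of each; this choice is an assumption of the sketch.
            pairs.append({
                "sovits_path": str(pth_files[0]),
                "gpt_path": str(ckpt_files[0]),
            })
    return pairs
```

A folder that contains only one of the two file types is skipped, which matches the requirement that both files sit together.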

"And sovits needs a reference audio as well; does that go in the same place?"

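The reference audio has to be specified separately; it is not loaded together with the model. You can fill it in ahead of time under presets, or upload it when making the call.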

Lemondogdog commented 6 months ago

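Got it. I noticed that if the reference audio carries emotion, the sovits synthesis result carries that emotion too. Could a future update let the API request freely specify the path of the reference audio (reference_audio)? That would make it possible to control the emotion of the sovits output flexibly.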

Artrajz commented 6 months ago

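The reference audio can already be chosen freely. If uploading it manually is a hassle, you can use reference audio presets: set up one reference audio per emotion in advance, then select that preset when making the call.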

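presets:
    default: # keep one entry named default; it must not be changed
      refer_wav_path: null # reference audio path
      prompt_text: null # text spoken in the reference audio
      prompt_lang: auto # language of the reference audio
    default2: # entries from here on can be renamed, and more can be added
      refer_wav_path: null
      prompt_text: null
      prompt_lang: auto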
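The "one preset per emotion" idea amounts to a lookup by name over the parsed presets mapping. The sketch below shows that lookup in isolation; the preset names `happy` and `sad`, the file names, the prompt texts, and the helper `select_preset` are all made up for illustration and say nothing about how the API resolves presets internally.

```python
from typing import Dict, Optional

Preset = Dict[str, Optional[str]]

# Hypothetical presets mirroring the YAML structure above.
presets: Dict[str, Preset] = {
    "default": {"refer_wav_path": None, "prompt_text": None, "prompt_lang": "auto"},
    "happy": {"refer_wav_path": "happy.wav", "prompt_text": "What a great day!",
              "prompt_lang": "auto"},
    "sad": {"refer_wav_path": "sad.wav", "prompt_text": "I miss the old days.",
            "prompt_lang": "auto"},
}


def select_preset(available: Dict[str, Preset], name: str) -> Preset:
    """Return the preset called `name`; raise KeyError listing the valid names."""
    if name not in available:
        raise KeyError(f"unknown preset {name!r}; available: {sorted(available)}")
    return available[name]


chosen = select_preset(presets, "happy")
```

Switching the emotion of the output then only requires passing a different preset name, without touching the audio files themselves.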

Lemondogdog commented 6 months ago

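Got it, so I set up the reference audios I need as presets in advance, and then the API call only has to change the preset name. These are all edited and added at the very end of config.yaml, right? Thanks for the explanation.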

Lemondogdog commented 6 months ago

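I think I found a bug. After changing the values under default2 (refer_wav_path, prompt_text, prompt_lang) and generating directly from the web page, the audio synthesized with default2's reference audio seems to contain nothing but a sighing sound. If I instead load a new reference audio manually with the Browse button of reference_audio on the web page and then generate, there is no problem.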

Artrajz commented 6 months ago

It works fine in my tests, both when modifying default2 and when adding a new preset. How did you set it up?

Lemondogdog commented 6 months ago

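It works now. I had filled in the first entry incorrectly: all the values of the first entry have to be left as they were, whereas I thought only its name had to be kept. One more question: if I want the reference audio path to accept relative paths, where would I change that? That way I would not have to re-enter the reference audio paths if I move the project to another drive someday.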

Artrajz commented 6 months ago

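"If I want the reference audio path to accept relative paths, where would I change that? That way I would not have to re-enter the reference audio paths if I move the project to another drive someday."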

I pushed an update for this request: https://github.com/Artrajz/vits-simple-api/commit/fca25258ca46e1465e6ae1d1785dce7052572c68 Relative paths also start from the data folder. For example, for the audio file data/reference_audio/1.wav, set refer_wav_path: reference_audio/1.wav
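The rule that relative paths start from the data folder can be sketched with `pathlib` as follows. This is an illustration of the rule as described in this comment, not the code from the linked commit; the function name `resolve_refer_wav_path`, the default value of `data_dir`, and the sample path are assumptions.

```python
from pathlib import Path


def resolve_refer_wav_path(refer_wav_path: str, data_dir: str = "data") -> Path:
    """Resolve a reference audio path; relative paths are taken from data_dir."""
    path = Path(refer_wav_path)
    # Absolute paths are returned unchanged; relative ones are joined to data_dir.
    return path if path.is_absolute() else Path(data_dir) / path


# Hypothetical relative path, resolved against the default data folder.
resolved = resolve_refer_wav_path("emotions/happy.wav")
```

Because only the part below the data folder is stored in the preset, moving the whole project to another drive leaves the configured paths valid.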

I also updated the preset selection rules: the default preset can now be freely modified or even deleted. https://github.com/Artrajz/vits-simple-api/commit/f91d17d7ad78558d777960b730aae9f123763bbd