Could you share your models?

jwister commented 1 year ago

As the title says.

Artrajz commented 1 year ago

The models were made by other people. You can find the ones I use here: https://github.com/CjangCjengh/TTSModels https://huggingface.co/spaces/zomehwh/vits-uma-genshin-honkai/tree/main/model

jwister commented 1 year ago

Thanks, I have it running now. My question now is: where do I enable GPU acceleration? Is there a configuration option for it?

Artrajz commented 1 year ago

You need to install CUDA and the GPU build of PyTorch. Once they are installed, the GPU is used automatically.

jwister commented 1 year ago

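I installed them and rebooted, but when I run it again it still does not show GPU acceleration as enabled. I then disabled the integrated graphics, which did not help either. My GPU is an MX250; is there something I need to configure?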

Artrajz commented 1 year ago

Check whether CUDA was installed successfully; for the MX250 you will probably need to find a matching version. As for PyTorch, vits prints the PyTorch version at startup, and only a version of the form x.x.x+cu1xx (where x is a digit) can use CUDA.
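A minimal sketch of that version check, in case it helps. The helper names `is_cuda_build` and `describe_torch` are made up for illustration; the only PyTorch calls used are `torch.__version__` and `torch.cuda.is_available()`, and the import is guarded so the script still runs where torch is missing.

```python
import re


def is_cuda_build(version: str) -> bool:
    """Return True if a PyTorch version string ends in a +cuNNN local tag."""
    return re.search(r"\+cu\d+$", version) is not None


def describe_torch() -> str:
    """Summarise the installed PyTorch build, or say that torch is missing."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    return (
        f"torch:{torch.__version__} "
        f"cuda_build:{is_cuda_build(torch.__version__)} "
        f"cuda_available:{torch.cuda.is_available()}"
    )


if __name__ == "__main__":
    print(describe_torch())
```

A `+cpu` build reports `cuda_build:False` no matter which GPU or CUDA toolkit is present, so in that case the fix is to reinstall PyTorch rather than to change any setting.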

cgnannan commented 1 year ago

Is there an English model for vits at the moment? I searched online for most of the day and could not find one.

Artrajz commented 1 year ago

The original repository, https://github.com/jaywalnut310/vits, has English models, but you need to modify the code slightly and install espeak separately to use them.

Artrajz commented 1 year ago

Commit 305cae870afcaf53d75b8edae5f10e1d8b3d87e9: for it to work, add the following two lines to the corresponding JSON file.

"speakers": ["vctk"],
"symbols":  ["_", ";", ":", ",", ".", "!", "?", "¡", "¿", "—", "…", "\"", "«", "»", "“", "”", " ", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "ɑ", "ɐ", "ɒ", "æ", "ɓ", "ʙ", "β", "ɔ", "ɕ", "ç", "ɗ", "ɖ", "ð", "ʤ", "ə", "ɘ", "ɚ", "ɛ", "ɜ", "ɝ", "ɞ", "ɟ", "ʄ", "ɡ", "ɠ", "ɢ", "ʛ", "ɦ", "ɧ", "ħ", "ɥ", "ʜ", "ɨ", "ɪ", "ʝ", "ɭ", "ɬ", "ɫ", "ɮ", "ʟ", "ɱ", "ɯ", "ɰ", "ŋ", "ɳ", "ɲ", "ɴ", "ø", "ɵ", "ɸ", "θ", "œ", "ɶ", "ʘ", "ɹ", "ɺ", "ɾ", "ɻ", "ʀ", "ʁ", "ɽ", "ʂ", "ʃ", "ʈ", "ʧ", "ʉ", "ʊ", "ʋ", "ⱱ", "ʌ", "ɣ", "ɤ", "ʍ", "χ", "ʎ", "ʏ", "ʑ", "ʐ", "ʒ", "ʔ", "ʡ", "ʕ", "ʢ", "ǀ", "ǁ", "ǂ", "ǃ", "ˈ", "ˌ", "ː", "ˑ", "ʼ", "ʴ", "ʰ", "ʱ", "ʲ", "ʷ", "ˠ", "ˤ", "˞", "↓", "↑", "→", "↗", "↘", "'", "̩", "'", "ᵻ"]

cgnannan commented 1 year ago

Thanks, I'll give it a try.

cgnannan commented 1 year ago

I read the README of the original repository. Do I have to train on a dataset to get an English voice model? I could not find any model file in the repository folders.

"3. Download datasets
Download and extract the LJ Speech dataset, then rename or create a link to the dataset folder: ln -s /path/to/LJSpeech-1.1/wavs DUMMY1
For mult-speaker setting, download and extract the VCTK dataset, and downsample wav files to 22050 Hz. Then rename or create a link to the dataset folder: ln -s /path/to/VCTK-Corpus/downsampled_wavs DUMMY2

4. Build Monotonic Alignment Search and run preprocessing if you use your own datasets."

Artrajz commented 1 year ago

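No. The README provides download links for pretrained models; you can use a pretrained model directly or continue training from it. The JSON files can be found in the repository's configs directory.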

cgnannan commented 1 year ago

I first ran git pull to get your latest code, then loaded the pretrained model and added your two lines to the model's JSON file. Running python app.py then produced the following error:

(fort) E:\Fort\WechatBot\vits-simple-api>python app.py
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/Nene_Nanami_Rong_Tang/1374_epochs.pth' (iteration None)
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/Zero_no_tsukaima/1158_epochs.pth' (iteration None)
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/g/G_953000.pth' (iteration 630)
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/Voistock/547_epochs.pth' (iteration None)
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/ljs/ljs.pth' (iteration 0)
Traceback (most recent call last):
  File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 354, in _check_seekable
    f.seek(f.tell())
AttributeError: 'NoneType' object has no attribute 'seek'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "E:\Fort\WechatBot\vits-simple-api\app.py", line 28, in <module>
    tts = merge_model(app.config["MODEL_LIST"])
  File "E:\Fort\WechatBot\vits-simple-api\utils\merge.py", line 55, in merge_model
    obj = vits(model=i[0], config=i[1])
  File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 54, in __init__
    self.load_model(model, model)
  File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 61, in load_model
    self.hubert = hubert_soft(model)
  File "E:\Fort\WechatBot\vits-simple-api\hubert_model.py", line 217, in hubert_soft
    checkpoint = torch.load(path)
  File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 791, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 276, in _open_file_like
    return _open_buffer_reader(name_or_buffer)
  File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 261, in __init__
    _check_seekable(buffer)
  File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 357, in _check_seekable
    raise_err_msg(["seek", "tell"], e)
  File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 350, in raise_err_msg
    raise type(e)(msg)
AttributeError: 'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.

I put these two lines under "train"; is that the wrong place?

Also, his configs directory has two JSON files for the ljs model; I used ljs_base.json.

I updated MODEL_LIST accordingly as well.

Artrajz commented 1 year ago

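They go at the same level as train, data and model; you can use the other config.json files as a reference. I'll paste a copy here anyway. This is the modified vctk_base.json: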

{
  "train": {
    "log_interval": 200,
    "eval_interval": 1000,
    "seed": 1234,
    "epochs": 10000,
    "learning_rate": 2e-4,
    "betas": [0.8, 0.99],
    "eps": 1e-9,
    "batch_size": 64,
    "fp16_run": true,
    "lr_decay": 0.999875,
    "segment_size": 8192,
    "init_lr_ratio": 1,
    "warmup_epochs": 0,
    "c_mel": 45,
    "c_kl": 1.0
  },
  "data": {
    "training_files":"filelists/vctk_audio_sid_text_train_filelist.txt.cleaned",
    "validation_files":"filelists/vctk_audio_sid_text_val_filelist.txt.cleaned",
    "text_cleaners":["english_cleaners2"],
    "max_wav_value": 32768.0,
    "sampling_rate": 22050,
    "filter_length": 1024,
    "hop_length": 256,
    "win_length": 1024,
    "n_mel_channels": 80,
    "mel_fmin": 0.0,
    "mel_fmax": null,
    "add_blank": true,
    "n_speakers": 109,
    "cleaned_text": true
  },
  "model": {
    "inter_channels": 192,
    "hidden_channels": 192,
    "filter_channels": 768,
    "n_heads": 2,
    "n_layers": 6,
    "kernel_size": 3,
    "p_dropout": 0.1,
    "resblock": "1",
    "resblock_kernel_sizes": [3,7,11],
    "resblock_dilation_sizes": [[1,3,5], [1,3,5], [1,3,5]],
    "upsample_rates": [8,8,2,2],
    "upsample_initial_channel": 512,
    "upsample_kernel_sizes": [16,16,4,4],
    "n_layers_q": 3,
    "use_spectral_norm": false,
    "gin_channels": 256
  },
  "speakers": ["vctk"],
  "symbols":  ["_", ";", ":", ",", ".", "!", "?", "¡", "¿", "—", "…", "\"", "«", "»", "“", "”", " ", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "ɑ", "ɐ", "ɒ", "æ", "ɓ", "ʙ", "β", "ɔ", "ɕ", "ç", "ɗ", "ɖ", "ð", "ʤ", "ə", "ɘ", "ɚ", "ɛ", "ɜ", "ɝ", "ɞ", "ɟ", "ʄ", "ɡ", "ɠ", "ɢ", "ʛ", "ɦ", "ɧ", "ħ", "ɥ", "ʜ", "ɨ", "ɪ", "ʝ", "ɭ", "ɬ", "ɫ", "ɮ", "ʟ", "ɱ", "ɯ", "ɰ", "ŋ", "ɳ", "ɲ", "ɴ", "ø", "ɵ", "ɸ", "θ", "œ", "ɶ", "ʘ", "ɹ", "ɺ", "ɾ", "ɻ", "ʀ", "ʁ", "ɽ", "ʂ", "ʃ", "ʈ", "ʧ", "ʉ", "ʊ", "ʋ", "ⱱ", "ʌ", "ɣ", "ɤ", "ʍ", "χ", "ʎ", "ʏ", "ʑ", "ʐ", "ʒ", "ʔ", "ʡ", "ʕ", "ʢ", "ǀ", "ǁ", "ǂ", "ǃ", "ˈ", "ˌ", "ː", "ˑ", "ʼ", "ʴ", "ʰ", "ʱ", "ʲ", "ʷ", "ˠ", "ˤ", "˞", "↓", "↑", "→", "↗", "↘", "'", "̩", "'", "ᵻ"]
}
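For anyone unsure whether the two keys ended up in the right place, here is a small sketch that checks a loaded config. The function name `find_placement_problems` is made up for illustration and is not part of vits-simple-api; it only assumes the layout shown above, with "speakers" and "symbols" sitting next to "train", "data" and "model".

```python
import json

REQUIRED_TOP_LEVEL = ("speakers", "symbols")
SECTIONS = ("train", "data", "model")


def find_placement_problems(config: dict) -> list:
    """List human-readable problems with where "speakers"/"symbols" sit."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in config:
            problems.append(f'"{key}" is missing at the top level')
        for section in SECTIONS:
            nested = config.get(section)
            if isinstance(nested, dict) and key in nested:
                problems.append(
                    f'"{key}" is nested under "{section}"; move it to the top level'
                )
    return problems


if __name__ == "__main__":
    # A deliberately wrong config: both keys were placed under "train".
    misplaced = json.loads(
        '{"train": {"speakers": ["vctk"], "symbols": ["_"]}, "data": {}, "model": {}}'
    )
    for problem in find_placement_problems(misplaced):
        print(problem)
```

An empty list means both keys are at the top level and not duplicated inside a section.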

cgnannan commented 1 year ago

The service started successfully. I sent a speech request and got an error saying espeak is not installed:

(fort) E:\Fort\WechatBot\vits-simple-api>python app.py
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/ljs/ljs.pth' (iteration 0)
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/vctk/vctk.pth' (iteration 0)
INFO:vits-simple-api:torch:2.0.1+cpu cuda_available:False
INFO:vits-simple-api:device:cpu device.type:cpu
INFO:vits-simple-api:Loaded 2 speakers
INFO:apscheduler.scheduler:Added job "clean_task" to job store "default"
DEBUG:apscheduler.scheduler:Looking for jobs to run
DEBUG:apscheduler.scheduler:Next wakeup is due at 2023-05-16 01:16:36.449880+08:00 (in 3599.999002 seconds)
Serving Flask app 'app'
Debug mode: off
INFO:werkzeug:WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
Running on all addresses (0.0.0.0)
Running on http://127.0.0.1:23456
Running on http://192.168.1.52:23456
INFO:werkzeug:Press CTRL+C to quit
INFO:werkzeug:127.0.0.1 - - [16/May/2023 00:16:47] "GET /voice/speakers HTTP/1.1" 200 -
INFO:vits-simple-api:[VITS] id:0 format:wav lang:auto length:1.0 noise:0.667 noisew:0.8
INFO:vits-simple-api:[VITS] len:41 text：Good evening! How can I assist you today?
DEBUG:vits-simple-api:[EN]Good evening! How can I assist you today?[EN]
ERROR:app:Exception on /voice [POST]
Traceback (most recent call last):
  File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\flask\app.py", line 2528, in wsgi_app
    response = self.full_dispatch_request()
  File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\flask\app.py", line 1825, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\flask\app.py", line 1823, in full_dispatch_request
    rv = self.dispatch_request()
  File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\flask\app.py", line 1799, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "E:\Fort\WechatBot\vits-simple-api\app.py", line 38, in check_api_key
    return func(*args, **kwargs)
  File "E:\Fort\WechatBot\vits-simple-api\app.py", line 113, in voice_vits_api
    output = tts.vits_infer({"text": text,
  File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 435, in vits_infer
    audio = voice_obj.get_audio(voice, auto_break=True)
  File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 205, in get_audio
    self.get_infer_param(text=sentence, speaker_id=speaker_id, length=length, noise=noise,
  File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 131, in get_infer_param
    stn_tst = self.get_cleaned_text(text, self.hps_ms, cleaned=cleaned)
  File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 71, in get_cleaned_text
    text_norm = text_to_sequence(text, hps.symbols, hps.data.text_cleaners)
  File "E:\Fort\WechatBot\vits-simple-api\text\__init__.py", line 17, in text_to_sequence
    clean_text = _clean_text(text, cleaner_names)
  File "E:\Fort\WechatBot\vits-simple-api\text\__init__.py", line 31, in _clean_text
    text = cleaner(text)
  File "E:\Fort\WechatBot\vits-simple-api\text\cleaners.py", line 62, in english_cleaners2
    phonemes = phonemize(text, language='en-us', backend='espeak', strip=True, preserve_punctuation=True,
  File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\phonemizer\phonemize.py", line 206, in phonemize
    phonemizer = BACKENDS[backend](
  File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\phonemizer\backend\espeak\espeak.py", line 45, in __init__
    super().__init__(
  File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\phonemizer\backend\espeak\base.py", line 39, in __init__
    super().__init__(
  File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\phonemizer\backend\base.py", line 77, in __init__
    raise RuntimeError(  # pragma: nocover
RuntimeError: espeak not installed on your system

I have already installed espeak, and I have opened it too.

Artrajz commented 1 year ago

Filling in the path to the espeak DLL in config.py solves this. For example, on Windows the path is C:\Program Files\eSpeak NG\libespeak-ng.dll
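The name of the setting in config.py is not shown in this thread, so the sketch below takes a different route and should be read as an assumption: it sets `PHONEMIZER_ESPEAK_LIBRARY`, the environment variable that the phonemizer package itself reads to locate the espeak library. The helper name `point_phonemizer_at_espeak` is made up for illustration. The variable has to be set before phonemizer creates its espeak backend.

```python
import os
from pathlib import Path


def point_phonemizer_at_espeak(dll_path: str) -> bool:
    """Export PHONEMIZER_ESPEAK_LIBRARY if dll_path exists.

    Returns True when the variable was set, False when the file is missing.
    """
    if not Path(dll_path).is_file():
        return False
    os.environ["PHONEMIZER_ESPEAK_LIBRARY"] = dll_path
    return True


if __name__ == "__main__":
    # Path quoted from the reply above; a raw string keeps the backslashes intact.
    dll = r"C:\Program Files\eSpeak NG\libespeak-ng.dll"
    if point_phonemizer_at_espeak(dll):
        print("espeak library configured")
    else:
        print(f"not found: {dll}")
```

A False result points to the DLL not being where expected, for example because a different espeak package was installed.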

cgnannan commented 1 year ago

espeak

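It looks like I downloaded the wrong version of espeak: there is no libespeak-ng.dll in the corresponding directory. I searched online and could not find an eSpeak NG .exe installer for Windows 10.

I only found the Python dependency library and the GitHub repository.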

Artrajz commented 1 year ago

Here is an installer that works on Windows 10: https://github.com/espeak-ng/espeak-ng/releases/download/1.51/espeak-ng-X64.msi

cgnannan commented 1 year ago

It works! The service now receives requests and returns audio. I tried id 0 and id 1 and both are female voices. Are the ljs and vctk models both female? Where should I look for an English male voice?

Artrajz commented 1 year ago

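I checked, and the vctk JSON has 109 speakers. You can fill in the speaker names in the JSON with whatever names you like, for example "speakers": ["vctk1","vctk2","vctk3"], and then pick from them. I tried it, and in this model id 1 is a male voice.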
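Typing 109 names by hand is tedious, so here is a rough sketch that generates them. The helper `numbered_speakers` and the "vctk" prefix are only illustrative, since the names are arbitrary labels.

```python
import json


def numbered_speakers(prefix: str, count: int) -> list:
    """Return [prefix1, prefix2, ..., prefixN] as placeholder speaker names."""
    return [f"{prefix}{i}" for i in range(1, count + 1)]


if __name__ == "__main__":
    # 109 matches "n_speakers" in the vctk config.
    names = numbered_speakers("vctk", 109)
    # Print a line that can be pasted into the JSON file.
    print('"speakers": ' + json.dumps(names) + ",")
```

The printed line can replace the existing "speakers" entry in the JSON.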

cgnannan commented 1 year ago

Following your suggestion, I added vctk1, vctk2 and vctk3 to config.json. After starting the service, I can see the different id numbers on the page (http://127.0.0.1:23456/voice/speakers).

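Does this mean the three vctk voices have been loaded successfully, so that I only need to change the id in this repository's config.py to pick the male voice?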

Artrajz commented 1 year ago

Yes. In fact, you can specify the id in the request itself, so you don't need to edit config.py to switch speakers.
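As a rough illustration, a request that picks the speaker could be built like this. The /voice path, the POST method and the address come from the log earlier in this thread. The field names `text`, `id` and `format` are taken from the labels in that log, and the form encoding is an assumption, so check both against the API documentation. `build_voice_request` is a made-up helper name, and the sketch only builds the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://127.0.0.1:23456"  # address printed in the startup log


def build_voice_request(text: str, speaker_id: int, fmt: str = "wav") -> Request:
    """Build (but do not send) a POST to /voice that selects a speaker by id."""
    # Field names mirror the labels in the server log; the real API may differ.
    body = urlencode({"text": text, "id": speaker_id, "format": fmt}).encode("utf-8")
    return Request(f"{BASE_URL}/voice", data=body, method="POST")


if __name__ == "__main__":
    request = build_voice_request("Good evening!", speaker_id=1)
    print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

Passing the request to `urllib.request.urlopen` would send it. Only the `speaker_id` argument changes when switching voices.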

cgnannan commented 1 year ago

Got it, thanks.

gzmasterpulse commented 1 year ago

How do I install and configure espeak for a Docker deployment on Linux?

Artrajz commented 1 year ago

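I believe the Docker setup already includes the command that installs espeak-ng. You can confirm this by running espeak-ng --version in the container's terminal. On Linux, installing espeak configures the environment variables automatically, so there is no need to set a DLL path by hand; it can be used directly.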
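The same check can be scripted from Python inside the container, as in the sketch below. `espeak_version` is a made-up helper; it relies only on the standard library and on the `espeak-ng --version` command mentioned above.

```python
import shutil
import subprocess


def espeak_version(binary: str = "espeak-ng"):
    """Return the first line of `<binary> --version`, or None if it is not on PATH."""
    executable = shutil.which(binary)
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "--version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.strip().splitlines()
    return lines[0] if lines else ""


if __name__ == "__main__":
    version = espeak_version()
    print(version if version is not None else "espeak-ng is not on PATH")
```

A None result means the binary is not on PATH in that container.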

Artrajz / vits-simple-api

Could you share your models? #28