k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, speaker recognition, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
https://k2-fsa.github.io/sherpa/onnx/index.html
Apache License 2.0
3.13k stars 364 forks

TTS: error when calling the model #1206

Closed kyn817046 closed 1 month ago

kyn817046 commented 1 month ago

Configuration code:

std::string vits_model = "./vits-melo-tts-zh_en/model.onnx";
std::string vits_lexicon = "./vits-melo-tts-zh_en/lexicon.txt";
std::string vits_tokens = "./vits-melo-tts-zh_en/tokens.txt";
std::string tts_rule_fsts = "./vits-melo-tts-zh_en/date.fst,./vits-melo-tts-zh_en/number.fst";
std::string vits_dict_dir = "./vits-melo-tts-zh_en/dict";

bool is_ok = true;
if (!Exists(vits_model))
{
    std::string msg = vits_model + " does not exist!";
    std::cout << msg << std::endl;
    is_ok = false;
}
if (!Exists(vits_lexicon))
{
    std::string msg = vits_lexicon + " does not exist!";
    std::cout << msg << std::endl;
    is_ok = false;
}
if (!Exists(vits_tokens))
{
    std::string msg = vits_tokens + " does not exist!";
    std::cout << msg << std::endl;
    is_ok = false;
}

SherpaOnnxOfflineTtsConfig config;

config.model.debug = 1;
config.model.num_threads = 1;
config.model.provider = "cpu";
config.model.vits.lexicon = vits_lexicon.c_str();
config.model.vits.model = vits_model.c_str();
config.model.vits.tokens = vits_tokens.c_str();
config.model.vits.noise_scale = 0.667f;
config.model.vits.noise_scale_w = 0.8f;
config.model.vits.length_scale = 1.0f;
config.max_num_sentences = 1;
//config.rule_fars = "./rule.far";
config.rule_fsts = tts_rule_fsts.c_str();
//config.model.vits.data_dir = "";
config.model.vits.dict_dir = vits_dict_dir.c_str();
if(is_ok)
    tts = SherpaOnnxCreateOfflineTts(&config);

Error:

D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx\c-api\c-api.cc:SherpaOnnxCreateOfflineTts:940 OfflineTtsConfig(model=OfflineTtsModelConfig(vits=OfflineTtsVitsModelConfig(model="./vits-melo-tts-zh_en/model.onnx", lexicon="./vits-melo-tts-zh_en/lexicon.txt", tokens="./vits-melo-tts-zh_en/tokens.txt", data_dir="HH̉HhD$Pt=", dict_dir="./vits-melo-tts-zh_en/dict", noise_scale=0.667, noise_scale_w=0.8, length_scale=1), num_threads=1, debug=True, provider="cpu"), rule_fsts="./vits-melo-tts-zh_en/date.fst,./vits-melo-tts-zh_en/number.fst", rule_fars="HXH\WH0HD$HHD$ H4", max_num_sentences=1)

D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx\csrc\offline-tts.cc:Validate:57 Rule far 'HXH\WH0HD$HHD$ H4' does not exist.
D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx\c-api\c-api.cc:SherpaOnnxCreateOfflineTts:944 Errors in config
Failed to initialize TTS engine.

The model files do not include a rule.far file, so why is this error reported?
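
The `Exists` helper called in the snippet above is not shown in the post. A minimal sketch of what it might look like, assuming C++17 `std::filesystem` is available. `SplitByComma` and `MissingPaths` are hypothetical names introduced here so that each entry of the comma-separated `rule_fsts` value and the `dict_dir` directory can be checked as well, which the posted code does not do.

```cpp
#include <cassert>
#include <filesystem>
#include <sstream>
#include <string>
#include <system_error>
#include <vector>

// Hypothetical helper: true if `path` names an existing file or directory.
bool Exists(const std::string &path) {
  std::error_code ec;  // use the non-throwing overload
  return std::filesystem::exists(path, ec);
}

// Split a comma-separated list such as "a.fst,b.fst" into its entries.
// Empty entries are skipped.
std::vector<std::string> SplitByComma(const std::string &list) {
  std::vector<std::string> parts;
  std::stringstream ss(list);
  std::string item;
  while (std::getline(ss, item, ',')) {
    if (!item.empty()) parts.push_back(item);
  }
  return parts;
}

// Return every path in `paths` that does not exist on disk.
std::vector<std::string> MissingPaths(const std::vector<std::string> &paths) {
  std::vector<std::string> missing;
  for (const auto &p : paths) {
    if (!Exists(p)) missing.push_back(p);
  }
  return missing;
}
```

With these, the three repeated `if (!Exists(...))` blocks could be collapsed into one loop over `MissingPaths(...)`, with the entries of `SplitByComma(tts_rule_fsts)` appended to the list of paths to check.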

csukuangfj commented 1 month ago
SherpaOnnxOfflineTtsConfig config;
memset(&config, 0, sizeof(config));

Please add memset to zero initialize the struct.

Please see also https://github.com/k2-fsa/sherpa-onnx/blob/35c1b4a7a9376c6bb80ac461e1b2169be563f908/c-api-examples/offline-tts-c-api.c#L145
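
Why this helps: a local struct declared without an initializer has indeterminate contents, so any field the code never sets (here `rule_fars` and `model.vits.data_dir`, which show up as garbage in the log above) holds whatever bytes happened to be on the stack. The following is a minimal sketch of two common ways to zero such a struct from C++. `DemoTtsConfig` is a hypothetical stand-in used only for illustration; it is not the real `SherpaOnnxOfflineTtsConfig` layout.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a C API config struct. NOT the real
// SherpaOnnxOfflineTtsConfig layout; it only illustrates the idea.
struct DemoTtsConfig {
  const char *model;
  const char *rule_fsts;
  const char *rule_fars;
  int32_t max_num_sentences;
};

// Option 1: memset, as suggested above. Works in both C and C++.
// All-bits-zero yields null pointers on mainstream platforms.
DemoTtsConfig MakeWithMemset() {
  DemoTtsConfig config;
  std::memset(&config, 0, sizeof(config));
  return config;
}

// Option 2: C++ value-initialization. For a plain struct like this one,
// `{}` zero-initializes every member.
DemoTtsConfig MakeWithValueInit() {
  DemoTtsConfig config{};
  return config;
}

// Set only the fields that are needed; everything else stays null/zero,
// so a validator can treat "not provided" and "empty" the same way.
DemoTtsConfig MakeDemoConfig(const char *model_path) {
  DemoTtsConfig config = MakeWithValueInit();
  config.model = model_path;
  config.max_num_sentences = 1;
  return config;
}
```

Either option leaves unset `const char *` fields as null pointers rather than dangling ones, which is presumably what lets the library treat them as empty strings.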

kyn817046 commented 1 month ago

Thank you very much! Is there any documentation of the TTS parameters? The documentation I found says relatively little about them; am I looking in the wrong place?

csukuangfj commented 1 month ago

If you use the C++ binary, e.g.,

./build/bin/sherpa-onnx-offline-tts --help

you would see the help message of each member variable of the struct.

kyn817046 commented 1 month ago

log:

D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx\c-api\c-api.cc:SherpaOnnxCreateOfflineTts:940 OfflineTtsConfig(model=OfflineTtsModelConfig(vits=OfflineTtsVitsModelConfig(model="./vits-melo-tts-zh_en/model.onnx", lexicon="./vits-melo-tts-zh_en/lexicon.txt", tokens="./vits-melo-tts-zh_en/tokens.txt", data_dir="", dict_dir="./vits-melo-tts-zh_en/dict", noise_scale=0.667, noise_scale_w=0.8, length_scale=1), num_threads=1, debug=True, provider="cpu"), rule_fsts="./vits-melo-tts-zh_en/date.fst,./vits-melo-tts-zh_en/number.fst", rule_fars="", max_num_sentences=2)

D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx\csrc\offline-tts-vits-model.cc:Init:79 ---vits model---
version=2
model_type=melo-vits
jieba=1
url=https://github.com/myshell-ai/MeloTTS
comment=melo
language=Chinese + English
add_blank=1
n_speakers=1
sample_rate=44100
bert_dim=1024
ja_bert_dim=768
speaker_id=1
lang_id=3
tone_start=0
license=MIT license
description=MeloTTS is a high-quality multi-lingual text-to-speech library by MyShell.ai
----------input names----------
0 x
1 x_lengths
2 tones
3 sid
4 noise_scale
5 length_scale
6 noise_scale_w
----------output names----------
0 y

D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx/csrc/offline-tts-vits-impl.h:OfflineTtsVitsImpl:48 rule fst: ./vits-melo-tts-zh_en/date.fst
D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx/csrc/offline-tts-vits-impl.h:OfflineTtsVitsImpl:48 rule fst: ./vits-melo-tts-zh_en/number.fst
D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx/csrc/offline-tts-vits-impl.h:Generate:165 Raw text: hello world,你好世界!This is a test of the speech synthesis system.
D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx/csrc/offline-tts-vits-impl.h:Generate:172 After normalizing: hello world,你好世界!This is a test of the speech synthesis system.
D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx/csrc/offline-tts-vits-impl.h:Generate:172 After normalizing: hello world,你好世界!This is a test of the speech synthesis system.
D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx\csrc\jieba-lexicon.cc:ConvertTextToTokenIds:77 input text: hello world,你好世界!This is a test of the speech synthesis system.
D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx\csrc\jieba-lexicon.cc:ConvertTextToTokenIds:78 after replacing punctuations: hello world,你好世界!This is a test of the speech synthesis system。
D:\a\sherpa-onnx\sherpa-onnx\sherpa-onnx\csrc\jieba-lexicon.cc:ConvertTextToTokenIds:87 after jieba processing: hello_ _world_,_你好_世界_!_This_ _is_ _a_ _test_ _of_ _the_ _speech_ _synthesis_ _system_.

code:

tts = SherpaOnnxCreateOfflineTts(&config);

if (tts == nullptr) {
    std::cerr << "Failed to initialize TTS engine." << std::endl;
    return;
}

const std::string text = "hello world,你好世界!This is a test of the speech synthesis system.";
int32_t sid = 0;
const char* filename = _strdup("./x64/Release/generated.wav");

const SherpaOnnxGeneratedAudio* audio = SherpaOnnxOfflineTtsGenerate(tts, text.c_str(), sid, 1.0);

I followed the sample code, and running it produces the log shown above and then fails. Do you have time to take a look?

csukuangfj commented 1 month ago

Please update your sherpa-onnx to the latest master, re-build sherpa-onnx, and re-try.

csukuangfj commented 1 month ago

@kyn817046

Does it work for you now?

kyn817046 commented 1 month ago

I downloaded and compiled the latest version of the project, and it still reports this error.

csukuangfj commented 1 month ago

Please show

git log

to verify the version of your sherpa-onnx and please tell us how you build sherpa-onnx and post the build logs.

kyn817046 commented 1 month ago

Thank you very much! The latest version can run.

csukuangfj commented 1 month ago

> Thank you very much! The latest version can run.

Great!