k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, speaker recognition, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
https://k2-fsa.github.io/sherpa/onnx/index.html
Apache License 2.0

Multi-language support issue with Sherpa ONNX conversion (on android device) #995

Open Muzaffar-x opened 3 months ago

Muzaffar-x commented 3 months ago

I have a model based on Piper, which I converted using Sherpa ONNX. During the conversion, I specified the language as "uz" (Uzbek). When running the ONNX model through an Android APK, the espeak-ng language model is used. Within espeak, it is possible to specify words that can inherit rules from other languages, such as "en" (English). This works correctly when tested directly with espeak-ng, but when using the model in the application, it only recognizes the "uz" rules specified during the Sherpa ONNX conversion.

It seems that Sherpa ONNX is not utilizing multi-language support as expected. Is it possible to configure Sherpa ONNX to use multiple languages, or is there a limitation that needs to be addressed?

Steps to Reproduce:

1. Convert a model using Sherpa ONNX with the language set to "uz".

2. Use the ONNX model in an Android APK with espeak-ng.

3. Try to use words that inherit rules from other languages (e.g., "en").

4. Observe that only "uz" rules are applied.

Expected Behavior: The model should support multiple languages and apply the appropriate rules from specified languages in espeak-ng.

Actual Behavior: The model only recognizes rules from the language specified during conversion ("uz") and does not apply inherited rules from other languages.

Environment:

Additional Context: Any guidance on how to enable multi-language support in Sherpa ONNX or any workarounds would be greatly appreciated. Thank you!

csukuangfj commented 3 months ago

Could you point us to the doc or code about using the multilingual function of espeak-ng?

vodiylik commented 3 months ago

Hi @csukuangfj

Here is the documentation for generating phonemes from text using the rulesets of different languages.

csukuangfj commented 3 months ago

Is there any API, i.e., function that I can use in code, to do that?

vodiylik commented 3 months ago

I looked through the source code, but I couldn't find an API that exposes the multilingual behavior. When the output is generated with the espeak-ng --ipa option, it contains language markers such as (en), (fr), ... Is it possible to process the text based on these markers?
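To make the idea concrete, here is a minimal Python sketch of what processing the phoneme output based on those language markers could look like. It is not part of sherpa-onnx or espeak-ng: the function names `split_by_language` and `strip_language_markers` are hypothetical, the marker pattern (a lowercase language code in parentheses, optionally with `-` suffixes) is an assumption about the output format, and the sample string is made up for illustration rather than taken from a real espeak-ng run.

```python
import re

# Assumed marker format: a lowercase language code in parentheses,
# e.g. "(en)", "(fr)" or "(en-us)". This is an assumption, not a spec.
LANG_MARKER = re.compile(r"\(([a-z]{2,3}(?:-[a-z0-9]+)*)\)")


def split_by_language(phonemes, default_lang):
    """Split a phoneme string into (language, segment) pairs.

    Text before the first marker is attributed to default_lang; each
    marker switches the language for the text that follows it.
    """
    segments = []
    current_lang = default_lang
    pos = 0
    for match in LANG_MARKER.finditer(phonemes):
        chunk = phonemes[pos:match.start()].strip()
        if chunk:
            segments.append((current_lang, chunk))
        current_lang = match.group(1)
        pos = match.end()
    tail = phonemes[pos:].strip()
    if tail:
        segments.append((current_lang, tail))
    return segments


def strip_language_markers(phonemes):
    """Return the phoneme string with all language markers removed."""
    return " ".join(seg for _, seg in split_by_language(phonemes, ""))


if __name__ == "__main__":
    # Hypothetical sample, not real espeak-ng output.
    sample = "salom (en)həlˈəʊ(uz) dunyo"
    print(split_by_language(sample, "uz"))
    print(strip_language_markers(sample))
```

With segments tagged this way, a caller could either drop the markers before mapping phonemes to token IDs or handle each segment according to its language. Whether the phonemization path used by sherpa-onnx on Android emits such markers at all would still need to be checked.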

csukuangfj commented 3 months ago

Is there an API to do that? If yes, I think it is possible to change the code in sherpa-onnx to support that.

paolo-caroni commented 1 week ago

I think the problem is that sherpa-onnx doesn't have multilingual support on Android, as pointed out in #569. The Android TTS API supports multiple languages and also supports selecting a specific language, so one app can choose a language and talk with different voices (male/female, Italian/English/Chinese/Russian/put a language here). @Muzaffar-x maybe try the experimental multilanguage support by Jing332: https://github.com/jing332/SherpaOnnxTtsEngineAndroid