k2-fsa / sherpa-onnx

Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter, Object Pascal, Lazarus, Rust
https://k2-fsa.github.io/sherpa/onnx/index.html
Apache License 2.0
3.71k stars 430 forks

Android TTS: Some ideas for packaging tts engine and voices/models #852

Open domasofan opened 6 months ago

domasofan commented 6 months ago

Hi all,

I just tested the TTS engine on Android 14 on a phone with a 64-bit processor. I also tested it with the TalkBack screen reader. It works very well and is pretty responsive for a neural TTS engine.

I have some ideas for packaging the voices and the TTS engine. It might be a good idea to package the TTS engine without the voices but include the espeak-ng-data directory in it, so that the directory doesn't need to be installed again with every voice. I would also remove the TTS app from the voice packages and keep it only in the TTS engine. In addition, I would remove all other files from each voice package and keep only the tokens file, the ONNX file, and any other files that are needed for that individual voice.

Can multiple voices be installed? If so, it might be interesting to add a voice list to the GUI of the TTS engine app for setting the currently used/default voice. Additionally, the display name of each voice package could be renamed after its model, so that when uninstalling it is easier to remove the right voices and keep the wanted ones.
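To make the voice list idea a bit more concrete, here is a rough sketch in Python (not the app's actual code) of how an engine could discover installed voices. Everything in it is an assumption for illustration: the layout with one subdirectory per voice, the file name `tokens.txt`, and the helpers `Voice`, `list_voices`, and `pick_default` are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Voice:
    name: str     # subdirectory name, doubles as the display name
    model: Path   # the voice's .onnx file
    tokens: Path  # the voice's tokens file


def list_voices(voices_root: Path) -> list[Voice]:
    """Return one Voice per subdirectory that has a model and a tokens file.

    Assumed (hypothetical) layout:
        voices_root/<voice-name>/<something>.onnx
        voices_root/<voice-name>/tokens.txt
    """
    voices: list[Voice] = []
    if not voices_root.is_dir():
        return voices
    for voice_dir in sorted(p for p in voices_root.iterdir() if p.is_dir()):
        models = sorted(voice_dir.glob("*.onnx"))
        tokens = voice_dir / "tokens.txt"
        # Skip incomplete voice directories instead of failing.
        if models and tokens.is_file():
            voices.append(Voice(voice_dir.name, models[0], tokens))
    return voices


def pick_default(voices: list[Voice], preferred: str | None = None) -> Voice | None:
    """Return the preferred voice if installed, else the first one, else None."""
    for voice in voices:
        if voice.name == preferred:
            return voice
    return voices[0] if voices else None
```

The GUI would then show `[v.name for v in list_voices(root)]` and store the selected name as the default, falling back to the first installed voice if the stored one has been removed.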

What do you think about that?

If you want, I can create separate issues for the individual ideas later. This issue could serve for a bit of brainstorming.

Greetings, Simon

paolo-caroni commented 6 months ago

Before opening a new issue, please search for similar issues. Read #569.

domasofan commented 6 months ago

Hi @paolo-caroni,

👍 Sorry, I hadn't seen that one.

Greetings, Simon