Open mush42 opened 7 months ago
I'd like to add that the tashkeel model shipped with piper-phonemize is not good at all (although I helped implement it). The library I'm developing works better because it has been trained on a large amount of modern Arabic data.
Hi
Thanks for your awesome work!
I tried the Arabic TTS voice (Kareem), and I noticed that an important text preprocessing step is missing.
Arabic text is usually written unvocalized (i.e. without diacritics). For intelligibility, the text must be vocalized (diacritized) before phonemization. A lightweight neural network is usually used for vocalization. This important preprocessing step is missing from sherpa-onnx.
Piper's Arabic voice was trained on vocalized text. I say this because I prepared and audited the data used to train that voice.
Fortunately, I'm developing a package for Arabic-text vocalization named Libtashkeel.
It is written in Rust, has a C API, is designed to be cross-platform, and the model is embedded in the library itself. Here's the library running in the browser via WASM.
The library has a single entry point function that takes a string and outputs a string.
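To make the proposed step concrete, here is a rough sketch of where a string-to-string vocalizer would sit in the pipeline. It is only an illustration: `prepare_for_tts`, `has_tashkeel`, and the `vocalize` / `phonemize` callables are hypothetical names, not the actual libtashkeel or sherpa-onnx API. Skipping vocalization when the input already carries diacritics is also an assumption (a real integration may prefer to always vocalize, since partially diacritized text is common).

```python
from typing import Callable, List

# Arabic tashkeel marks: U+064B (FATHATAN) through U+0652 (SUKUN).
TASHKEEL = {chr(cp) for cp in range(0x064B, 0x0653)}


def has_tashkeel(text: str) -> bool:
    """Return True if the text contains at least one tashkeel mark."""
    return any(ch in TASHKEEL for ch in text)


def prepare_for_tts(
    text: str,
    vocalize: Callable[[str], str],          # hypothetical: string in, string out
    phonemize: Callable[[str], List[str]],   # hypothetical phonemizer
) -> List[str]:
    """Vocalize undiacritized Arabic text, then hand it to the phonemizer."""
    if not has_tashkeel(text):
        # Assumption: only vocalize input that has no diacritics at all.
        text = vocalize(text)
    return phonemize(text)
```

With this shape, the only thing the TTS side needs from the vocalizer is a function that takes a string and returns a string, which is why a single entry point should be enough for integration.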
I can't contribute a PR since I'm not familiar with C++, but I can help integrate libtashkeel from the Rust side by any means necessary.
Best,
Musharraf