Open GithubAnon0000 opened 2 months ago
I'll have to learn more about how the model was trained (and how piper uses the model), since I came to the conclusion that the model itself is somehow doing that.
It happens with normal words and sentences too. The same sentence is not pronounced the same way, even though espeak's dictionaries are quite deterministic. Running piper with --debug actually shows the phonemes (just like espeak-ng --ipa). They are identical.
Judging by that, the model probably has some sort of variance for some reason. I'll have to learn more about it first, but I believe the way the model was trained has something to do with it, since AI tends to do things like that (and you trained it using coqui). Maybe it's more or less easily fixable (since I'd prefer deterministic output if possible). It's low priority for me though.
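One way to confirm that the audio itself varies between runs, even though the phonemes are identical, is to synthesize the same sentence twice and compare the audio data. The following is only a minimal sketch: the helper names wav_digest and same_audio and the file names run1.wav and run2.wav are hypothetical, and it assumes both files were written by piper with --output-file.

```python
import hashlib
import wave


def wav_digest(path):
    """Return a SHA-256 hex digest of the raw audio frames in a WAV file.

    Only the sample data is hashed, so differences in the WAV header
    alone do not count as a difference in the audio.
    """
    with wave.open(path, "rb") as wav:
        frames = wav.readframes(wav.getnframes())
    return hashlib.sha256(frames).hexdigest()


def same_audio(path_a, path_b):
    """True if both WAV files contain byte-identical sample data."""
    return wav_digest(path_a) == wav_digest(path_b)
```

Calling same_audio("run1.wav", "run2.wav") on two runs of the same sentence would return False if the model output is not deterministic.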
First of all thank you for your great and detailed description 👍.
One idea might be to clean the text before TTS processing using e.g. https://github.com/repodiac/german_transliterate . Is this $dot at the end of the adjusted dictionary required? Maybe that's a reason for the break, which is meant to be after a dot character.
I tried your sentence on my Hugging Face spaces.
Piper space: "Ich habe Fragen bezüglich. Ihrer Rückmeldung." has a break after "bezüglich.".
"Ich habe Fragen bezüglich. Ihrer Rückmeldung." sounds good, without a break, like the espeak speech flow.
My trained Coqui models have that break too (as expected) when a dot is added after "bezüglich".
So I'm not sure if that $dot at the end of the adjusted dictionary has something to do with that.
My trained Coqui models have (as expected that break) too when a dot after bezüglich. is added.
Yes, but they shouldn't. At least not if you use the actual abbreviation as outlined in the "Steps to reproduce" part. → Not "…bezüglich. …", but "… bzgl. …".
Is this $dot at the end of the adjusted dictionary required? Maybe that's a reason for the break, which is meant to be after a dot character.
The $dot basically says that the word "bzgl." has a dot but isn't supposed to be spoken with a break after that dot. It works fine with espeak, but not with piper and your model. I'm now guessing that the model never learned about abbreviations during training and thus always assumes it should read a break after a dot (which, in the case of "bzgl.", it shouldn't).
One idea might be to clean the text before tts processing
Yes, that's what I'm currently doing (although with my own bash script). It works, since all I have to do is change abbreviations like "bzgl.", "z. B." etc. to their long form ("bezüglich", "zum Beispiel"). Since this works, this issue is low priority for me, as stated above. But if I could adjust the model or dictionary files somehow so that preprocessing becomes redundant, this would be great.
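For readers who want the same preprocessing without a shell script, here is a minimal sketch of the idea in Python. It is not the bash script mentioned above: the table ABBREVIATIONS and the function expand_abbreviations are hypothetical names, and the table only covers the examples from this thread.

```python
import re

# Hypothetical abbreviation table; extend it with whatever your texts contain.
ABBREVIATIONS = {
    "bzgl.": "bezüglich",
    "z. B.": "zum Beispiel",
    "z.B.": "zum Beispiel",
}


def _build_pattern(abbreviations):
    # Try longer keys first so overlapping keys resolve to the longest match.
    keys = sorted(abbreviations, key=len, reverse=True)
    alternatives = "|".join(re.escape(key) for key in keys)
    # (?<!\w) avoids matching an abbreviation inside a longer word.
    return re.compile(r"(?<!\w)(?:" + alternatives + r")")


_PATTERN = _build_pattern(ABBREVIATIONS)


def expand_abbreviations(text):
    """Replace each known abbreviation (including its dot) with its long form."""
    return _PATTERN.sub(lambda match: ABBREVIATIONS[match.group(0)], text)
```

The expanded text can then be passed to piper in place of the original. Note that the sketch is case-sensitive and that an abbreviation at the very end of a sentence loses the sentence's final dot, so both cases would need extra handling.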
Thanks for your time and looking into it!
Hello again!
I am using piper with the thorsten (high) voice. I wanted to see if it's possible to pronounce "bzgl." correctly without having to use a separate string that says "bezüglich". But with your voice there is always a long pause after it, whereas with espeak it works fine.
Maybe you've got an idea?
Steps to reproduce
Add this entry to the German espeak dictionary source:
bzgl b@ts'y:klIC $dot
Compile the dictionary and copy it to the location piper uses:
sudo espeak-ng --compile=de && cp /usr/lib/x86_64-linux-gnu/espeak-ng-data/de_dict ../TTS/espeak-ng-data/de_dict
Run
echo "Ich habe Fragen bzgl. Ihrer Rückmeldung." | ./piper --model ./de_DE-thorsten-high.onnx --output-file ../OUTPUT/text.wav
for the audio generated with your voice model.
Run
espeak-ng "Ich habe Fragen bzgl. Ihrer Rückmeldung." -v German --stdout > ../OUTPUT/text_espeak.wav
to generate the same audio with espeak.
The voice obviously is different, but so is the pronunciation. A workaround is to just use "bezüglich" instead of "bzgl.".
Expected Behavior
The pause after "bzgl." shouldn't be there.
Actual behavior
The pause is there.
Other things tried
According to espeak dictionary docs I tried the following alternatives one by one:
None were successful with the thorsten voice, though. Adding a dot after bzgl made it worse, even in espeak:
Version info
piper: 1.2.0
OS: Debian oldstable (gnome 3.38.5, X11)
python: 3.9.2