muflone / gespeaker

A text to speech GTK+ front-end for eSpeak and mbrola to play a text in many languages with settings for voice, pitch, volume and speed
https://www.muflone.com/gespeaker/

What commands is Gespeaker using? Espeak with mbrola cannot produce the same output. #65

Closed (cptX closed this issue 3 years ago)

cptX commented 6 years ago

Hi, I discovered Gespeaker today and I really like it! Combined with the mbrola engine, the sound is much better than espeak with its own voices. Now I'm trying to write a text-to-speech script that plays the current selection (using xsel). The problem is that when I run the command espeak -v mb-gr2 "Some greek text here" -p 99 -s 200 I get a completely different sound than Gespeaker. The mbrola voice/engine is certainly working, because it sounds different from running espeak -v el, but for some reason the Gespeaker sound is more native and a bit higher in pitch. Also, the -p parameter gives a different result than the same value in Gespeaker! Why is Gespeaker able to do a better job with the same mb-gr2 voice than the simple espeak command? What commands can I use to reproduce the same sound as Gespeaker?

When running gespeaker from the command line I get the following debug messages:
['/usr/bin/espeak', '-a', '100', '-p', '50', '-s', '200', '-g', '1', '-v', 'mb-gr2', '-f', '/tmp/gespeakersdNc68', '--pho']
['/usr/bin/mbrola', '-v', '1.0', '-e', '/usr/share/mbrola/gr2/gr2', '-', '/tmp/gespeaker.wav']
['aplay']

which shows me that aplay is also involved. I tried to combine espeak with aplay but still cannot get the result of gespeaker.

muflone commented 6 years ago

Hi

the commands used by Gespeaker are those you can see from the command line output. Basically in your example:

/usr/bin/espeak -a 100 -p 50 -s 200 -g 1 -v mb-gr2 -f /tmp/gespeakersdNc68 --pho | /usr/bin/mbrola -v 1.0 -e /usr/share/mbrola/gr2/gr2 - /tmp/gespeaker.wav

Finally, you can play the gespeaker.wav file using aplay, paplay or whatever you prefer.
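As a rough sketch, the two steps could be wrapped in a small shell function like the one below. The function name speak_file and the default output path are only illustrative assumptions; the flags and the gr2 database path are the ones from the debug output, so adjust them for your system.

```shell
#!/bin/sh
# Sketch of the espeak -> mbrola -> player chain shown in the debug output.
# speak_file is a hypothetical helper name, not something Gespeaker provides.

# speak_file TEXTFILE [WAVFILE]
#   1. espeak turns the text into phonemes (--pho) for the mb-gr2 voice
#   2. mbrola reads the phonemes from stdin (-) and writes a wav file
#   3. aplay plays the wav file if the first two steps succeeded
speak_file() {
    textfile=$1
    wavfile=${2:-/tmp/gespeaker.wav}
    espeak -a 100 -p 50 -s 200 -g 1 -v mb-gr2 -f "$textfile" --pho |
        mbrola -v 1.0 -e /usr/share/mbrola/gr2/gr2 - "$wavfile" &&
        aplay "$wavfile"
}
```

For example, speak_file /tmp/mytext.txt would speak the contents of that file. Because the result is written as a .wav file, the player can read the sample rate from the file itself.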

cptX commented 6 years ago

Hi Muflone, thanks for your quick answer. After some trial and error I managed to produce exactly the same output as gespeaker. The critical part was aplay's -r22000 parameter, since other values produced a different voice pitch...

xsel | /usr/bin/espeak -a 100 -p 50 -s 150 -g 2 -v mb-gr2 --pho | /usr/bin/mbrola -v 1.0 -e /usr/share/mbrola/gr2/gr2 - - | aplay -r22000 -fS16
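For use in a script, the one-liner above could be wrapped roughly as follows. This is only a sketch: the function name speak_selection and the RATE variable are made up for illustration, and 22000 is simply the value that matched Gespeaker by ear here. With "- -" mbrola appears to write headerless samples to stdout, so aplay has to be told the rate and format explicitly, and playing the same samples at a different rate shifts the pitch.

```shell
#!/bin/sh
# Speak the current X selection through espeak -> mbrola -> aplay.
# RATE is a hypothetical knob; 22000 is the value found by trial and error
# in this thread, not a documented property of the gr2 voice.
RATE=${RATE:-22000}

# speak_selection: no arguments, reads the text from xsel.
speak_selection() {
    xsel |
        espeak -a 100 -p 50 -s 150 -g 2 -v mb-gr2 --pho |
        mbrola -v 1.0 -e /usr/share/mbrola/gr2/gr2 - - |
        aplay -r"$RATE" -fS16
}
```

Binding speak_selection to a keyboard shortcut would then read whatever text is currently selected.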

cptX commented 6 years ago

And now can you please explain the following:

espeak accepts mbrola voices and can speak itself. Why do I need this sequence of commands to produce this sound instead of just using espeak with mbrola voice directly?

For example when using the following command I get a very low pitch voice: xsel | espeak -a 100 -p 50 -s 150 -g 2 -v mb-gr2

Even if I set the -p parameter to 99 (max) I still get a lower pitch than by using aplay or gespeaker. Is the key here the rate of aplay? Is there any way to make espeak use a different output sample rate?