muflone / gespeaker

A text to speech GTK+ front-end for eSpeak and mbrola to play a text in many languages with settings for voice, pitch, volume and speed
https://www.muflone.com/gespeaker/

Playback cannot be stopped using mbrola voices #35

Closed muflone closed 3 years ago

muflone commented 11 years ago

While stopping the playback works well with an eSpeak voice, you cannot stop the playback when using an mbrola voice – the Gespeaker application just »freezes« (and even turns »dim« if the text is longer) until the text has been spoken completely.

What steps will reproduce the problem?

  1. Paste a text of at least 30 words into the text area.
  2. Choose an mbrola voice, e.g. »german-mbrola-6«.
  3. Click on »Play« and try to cancel playback while it speaks by clicking on »Stop« – you will notice that the »Play« button doesn't even change its caption to »Stop«.

What is the expected output? What do I see instead?

I'd have expected the playback to be stoppable – just like it is stoppable at all times when using an eSpeak voice, e.g. »german«. But instead, the Gespeaker application freezes (hangs) and turns dim after a while, still speaking.

What versions do I use? On what operating system?

Thanks in advance for checking this out, fixing this annoying bug, pointing to the problem or providing a workaround. Please let me know, if you need more information.

Regards,

Dennis

muflone commented 11 years ago

Hi!

Yes, I can confirm this, using Ubuntu 10.10 on a 32-bit installation (AMD quad-core machine). I am using gespeaker-0.8.1 from this site, with the same espeak and mbrola versions as Dennis.

so long hank

muflone commented 11 years ago

Thanks hank for the confirmation,

as you are using the 0.8.1 version of Gespeaker from this site and having the same problem, I assume the developers/bug maintainers can now change the status of this bug from »New« to »Confirmed« or something similar.

Regards,

Dennis

muflone commented 11 years ago

I'm still using Ubuntu Karmic 9.10 and have the same problem. It used to work fine until I recently built and updated espeak to version 1.44.05. The newer version of espeak has native access to mbrola; it used to be necessary to pipe espeak to mbrola.

I looked at the Python code for Gespeaker a bit and I'm pretty sure the problem lies in this area. I don't understand Python well enough to make the changes, though.

It looks like mbrola support was added to espeak in version 1.44.01. I suspect Gespeaker needs to check the espeak version and change its program flow accordingly. Or you might be able to go back to an older version of espeak if it bothers you a lot...
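The version check suggested above could look roughly like the sketch below. It is only an illustration, not Gespeaker code: the function names are hypothetical, the 1.44.01 threshold comes from the changelog quoted below, and the sample version string is an assumed format for what espeak prints.

```python
import re


def parse_version(text):
    """Return the first dotted version number found in text as a tuple of ints."""
    match = re.search(r"(\d+(?:\.\d+)+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.group(1).split("."))


def calls_mbrola_directly(version_text, threshold=(1, 44, 1)):
    """Guess whether this espeak calls mbrola itself (hypothetical helper).

    The default threshold is an assumption taken from the changelog entry
    for espeak 1.44.01; adjust it if an earlier build already behaves
    the new way.
    """
    return parse_version(version_text) >= threshold


# Assumed sample of a version banner; the real output may differ.
sample = "eSpeak text-to-speech: 1.44.05"
new_style = calls_mbrola_directly(sample)
```

A front-end could use the result to decide whether to pipe phoneme output into mbrola itself or to let espeak handle it.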

Here are the pertinent parts from the espeak changelog:

espeak 1.44.05

Fix error in big-endian data conversion program, producing bad data.

Make geminated voiced stops (eg. [bb] ) longer at fast speeds.

Provide conditional compilation of the mbrola interface, define macro INCLUDE_MBROLA in speech.h

Mbrola: also look for mbrola voices in /usr/share/mbrola/voices

Pad TUNES and frame_t structures to a multiple of 4 bytes.

lang=da, Don't weaken unvoiced stops before pause. lang=el, Remove final unstressed [a] if the next word starts with [a]. lang=pt, Change final [U] to [w] if next word starts with a vowel.

espeak 1.44.03

Fixes: Lang=el, mk. Was speaking words as individual letters. Lang=pl. Fix pronunciation of 'ć' and 'ci'. Fix crash in big-endian data conversion program. Fix problem where changing voices reduces the speaking rate, at fast rates.

speak_lib.h: add macro definitions for minimum, maximum, and normal speaking rate values.

espeak 1.44.01

Fix crash with very long numbers. Speak very long numbers as individual digits.

Unpronounceable word check: Rules for unpronounceable initial letter sequences can now be defined in *_rules files.

The unpronounceable word check now stops when an apostrophe is found.

Phoneme definitions: Optional second parameter to FMT() statement specifies a percentage amplitude.

Added "ipa" statement to specify the IPA name for a phoneme if the default translation is not correct.

Add phoneme "equivalents" tables, so that words can be spoken with foreign (eg English) pronunciation rules, but using local phonemes.

New attributes: flag1, flag2, flag3

New attribute: nopause. Prevents the insertion of a short pause when this phoneme starts a word which follows a vowel.

New conditions: isFlag1, isFlag2, isFlag3, isSibilant.

New statement: InsertPhoneme()

Phonemes: improve syllabic [m-] [n-] [N-]

Mbrola: Command-line espeak and the libespeak library now call the mbrola program directly, rather than producing phoneme text which must be piped into mbrola.

Added --pho command-line option to generate mbrola phoneme information (.pho data).

Phoneme output: Add --ipa command-line option to produce phonemes names using the International Phonetic Alphabet.

Indicate language changes during phoneme output with: (en) (fr) etc.

-X command-line option: Show the matching of multiple-word entries in *_list files.

Speak sequences of letters and dots as individual letters and don't speak 'dot' (eg. "u.s.a").

Don't speak punctuation characters inside

Don't speak "dot" if an ellipsis is followed by a dot.

Vowelcharts: Show the positions for multiple FMT() statements in a vowel phoneme definition.

*_rules: add attributes $p_alt $p_alt2 $p_alt3, $w_alt $w_alt2 $w_alt3

*_list: add attributes: $sentence, $atstart

klatt synthesizer: implement echo (defined in voice files).

espeakedit: Prosody display: Show stressed and secondary-stress syllables.

Remember window size and position.

Change the frame-length field from Spin Control to Text Control to allow better access from screen-readers.

Intonation: New file, 'phsource/intonation' to define 'tunes' which can be used from voice files.

espeakedit: add Compile -> Intonation data

Intonation: change the internal pitch unit to give finer control, and align with the values displayed in the espeakedit Prosody window.

Speed: Increase range to 80 to 450, with default=175. Improve speaking at high speeds.

Language options: add an option to the Regressive Voicing option to de-voice the final consonant of words.

lang=ta, hi. Letter-names for combining vowel characters are distinguished from stand-alone vowel characters by adding an initial click sound.

lang=en: Reduce consecutive unstressed syllables to 'diminished' stress, only in unstressed words. lang=de: Change 'r' phoneme. lang=es: Improve the rules for reducing 'b', 'd', 'g' to approximants [B] [D] [Q]. Language improvements include: Danish, Dutch.

espeak 1.43.03 (bug fixes)

Fix crash when embedded control codes are followed by numbers of 5 or more digits. Fix lang=hu, First character of an abbreviation is missed after an ordinal number (eg."2. cd") Fix XML tag not recognized after "..." when announce punctuation is enabled. Fix lang=zh-yue, 'p' 't' 'k' after a vowel give a long pause. Fix lang=ru, "o" missing in unstressed syllables.

espeak 1.43.02

Language improvements including Danish. Fix: " 50000" with leading spaces was spoken as "50". Don't consider multiple spaces as a thousands separator (eg. "2 000"). Fixed phoneme [n^] for klatt synthesizer. Lang=Hungarian, don't allow dot as thousands separator.

espeak 1.43...

muflone commented 11 years ago

Could you please provide a sample text which produces the defect?

muflone commented 11 years ago

Hi!

Any German file will do; I'll append one. (Actually, I just noticed that Gespeaker will not open all text files; I just tried one produced by ocre, and it didn't work, but that's a different story...)

The freezing occurs whenever a text is read using (German) mbrola voices, whether the text was typed in or imported via open -> file.

I didn't check with other languages, though.

(OT: Glad you liked my translation/po file ;-))

so long hank

muflone commented 11 years ago

E.g. try this:

Dies ist ein Test. Dies ist ein Test. Dies ist ein Test. Dies ist ein Test. Dies ist ein Test. Dies ist ein Test. Dies ist ein Test. Dies ist ein Test. Dies ist ein Test. Dies ist ein Test. Ende des Tests.

Language: german-mbrola-6, Pitch: 50, Volume: 200, Speed: 160, Delay: 5

Then click on play. Try to stop playback while listening. The Gespeaker application freezes until ALL words are spoken.
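For comparison, a freeze of this kind is what a front-end would show if it waited on the speech process in a blocking way; whether Gespeaker actually does that is not shown in this thread. The sketch below illustrates the general idea of stoppable playback with Python's `subprocess` module. The `Player` class is hypothetical, and the command passed to `play()` stands in for whatever espeak/mbrola command line the application really uses.

```python
import subprocess


class Player:
    """Runs a speech command in the background so that it can be stopped."""

    def __init__(self):
        self._proc = None

    def play(self, command):
        # Stop any earlier playback, then start the new one.
        # Popen returns immediately, so a GUI main loop is not blocked.
        self.stop()
        self._proc = subprocess.Popen(command)

    def is_playing(self):
        # poll() returns None while the child process is still running.
        return self._proc is not None and self._proc.poll() is None

    def stop(self):
        # Ask the child to terminate and wait until it has exited.
        if self.is_playing():
            self._proc.terminate()
            self._proc.wait()
        self._proc = None
```

A »Stop« button handler would then simply call `stop()`, and the button caption could follow `is_playing()`.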

muflone commented 11 years ago

I'm unable to reproduce it on my machine, using both texts with both german-mbrola-6 and german-mbrola-7.

I'll try again on some other machines.

muflone commented 11 years ago

problem acknowledged

This is an issue related to espeak 1.44 (1.43.48), which no longer writes phonemes to stdout for mbrola to play, as indicated on the official website: http://espeak.sourceforge.net/test/latest.html http://espeak.sourceforge.net/mbrola.html

The new espeak now plays the audio by itself, breaking compatibility with older software.

A temporary fix could be editing this line in /usr/share/gespeaker/src/Settings.py:

    argsEspeak = '-a %v -p %p -s %s -g %d -v %l -f %f --pho'
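To make the template above easier to read, here is a rough sketch of how such a string might be expanded into an espeak argument list. This is not Gespeaker's actual code: the function name is hypothetical, and the meaning of each placeholder (%v volume, %p pitch, %s speed, %d delay, %l language, %f text file) is an assumption inferred from the espeak options they follow (-a amplitude, -p pitch, -s speed, -g word gap, -v voice, -f file). The values are the settings posted earlier in this thread; the file path is made up.

```python
def build_espeak_args(template, values):
    """Replace each whole-token placeholder in template with its value.

    Splitting first and substituting afterwards keeps a value that
    contains spaces (e.g. a file path) as a single argument.
    """
    return [str(values.get(token, token)) for token in template.split()]


template = "-a %v -p %p -s %s -g %d -v %l -f %f --pho"

# Placeholder meanings are assumptions; see the note above.
values = {
    "%v": 200,                 # volume
    "%p": 50,                  # pitch
    "%s": 160,                 # speed
    "%d": 5,                   # delay (word gap)
    "%l": "german-mbrola-6",   # voice label as shown in Gespeaker
    "%f": "/tmp/text.txt",     # hypothetical text file
}

example_args = build_espeak_args(template, values)
```

According to the changelog quoted earlier, `--pho` makes espeak generate mbrola phoneme information (.pho data), which is what a front-end that pipes into mbrola itself expects.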

I'll fix it in the next release

Thank you for your report

muflone commented 11 years ago

Hi!

Thanks for the fix - it works fine! :-)

so long hank