readbeyond / aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
http://www.readbeyond.it/aeneas/
GNU Affero General Public License v3.0

Custom tts runtime configuration ignored when `is_text_type=mplain` #187

Closed ricpacca closed 6 years ago

ricpacca commented 7 years ago

Hi, I think I found a bug while running aeneas on macOS Sierra 10.12.6 (not sure if it happens on other platforms too).

When I run example number 34 (--example-words-festival-cache), aeneas runs the following command (I expanded the paths):

python -m aeneas.tools.execute_task ../../../usr/local/lib/python2.7/site-packages/aeneas/tools/res/audio.mp3 ../../../usr/local/lib/python2.7/site-packages/aeneas/tools/res/words.txt "task_language=eng-USA|is_text_type=plain|os_task_file_format=aud" output/sonnet.words.aud -r="tts=festival|tts_cache=True"

And everything works as expected.

However, when I set is_text_type=mplain instead of plain in the previous command, aeneas appears to ignore the custom TTS choice and uses the default eSpeak, even though Festival was specified. This is the command; I checked which TTS engine was used by adding -v for verbose logs.

python -m aeneas.tools.execute_task ../../../usr/local/lib/python2.7/site-packages/aeneas/tools/res/audio.mp3 ../../../usr/local/lib/python2.7/site-packages/aeneas/tools/res/words.txt "task_language=eng-USA|is_text_type=mplain|os_task_file_format=aud" output/sonnet.words.aud -r="tts=festival|tts_cache=True"

Am I doing something wrong, or is aeneas? Thank you for your amazing work with this awesome software!

readbeyond commented 7 years ago

Hi,

if you use a multi-level format like mplain, you need to select the TTS engine using

tts_l1 tts_l2 tts_l3

and, possibly,

tts_path_l1 tts_path_l2 tts_path_l3

instead of tts and tts_path, respectively.

See also:

$ python -m aeneas.tools.execute_task --help-rconf

and

$ python -m aeneas.tools.execute_task --examples-all

which will list:

python -m aeneas.tools.execute_task --example-multilevel-tts

as an example:

$ python -m aeneas.tools.execute_task --example-multilevel-tts
[INFO] Running example task with arguments:
  Audio file: aeneas/tools/res/audio.mp3
  Text file: aeneas/tools/res/mplain.txt
  Config string: task_language=eng-USA|is_text_type=mplain|os_task_file_format=json
  Sync map file: output/sonnet.mplain.json
  Options: -r="tts_l1=festival|tts_l2=festival|tts_l3=espeak"
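As a minimal sketch of the per-level selection described above, the following Python helper assembles the value passed to -r from one engine name per level. The function name multilevel_tts_rconf is hypothetical and not part of aeneas; only the tts_l1, tts_l2, and tts_l3 keys and the "|" separator come from the example output in this thread.

```python
def multilevel_tts_rconf(engines):
    """Build a runtime configuration string with one TTS engine per level.

    `engines` holds three engine names, for levels 1, 2 and 3 in that order.
    This helper is illustrative only; it is not an aeneas API.
    """
    if len(engines) != 3:
        raise ValueError("expected exactly three engine names (levels 1, 2, 3)")
    return "|".join(
        "tts_l%d=%s" % (level, name)
        for level, name in enumerate(engines, start=1)
    )


# Festival at every level, mirroring the intent of the original tts=festival.
print(multilevel_tts_rconf(["festival", "festival", "festival"]))
```

The resulting string would be passed as -r="..." in place of the tts=festival setting used in the original command.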

Best regards,

AP

ricpacca commented 7 years ago

Thank you very much! With the tts_path levels, aeneas worked with all alternative TTS engines (including Nuance and AWS, with their respective keys set), except for Festival.

Both on Ubuntu and macOS, when I try to use festival and also specify the path to text2wave, I get these logs:

[...]
[DEBU] 2017-09-24 22:23:38.682169 FESTIVALTTSWrapper: TTS engine wrote audio data to file
[DEBU] 2017-09-24 22:23:38.682205 FESTIVALTTSWrapper: Calling TTS ... done
[DEBU] 2017-09-24 22:23:38.682337 FESTIVALTTSWrapper: Reading audio data...
[DEBU] 2017-09-24 22:23:38.682494 AudioFile: Loading audio data...
[DEBU] 2017-09-24 22:23:38.682632 AudioFile: self.file_format is good => reading self.file_path directly
[CRIT] 2017-09-24 22:23:38.682781 AudioFile: Audio format not supported by scipywavread
[CRIT] 2017-09-24 22:23:38.682858 FESTIVALTTSWrapper: An unexpected error occurred while reading audio data
[CRIT] 2017-09-24 22:23:38.682880 FESTIVALTTSWrapper: Audio format not supported by scipywavread
[CRIT] 2017-09-24 22:23:38.683007 FESTIVALTTSWrapper: An unexpected error occurred in helper_function
[CRIT] 2017-09-24 22:23:38.683055 FESTIVALTTSWrapper: An unexpected error occurred in loop_function
[DEBU] 2017-09-24 22:23:38.683140 FESTIVALTTSWrapper: Synthesizing multiple via subprocess... done
[CRIT] 2017-09-24 22:23:38.683222 ExecuteTask: STEP 3 (synthesize text) FAILURE
[CRIT] 2017-09-24 22:23:38.683264 ExecuteTask: Unexpected error while executing task
[CRIT] 2017-09-24 22:23:38.683313 ExecuteTask: Both the C extension and the pure Python code failed. (Wrong arguments? Input too big?)
[ERRO] 2017-09-24 22:23:38.683373 CLI: An unexpected error occurred while executing the task:
[ERRO] An unexpected error occurred while executing the task:
[ERRO] 2017-09-24 22:23:38.683472 CLI: Unexpected error while executing task : Both the C extension and the pure Python code failed. (Wrong arguments? Input too big?)
[ERRO] Unexpected error while executing task : Both the C extension and the pure Python code failed. (Wrong arguments? Input too big?)
[DEBU] 2017-09-24 22:23:38.683665 CLI: Execution completed with code 1

I think the key issue is "Audio format not supported by scipywavread". However, I tried many different formats, including 16-bit and 32-bit WAV (online sources say these should be supported, unlike 24-bit WAV), and aeneas always gives that error message.
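For long verbose logs, the first [CRIT] line is usually the most informative one, since later ones tend to be consequences of it. The sketch below pulls it out with a regular expression; the pattern is inferred from the log lines shown above and is an assumption, not a documented aeneas log format, and first_critical is a hypothetical helper name.

```python
import re

# Matches lines shaped like the ones in the log above, for example:
# [CRIT] 2017-09-24 22:23:38.682781 AudioFile: Audio format not supported by scipywavread
LOG_LINE = re.compile(
    r"^\[(?P<level>[A-Z]{4})\] "
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"(?P<component>\w+): (?P<message>.*)$"
)


def first_critical(lines):
    """Return (component, message) of the first [CRIT] line, or None."""
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "CRIT":
            return match.group("component"), match.group("message")
    return None
```

Applied to the log above, this points at the AudioFile component rather than at the later ExecuteTask failures.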

Is there some other configuration parameter that I should change to make Festival work? Thanks again in advance.

readbeyond commented 7 years ago

Hi,

do you find anything strange if you run:

$ python -m aeneas.tools.execute_task --example-multilevel-tts -v

?

Do you have the cfw extension compiled and installed? Do you see any difference if you add -r="cfw=True" or -r="cfw=False"?

I think that the problem is that text2wave is not generating any audio at all. For example, if the log says:

[DEBU] FESTIVALTTSWrapper: Calling with arguments '['text2wave', u'-eval', u'(language_american_english)', u'-o', u'/tmp/tmpOFczL4.wav']'
[DEBU] FESTIVALTTSWrapper: Calling with text 'From fairest creatures we desire increase, That thereby beauty's rose might never die, But as the riper should by time decease, His tender heir might bear his memory:'

you can see if, from a terminal, the command:

$ echo "some text here" | text2wave -eval "(language_american_english)" -o /tmp/out.wav

produces the /tmp/out.wav file or not.
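After running the text2wave command above, a quick way to tell "no file", "empty file", and "file with audio" apart is to inspect the output with Python's standard wave module. This is only a sketch for Python 3: check_wav is a hypothetical helper, and the wave module is not the reader aeneas uses (the log mentions scipywavread), so an "ok" result here does not guarantee that aeneas will accept the file.

```python
import os
import wave


def check_wav(path):
    """Return a short status string describing the WAV file at `path`."""
    if not os.path.exists(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    try:
        with wave.open(path, "rb") as wav:
            return "ok: %d channel(s), %d-bit, %d Hz, %d frames" % (
                wav.getnchannels(),
                wav.getsampwidth() * 8,
                wav.getframerate(),
                wav.getnframes(),
            )
    except (wave.Error, EOFError) as exc:
        # Raised when the header is not a WAV header the module understands.
        return "unreadable: %s" % exc


if __name__ == "__main__":
    # Path taken from the text2wave command above.
    print(check_wav("/tmp/out.wav"))
```

A "missing" or "empty" result would support the hypothesis that text2wave is not generating audio, while "unreadable" or an unexpected sample width would point at the file format instead.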

I also remember there being some issue with American vs. British English in the default/packaged distributions of Festival, but I do not remember the details. Have you tried passing "eng-GBR" instead of "eng-USA" as the language code? You can check the languages and voices installed in your Festival with:

$ festival
festival> (language.list)

and

festival> (voice.list)

HTH,

AP
