readbeyond / aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
http://www.readbeyond.it/aeneas/
GNU Affero General Public License v3.0
2.44k stars 218 forks

Early exit with no error on SMIL generation #298

Open sharinganthief opened 1 year ago

sharinganthief commented 1 year ago

test files here for the next 6 days - https://filebin.net/4peynbgh4armtrqc

the command:

    python -m aeneas.tools.execute_task .\audio\2.mp3 .\sync_text\2.xhtml "task_language=eng|os_task_file_format=smil|os_task_file_smil_audio_ref=../audio/2.mp3|os_task_file_smil_page_ref=../text/2.xhtml|is_text_type=unparsed|is_text_unparsed_id_regex=f[0-9]+|is_text_unparsed_id_sort=numeric" .\smil\2.smil -vv

the results: [image]

I set up and ran the first chapter, and everything went smoothly, with: n m delta: 28836 25625 3000

the second chapter results in: n m delta: 39437 33752 3000
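For a rough sense of scale between the two runs, here is a back-of-the-envelope estimate of how large the DTW stripe might be. It assumes the stripe stores one 8-byte float per cell in an n x delta matrix; that layout is an assumption for illustration, not something confirmed in this thread, and `stripe_bytes` is a hypothetical helper.

```python
# Assumption: the DTW stripe holds one 8-byte float per cell, n * delta cells.
# This is a guess at the memory layout, not a confirmed detail of aeneas.
BYTES_PER_CELL = 8


def stripe_bytes(n, delta, bytes_per_cell=BYTES_PER_CELL):
    """Estimated size in bytes of an n x delta cost matrix."""
    return n * delta * bytes_per_cell


# (n, m, delta) exactly as logged for the two chapters.
runs = {
    "chapter 1 (works)": (28836, 25625, 3000),
    "chapter 2 (fails)": (39437, 33752, 3000),
}

for label, (n, m, delta) in runs.items():
    size = stripe_bytes(n, delta)
    print("%s: %d bytes (%.2f GiB)" % (label, size, size / 2.0 ** 30))
```

If the assumption holds, the failing run would need roughly 0.88 GiB in one block versus about 0.64 GiB for the working one. Whether that difference is what triggers the early exit is speculation; it may matter more on a 32-bit Python build, where large contiguous allocations are harder to satisfy.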

added a couple log statements in dtw.py and have determined this is the line where it breaks:

    best_path = aeneas.cdtw.cdtw.compute_best_path(
        mfcc1,
        mfcc2,
        delta
    )
    self.log([u"computing done"])
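Since the process stops inside a C extension call without any Python error, one way to gather more evidence is to run the task as a child process and inspect its exit code. The sketch below assumes Python 3.5+ (for `subprocess.run`); `run_and_report` is a hypothetical helper, and the demo child command is a stand-in that simply exits abruptly, not the aeneas task itself.

```python
import subprocess
import sys


def run_and_report(cmd, tail_lines=5):
    """Run cmd and return (exit code, last few lines of combined output)."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout.splitlines()[-tail_lines:]


# Stand-in child: prints one log line, then terminates abruptly with code 3,
# mimicking a process that dies without raising a Python exception.
demo_cmd = [
    sys.executable,
    "-c",
    "import os; print('computing...', flush=True); os._exit(3)",
]

code, tail = run_and_report(demo_cmd)
print("exit code:", code)
print("last output:", tail)
```

To try this on the real task, replace `demo_cmd` with `[sys.executable, "-X", "faulthandler", "-m", "aeneas.tools.execute_task", ...]` followed by the same arguments as in the command above. A non-zero exit code would suggest a crash in native code rather than a clean return, and `-X faulthandler` (a standard CPython 3 option) may print the Python stack at the moment of a fatal error.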

sharinganthief commented 1 year ago

in further testing, I have found the upper limit to be about 20 min of audio; anything under that seems to work just fine

sharinganthief commented 1 year ago

test files - https://drive.google.com/file/d/1QkUJ1ybR79Ts-AmyIzLZ0WUhTu7W4g4E/view?usp=share_link

included the "original" at 23 min or so and a test file at 10 min. The 10 min file works fine; the original fails silently, even with -r="dtw_margin=120", -r="dtw_margin=180", and -r="dtw_margin=360", as found in another issue
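To see how the margin values tried here might relate to the logged delta, the sketch below assumes delta = 2 * dtw_margin / window shift, with a window shift of 0.040 s per MFCC frame. Both the formula and the shift are assumptions, chosen because they reproduce the logged delta of 3000 for a 60 s margin; `delta_for_margin` is a hypothetical helper, not an aeneas function.

```python
# Assumptions (not confirmed in this thread):
#   delta = 2 * dtw_margin / window_shift, with window_shift = 0.040 s per frame.
# These reproduce the logged delta of 3000 when dtw_margin is 60 s.
WINDOW_SHIFT = 0.040  # seconds per MFCC frame (assumed)
N_FRAMES = 39437      # n logged for the failing chapter


def delta_for_margin(margin_seconds, window_shift=WINDOW_SHIFT):
    """Stripe width in frames for a given margin, under the assumed formula."""
    return int(round(2 * margin_seconds / window_shift))


for margin in (60, 120, 180, 360):
    delta = delta_for_margin(margin)
    cells = N_FRAMES * delta  # number of cells in an n x delta stripe
    print("dtw_margin=%d -> delta=%d, stripe cells=%d" % (margin, delta, cells))
```

Under these assumptions, raising dtw_margin widens the stripe, so the larger values tried above would make the computation bigger rather than smaller. If stripe size is the issue, a margin below 60 might be a more informative experiment, though that is a guess.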