readbeyond / aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
http://www.readbeyond.it/aeneas/
GNU Affero General Public License v3.0
2.44k stars 218 forks

Windows Desktop Observations #281

Open hoodji opened 2 years ago

hoodji commented 2 years ago

Hi,

I know that this version of aeneas is no longer supported, so I am posting here to help others who may stumble on the issues I have run into and had to work around outside of the aeneas code. I am using .wav files and .txt files.

The first issue is an array bounds error. The only solution I have found is to restrict the audio to less than 29 minutes, which means chopping up and re-joining my files. No big deal.

The second issue is more of a pain. On occasion, and I cannot figure out what the common factor is, aeneas duplicates, and sometimes triplicates, text fragments one after another, with different start and stop times. To overcome this I have had to match the original text fragments against the ones aeneas produces, remove the extras, and re-calculate the times. A pain, but it works.
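For anyone wanting to script the two workarounds described above, here is a minimal sketch using only the Python standard library. It is not part of aeneas and makes several assumptions: the names `plan_chunks`, `split_wav` and `collapse_duplicates` are hypothetical, the 28-minute chunk length is just a margin under the 29-minute limit reported above, aligned fragments are represented as plain `(begin, end, text)` tuples rather than any real aeneas output format, and "re-calculating times" is taken to mean keeping the first copy's start and the last copy's end.

```python
import wave

# Assumption: stay a little under the 29-minute limit reported in this thread.
MAX_CHUNK_SECONDS = 28 * 60


def plan_chunks(total_frames, framerate, max_seconds=MAX_CHUNK_SECONDS):
    """Return (start_frame, frame_count) pairs covering the whole audio."""
    frames_per_chunk = int(framerate * max_seconds)
    if frames_per_chunk <= 0:
        raise ValueError("max_seconds is too small for this frame rate")
    return [
        (start, min(frames_per_chunk, total_frames - start))
        for start in range(0, total_frames, frames_per_chunk)
    ]


def split_wav(path, out_prefix, max_seconds=MAX_CHUNK_SECONDS):
    """Write <out_prefix>_000.wav, ... and return (filename, offset_seconds) pairs."""
    written = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        framerate = src.getframerate()
        chunks = plan_chunks(src.getnframes(), framerate, max_seconds)
        for index, (start, count) in enumerate(chunks):
            name = f"{out_prefix}_{index:03d}.wav"
            src.setpos(start)
            with wave.open(name, "wb") as dst:
                dst.setparams(params)  # header frame count is corrected on close
                dst.writeframes(src.readframes(count))
            written.append((name, start / framerate))
    return written


def collapse_duplicates(original_lines, aligned):
    """Merge consecutive repeated fragments, guided by the original text.

    aligned is a list of (begin, end, text) tuples in output order.
    A repeat is kept when the original text really has the same line twice.
    """
    result = []
    pos = 0
    for idx, line in enumerate(original_lines):
        if pos >= len(aligned) or aligned[pos][2] != line:
            raise ValueError(f"aligned output does not match original line {idx}")
        begin, end, _ = aligned[pos]
        pos += 1
        next_line = original_lines[idx + 1] if idx + 1 < len(original_lines) else None
        # Absorb extra copies unless the next original line is the same text.
        while pos < len(aligned) and aligned[pos][2] == line and next_line != line:
            end = aligned[pos][1]
            pos += 1
        result.append((begin, end, line))
    if pos != len(aligned):
        raise ValueError("aligned output has fragments not found in the original")
    return result
```

The offsets returned by `split_wav` can be added to the per-chunk timings when the results are joined back together. Splitting the matching text file at the same points is not covered here and would still need to be done by hand.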