readbeyond / aeneas

aeneas is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment)
http://www.readbeyond.it/aeneas/
GNU Affero General Public License v3.0
2.45k stars 218 forks

How to set max segment length? #253

Closed by vitomargiotta 3 years ago

vitomargiotta commented 4 years ago

Hi, when aligning Italian text, the output segments are often over 1 minute long, while I need them to be much shorter on average (10 seconds or so). Is it possible to set the maximum length of a segment (by word count, by seconds, or any other way is fine)?

Thank you for building such great software, and have a great day! Vito

sventech commented 4 years ago

You can use the subtitles mode and split the text into fragments (sentences, for example) separated by blank lines. Each fragment then becomes its own segment, so you control manually where the audio is broken up:

python3 -m aeneas.tools.execute_task recording_audio.ogg transcript_text.txt "task_language=eng|is_text_type=subtitles|os_task_file_format=srt" subtitles_text.srt

readbeyond commented 3 years ago

@vitomargiotta aeneas does not chunk the text for you. You need to pre-cut it as you like/need: there are several libraries out there to (try to) do that in a linguistically sound manner, some even considering constraints typical of closed captioning applications (e.g., max number of characters per line, max 2 lines, etc.). Once you have the fragments, follow @sventech's suggestion of using the "subtitles" input format.
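
For illustration only, here is a minimal sketch of such a pre-cutting step in plain Python. It is not part of aeneas and not linguistically sound: it assumes a naive sentence boundary (whitespace after `.`, `!` or `?`) and a character limit as a rough proxy for duration. The names `split_into_fragments`, `to_subtitles_text` and `max_chars` are hypothetical.

```python
import re


def split_into_fragments(text, max_chars=80):
    """Cut text into fragments of at most max_chars characters.

    Sentences are split first (naively), then any sentence longer than
    max_chars is wrapped greedily on word boundaries. A single word longer
    than max_chars is kept whole, so it may exceed the limit.
    """
    # Naive sentence boundary: whitespace that follows '.', '!' or '?'.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    fragments = []
    for sentence in sentences:
        current = ""
        for word in sentence.split():
            candidate = word if not current else current + " " + word
            if current and len(candidate) > max_chars:
                # Adding this word would overflow: close the fragment.
                fragments.append(current)
                current = word
            else:
                current = candidate
        if current:
            fragments.append(current)
    return fragments


def to_subtitles_text(fragments):
    """Join fragments with blank lines, as the "subtitles" input expects."""
    return "\n\n".join(fragments) + "\n"


if __name__ == "__main__":
    sample = (
        "Nel mezzo del cammin di nostra vita mi ritrovai per una selva "
        "oscura. Che la diritta via era smarrita!"
    )
    print(to_subtitles_text(split_into_fragments(sample, max_chars=30)), end="")
```

Saving the result as the text file and running the command above with `is_text_type=subtitles` should then yield one segment per fragment. For Italian audio, `task_language=ita` would presumably be the right language setting instead of `eng`.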

jhancock4d commented 2 years ago

@readbeyond Do you have any examples of these kinds of apps? Especially open source?