Closed by vitomargiotta 3 years ago
You can use the subtitles text type and put each fragment (a sentence, for example) on its own line or lines, with fragments separated by blank lines. That breaks the audio up into exactly those segments:
python3 -m aeneas.tools.execute_task recording_audio.ogg transcript_text.txt "task_language=eng|is_text_type=subtitles|os_task_file_format=srt" subtitles_text.srt
@vitomargiotta aeneas does not chunk the text for you. You need to pre-cut it as you like/need: there are several libraries out there that (try to) do that in a linguistically sound manner, and some even consider constraints typical of closed captioning applications (e.g., max number of chars per line, max 2 lines, etc.). Once you have the fragments, follow @sventech's suggestion of using the "subtitles" input format.
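For illustration, here is a minimal sketch of that pre-cutting step in plain Python. It is not part of aeneas, and the function names (`split_sentences`, `to_subtitle_fragments`, `format_subtitles_text`) and the default limits (42 chars per line, 2 lines per fragment) are my own assumptions. The sentence splitter is a naive regex and will mis-split on abbreviations, so a proper NLP library would likely do better.

```python
import re
import textwrap


def split_sentences(text):
    """Naive splitter: break after '.', '!', '?' or '…' followed by whitespace.

    Assumption: good enough for a demo; abbreviations such as "Sig. Rossi"
    will be split incorrectly.
    """
    parts = re.split(r"(?<=[.!?…])\s+", text.strip())
    return [p for p in parts if p]


def to_subtitle_fragments(text, max_chars=42, max_lines=2):
    """Cut text into fragments of at most max_lines lines of max_chars chars.

    Each sentence is wrapped on its own, so a fragment never spans two
    sentences; long sentences are spread over several fragments.
    """
    fragments = []
    for sentence in split_sentences(text):
        lines = textwrap.wrap(sentence, width=max_chars)
        for i in range(0, len(lines), max_lines):
            fragments.append("\n".join(lines[i:i + max_lines]))
    return fragments


def format_subtitles_text(fragments):
    """Join fragments with blank lines, as the "subtitles" text type expects."""
    return "\n\n".join(fragments) + "\n"


if __name__ == "__main__":
    sample = (
        "Ciao a tutti. Questa è una frase molto lunga che deve essere "
        "spezzata in più righe per rispettare i limiti dei sottotitoli! Fine?"
    )
    print(format_subtitles_text(to_subtitle_fragments(sample, max_chars=30)))
```

Saving the printed output as the text file and running the command above with `is_text_type=subtitles` should then give one timed segment per fragment. Smaller `max_chars` or `max_lines` values give shorter segments, though this limits text length only, not duration in seconds.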
@readbeyond Do you have any examples of these kinds of apps? Especially open source?
Hi, when aligning Italian text, the output segments are often over 1 minute long, while I need them to be much shorter on average (10 seconds or so). Is it possible to set a maximum segment length (by word count, by seconds, or in any other way)?
Thank you for building such great software, and I wish you a great day! Vito