readbeyond / lachesis

lachesis automates the segmentation of a transcript into closed captions
GNU Affero General Public License v3.0

long audio #8

naarkhoo closed 4 years ago

naarkhoo commented 5 years ago

Sorry if the question is basic; I am very new to the field. What if I have a long audio file (one hour) that has already been transcribed? I can use lachesis to split the transcript into closed captions, but what happens to the audio file? I don't think I can align multiple text segments to a single audio file. Is there any way to split the audio accordingly?

pettarin commented 5 years ago

lachesis works on text only; it has no notion of audio.

If you want to perform forced alignment (i.e., align audio and text), you need to first use lachesis to get the text segments, and then align them with your audio in a separate step. Clearly this is suboptimal compared with segmenting on both text and audio together (since lachesis only sees the text), but this is a design choice.

(Plus, the README says clearly: DO NOT USE THIS PACKAGE IN PRODUCTION UNTIL IT REACHES v1.0.0 !!! )
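To illustrate the two-step workflow above: a forced aligner typically returns a start and end time for each text segment, all measured against the same audio file, so the audio itself does not need to be split. The following is a minimal sketch of turning such timings into SRT captions. It assumes the aligner output has already been loaded as a list of `(start_seconds, end_seconds, text)` tuples; that tuple shape and the names `Segment`, `format_timestamp`, and `to_srt` are hypothetical and not part of lachesis or any particular aligner.

```python
from typing import List, Tuple

# Hypothetical shape of the aligner output (an assumption for this sketch):
# (start_seconds, end_seconds, caption_text), all relative to one audio file.
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: List[Segment]) -> str:
    """Render aligned segments as SRT text, one numbered cue per segment."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Made-up timings, for illustration only.
    example = [(0.0, 2.5, "Hello"), (2.5, 4.0, "world")]
    print(to_srt(example))
```

With this approach the one-hour recording stays in one piece: each caption simply carries its own time range within that file.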

Best regards,

Alberto Pettarin
