facebookresearch / seamless_communication

Foundational Models for State-of-the-Art Speech and Text Translation

Mutox Dataset Annotation timeline not matching #521

Open ASMIftekhar opened 3 weeks ago

ASMIftekhar commented 3 weeks ago

Hello @avidale, this may be a duplicate of the open issue #486, but for many of the files, especially in languages other than English and Spanish, the annotated timeline does not match the actual audio file. Example: id: por380631. The original file link. The segment annotations are 1548288 (25 minutes and 48 seconds) and 1616352 (26 minutes and 56 seconds), but the original file is only ~4 minutes long. I tested many por*** files and they all have the same problem. I believe some sort of conversion went wrong. Please let me know how to fix them.
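For anyone who wants to reproduce the arithmetic, here is a minimal Python sketch of the sanity check described above. The helper names (`format_offset`, `check_segment`) are hypothetical, the 240-second duration is only an approximation of the "~4 minutes" figure, and the 16 kHz sample-index interpretation is an illustrative guess that nothing in this thread confirms.

```python
def format_offset(seconds: float) -> str:
    """Render a time offset in seconds as 'M min S.S s'."""
    minutes, secs = divmod(seconds, 60)
    return f"{int(minutes)} min {secs:.1f} s"


def check_segment(start: int, end: int, file_duration_s: float,
                  units_per_second: float) -> dict:
    """Convert raw annotation offsets to seconds and test whether the
    segment lies inside a file of the given duration.

    units_per_second is the ASSUMED unit of the raw offsets:
    1000 for milliseconds, or a sample rate (e.g. 16000) if the
    offsets were sample indices.
    """
    start_s = start / units_per_second
    end_s = end / units_per_second
    fits = 0 <= start_s < end_s <= file_duration_s
    return {"start_s": start_s, "end_s": end_s, "fits": fits}


# Values quoted in the report for id por380631.
START, END = 1548288, 1616352
FILE_DURATION_S = 240.0  # assumption: "~4 minutes" taken as 240 s

# Interpretation used in the report: offsets are milliseconds.
as_ms = check_segment(START, END, FILE_DURATION_S, units_per_second=1000)
print(format_offset(as_ms["start_s"]), "to", format_offset(as_ms["end_s"]),
      "| fits in file:", as_ms["fits"])

# Hypothetical alternative (unconfirmed): offsets are 16 kHz sample indices.
as_samples = check_segment(START, END, FILE_DURATION_S, units_per_second=16000)
print(format_offset(as_samples["start_s"]), "to", format_offset(as_samples["end_s"]),
      "| fits in file:", as_samples["fits"])
```

Read as milliseconds, the segment starts well past the end of a four-minute file, which matches the mismatch reported here. Whether a different unit explains the values is something only the dataset maintainers can confirm.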

avidale commented 3 weeks ago

@mfcoria Can you suggest anything?