SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2
MIT License
12.51k stars, 1.05k forks

fix word timestamps for batched inference #920

Closed: MahmoudAshraf97 closed this 3 months ago

MahmoudAshraf97 commented 3 months ago

This was caused by a wrong `num_frames` argument when finding the alignments. It was assumed that inferring it from the encoder output size was sufficient, but that turned out to cause issues such as #919 when the actual segment size is much less than the inferred size.
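
To illustrate the mismatch described above, here is a minimal sketch, not the code from this PR. It assumes Whisper's usual feature settings (16 kHz audio, a hop length of 160 samples, a padded 30 s window that yields 1500 encoder positions with an input stride of 2). The function names `inferred_num_frames`, `actual_num_frames`, and `per_segment_num_frames` are hypothetical and exist only for this example.

```python
# Assumed Whisper feature settings (not stated in this thread).
SAMPLE_RATE = 16000        # audio samples per second
HOP_LENGTH = 160           # audio samples per mel frame
INPUT_STRIDE = 2           # mel frames per encoder position
ENCODER_POSITIONS = 1500   # encoder output length for a padded 30 s window


def inferred_num_frames(encoder_positions: int = ENCODER_POSITIONS) -> int:
    """Frame count guessed from the encoder output size alone.

    Every segment in a padded batch has the same encoder output length,
    so this value is identical for a 4 s segment and a 30 s segment.
    """
    return encoder_positions * INPUT_STRIDE


def actual_num_frames(segment_duration_s: float) -> int:
    """Frame count derived from the real (unpadded) segment duration."""
    frames = int(round(segment_duration_s * SAMPLE_RATE / HOP_LENGTH))
    # A segment cannot hold more frames than the padded window provides.
    return min(frames, inferred_num_frames())


def per_segment_num_frames(durations_s: list[float]) -> list[int]:
    """One frame count per batched segment, instead of one shared value."""
    return [actual_num_frames(d) for d in durations_s]


if __name__ == "__main__":
    durations = [4.0, 30.0]
    print("inferred:", [inferred_num_frames() for _ in durations])
    print("actual:  ", per_segment_num_frames(durations))
```

Under these assumptions, a 4 s segment has 400 real frames, while the inferred value stays at 3000, so the alignment would be computed over mostly padding.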

MahmoudAshraf97 commented 3 months ago

Closed in favour of #921.