SYSTRAN / faster-whisper

Faster Whisper transcription with CTranslate2
MIT License

End of sequence token problem. #726

Open ngcheeyuan opened 8 months ago

ngcheeyuan commented 8 months ago

I was transcribing an audio file that was about 65 seconds long. However, the model kept generating text until about 83 s (based on the segment timestamps).

Is this caused by the 30 s chunking, with the last 5 seconds (65 s = 30 + 30 + 5) being padded to fill a full 30 s window?

Is there a way to solve this, other than post-processing to cut out text that exceeds the duration of the audio file?
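For reference, the post-processing workaround mentioned above could look roughly like the sketch below. It is only an illustration: `Segment` and `clip_segments` are hypothetical names, and `Segment` is a stand-in for whatever the transcriber returns, assuming each segment exposes `start`, `end` (in seconds) and `text`. The audio duration is assumed to be known from elsewhere.

```python
from dataclasses import dataclass, replace
from typing import Iterable, List


@dataclass(frozen=True)
class Segment:
    """Hypothetical stand-in for a transcription segment (times in seconds)."""
    start: float
    end: float
    text: str


def clip_segments(segments: Iterable[Segment], duration: float) -> List[Segment]:
    """Drop segments that start at or after the audio end; clamp ends that overshoot it."""
    kept = []
    for seg in segments:
        if seg.start >= duration:
            continue  # starts past the end of the audio, so it cannot be real speech
        kept.append(replace(seg, end=min(seg.end, duration)))
    return kept


if __name__ == "__main__":
    # Made-up timestamps that mimic the 65 s file described above.
    demo = [
        Segment(0.0, 29.5, "first window"),
        Segment(58.0, 66.2, "overshoots slightly"),
        Segment(70.0, 83.0, "past the end"),
    ]
    for seg in clip_segments(demo, duration=65.0):
        print(f"[{seg.start:.1f}s -> {seg.end:.1f}s] {seg.text}")
```

This only trims by timestamp. A segment that straddles the end of the audio keeps all of its text, so it may still contain extra words.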

Thanks.

trungkienbkhn commented 7 months ago

@ngcheeyuan, hello. Can you attach your audio and show the transcription log?