MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
BSD 2-Clause "Simplified" License

Only part of audio transcribed #151

Open NasonZ opened 9 months ago

NasonZ commented 9 months ago

I have an hour-long meeting that I would like to transcribe. Looking at the text and SRT output, I can see that only the first 11 minutes have been transcribed:

112
00:10:57,809 --> 00:11:01,434
Speaker 0: Back on the left hand side below groups we then have our devices.

113
00:11:02,916 --> 00:11:05,279
Speaker 0: This is where we can register your device onto the system.

114
00:11:07,041 --> 00:11:07,542
Speaker 0: If we don't.   #cuts off halfway through this sentence

I have a few questions regarding this issue:

What settings do I need to adjust to transcribe the entire meeting?

Is there a known limit to the length of audio that can be transcribed?

If yes, what is the limiting factor?
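To measure exactly where the output stops, the cue timings in the SRT file can be compared against the length of the recording. The following is a minimal sketch in plain Python and is not part of whisper-diarization: the helper names `last_cue_end_seconds` and `coverage`, and the assumed audio length of 3600 seconds, are illustrative only. It relies solely on the standard SRT timing format `HH:MM:SS,mmm --> HH:MM:SS,mmm` shown in the excerpt above.

```python
import re

# Matches an SRT cue timing line such as "00:10:57,809 --> 00:11:01,434".
CUE_TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def last_cue_end_seconds(srt_text: str) -> float:
    """Return the end time, in seconds, of the latest cue in the SRT text."""
    end_times = []
    for match in CUE_TIMING.finditer(srt_text):
        # Groups 5-8 hold the end timestamp: hours, minutes, seconds, millis.
        h, m, s, ms = (int(g) for g in match.groups()[4:])
        end_times.append(h * 3600 + m * 60 + s + ms / 1000)
    if not end_times:
        raise ValueError("no SRT cue timings found")
    return max(end_times)


def coverage(srt_text: str, audio_seconds: float) -> float:
    """Fraction of the audio duration covered by the transcript (hypothetical helper)."""
    return last_cue_end_seconds(srt_text) / audio_seconds


# The last three cues from the excerpt in this issue.
sample_srt = """112
00:10:57,809 --> 00:11:01,434
Speaker 0: Back on the left hand side below groups we then have our devices.

113
00:11:02,916 --> 00:11:05,279
Speaker 0: This is where we can register your device onto the system.

114
00:11:07,041 --> 00:11:07,542
Speaker 0: If we don't.
"""

# Assumption: the meeting is exactly one hour (3600 s) long.
print(f"last cue ends at {last_cue_end_seconds(sample_srt):.3f} s")
print(f"covered fraction of audio: {coverage(sample_srt, 3600.0):.1%}")
```

Running this on the full SRT file, with the real audio duration in place of the assumed 3600 seconds, shows whether the cutoff falls at the same point on every run, which helps narrow down the limiting factor.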

Psarpei commented 6 months ago

I have the same problem, but already with 5 minutes of audio. Did you fix this?