ggerganov / whisper.cpp

Port of OpenAI's Whisper model in C/C++
MIT License

Realtime Transcription with no_context=false Repeats Old Data After Several Minutes #2510

Open edisonzf2020 opened 6 days ago

edisonzf2020 commented 6 days ago

I'm seeing a problem with whisper.cpp's realtime transcription when the no_context parameter is set to false. After several minutes of capturing microphone input and transcribing it in realtime, the output begins to repeat large chunks of text that were already transcribed earlier. This does not happen when no_context is set to true.

Steps to Reproduce:

  1. Build whisper.cpp v1.7.1 with realtime transcription enabled.
  2. Run whisper.cpp with no_context set to false, capturing microphone input.
  3. Observe the output for several minutes (e.g., 5-10 minutes).

Expected Behavior:

Continuous, accurate transcription of the microphone input without repeating old data.

Actual Behavior:

After several minutes, the output starts to include repeated segments of previously transcribed text.
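
To pin down when the repetition starts, the transcribed segments could be fed through a small detector that flags any segment already seen recently. The following is only a minimal sketch: `RepeatDetector` and its `capacity` parameter are hypothetical names, not part of whisper.cpp, and it only catches exact string matches.

```cpp
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical helper, not part of whisper.cpp: remembers the last
// `capacity` transcribed segments and reports whether a new one repeats.
class RepeatDetector {
public:
    explicit RepeatDetector(std::size_t capacity) : capacity_(capacity) {}

    // Returns true if `segment` exactly matches a remembered segment.
    // The segment is recorded either way so later calls can see it.
    bool seen_before(const std::string & segment) {
        bool repeated = false;
        for (const auto & old : history_) {
            if (old == segment) {
                repeated = true;
                break;
            }
        }
        history_.push_back(segment);
        if (history_.size() > capacity_) {
            history_.pop_front();  // forget the oldest segment
        }
        return repeated;
    }

private:
    std::size_t capacity_;
    std::deque<std::string> history_;
};
```

Logging a timestamp whenever `seen_before` returns true would show how long the stream runs before old text reappears, which might help narrow down the cause.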

Additional Information:

Setting no_context to true avoids the problem, but I would prefer to keep context enabled for better transcription accuracy. My guess is that a buffer overflow or memory-management issue is causing the old data to reappear.
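
If the repetition is tied to how previously decoded tokens are carried over as context for the next audio chunk (an assumption; nothing in this report confirms the cause), one experiment would be to cap that history explicitly before each transcription call. The sketch below is standalone and does not call into whisper.cpp: `token_id`, `append_and_trim`, and `max_tokens` are illustrative names, and the 32-bit token type is assumed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for the library's token id type (assumed to be a 32-bit integer).
using token_id = std::int32_t;

// Hypothetical helper: appends freshly decoded tokens to the running
// history, then keeps only the most recent `max_tokens` entries.
void append_and_trim(std::vector<token_id> & history,
                     const std::vector<token_id> & fresh,
                     std::size_t max_tokens) {
    history.insert(history.end(), fresh.begin(), fresh.end());
    if (history.size() > max_tokens) {
        // Drop everything except the last `max_tokens` tokens.
        history.erase(history.begin(),
                      history.end() - static_cast<std::ptrdiff_t>(max_tokens));
    }
}
```

If the repeats stop once the history is bounded this way, that would point toward the carried-over context rather than the audio buffer. It is a diagnostic step, not a confirmed fix.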

Any help in diagnosing and resolving this problem would be greatly appreciated.