Open jrosener opened 2 days ago
Is #2509 the same issue? (Probably.)
When I increase `GGML_MAX_CONTEXTS` in `ggml/include/ggml.h`, `main` doesn't crash. 8 is a magic number, but there should be some limit.
refs: https://github.com/ggerganov/whisper.cpp/discussions/2520
Indeed, it works well after increasing this value (tested at 256, which allowed running 32 parallel transcriptions). It would be great to make it configurable in `whisper_context_params` or somewhere else.
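To illustrate why raising the constant moves the ceiling, here is a minimal sketch of a fixed-size slot pool. It is not ggml's actual code: `POOL_CAPACITY`, `SLOTS_PER_STATE`, and all function names are hypothetical, the value 64 is an assumed stand-in for the compile-time cap, and 8 slots per state is only inferred from the 256 → 32 result reported above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins, NOT ggml's real definitions. */
#define POOL_CAPACITY   64  /* assumed cap, playing the role of GGML_MAX_CONTEXTS */
#define SLOTS_PER_STATE 8   /* assumed: inferred from "256 -> 32 transcriptions" */

static bool slot_used[POOL_CAPACITY];

/* Claim one free slot; return its index, or -1 when the pool is exhausted. */
static int pool_acquire(void) {
    for (int i = 0; i < POOL_CAPACITY; i++) {
        if (!slot_used[i]) {
            slot_used[i] = true;
            return i;
        }
    }
    return -1;
}

/* Claim every slot one parallel state needs.
   Returns false if the pool runs out; slots claimed before the
   failure are released again so the pool stays consistent. */
static bool state_init(void) {
    int taken[SLOTS_PER_STATE];
    for (int n = 0; n < SLOTS_PER_STATE; n++) {
        taken[n] = pool_acquire();
        if (taken[n] < 0) {
            while (n-- > 0) {
                slot_used[taken[n]] = false;
            }
            return false;
        }
    }
    return true;
}

/* Count how many states can be created before the pool is exhausted. */
static int max_parallel_states(void) {
    int count = 0;
    while (state_init()) {
        count++;
    }
    return count;
}
```

Under these assumed numbers, `max_parallel_states()` stops at `POOL_CAPACITY / SLOTS_PER_STATE`, so the only way to run more states is to raise the compile-time capacity. A caller that does not check the failure from the pool would then be working with an invalid handle, which would be consistent with a crash rather than a clean error.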
I am writing an application that transcribes multiple audio files in parallel using the same model. For that, I use one common `whisper_context` shared by multiple `whisper_state` objects, each used by a worker thread that performs the transcription with `whisper_full_with_state()`. It works perfectly with up to 8 parallel transcriptions, but it crashes in `whisper_full_with_state()` when running more.

Because this implementation is based on `whisper_full_parallel()`, which is used by the `main` sample application, the issue can be reproduced by running `main` with more than 8 `--processors`:

./build/bin/main --model ggml-tiny.bin --processors 9 10min_audio_french.wav

It does not matter whether it runs on CPU, OpenVINO, CUDA, ...: it always crashes.

Results:
Questions:
`whisper_state` for 1 common context) a known limitation?