Open petterreinholdtsen opened 8 months ago
CLBlast: OpenCL error: clEnqueueNDRangeKernel: -4
"-4" looks like an "out of memory." https://github.com/KhronosGroup/OpenCL-Headers/blob/59452533d2afa817bc2dc0da4f783097f4cdbcb0/CL/cl.h#L200
> It is unclear to me how to debug this. Any clues to spare?
Perhaps you can try smaller models or larger memory.
[Tamotsu Takahashi]
> "-4" looks like an "out of memory."
Aha. Perhaps time to rewrite the code to give more useful error messages.
The GPU only has 1 GiB of memory, so OOM sounds likely.
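As a rough illustration of what a more useful error message could look like, the sketch below maps the numeric OpenCL status to its symbolic name from the Khronos CL/cl.h header. The helper name `cl_error_name` and the table are hypothetical and are not part of CLBlast or whisper.cpp. Only the first few codes are listed. Note that -4 is CL_MEM_OBJECT_ALLOCATION_FAILURE, which is distinct from CL_OUT_OF_HOST_MEMORY (-6).

```c
#include <stddef.h>

/* Numeric values as defined in the Khronos CL/cl.h header.
 * Only a subset is listed; extend the table as needed. */
struct cl_error_entry {
    int code;
    const char *name;
};

static const struct cl_error_entry cl_error_table[] = {
    {  0, "CL_SUCCESS" },
    { -1, "CL_DEVICE_NOT_FOUND" },
    { -2, "CL_DEVICE_NOT_AVAILABLE" },
    { -3, "CL_COMPILER_NOT_AVAILABLE" },
    { -4, "CL_MEM_OBJECT_ALLOCATION_FAILURE" },
    { -5, "CL_OUT_OF_RESOURCES" },
    { -6, "CL_OUT_OF_HOST_MEMORY" },
};

/* Hypothetical helper (not an existing CLBlast or whisper.cpp function):
 * returns the symbolic name for an OpenCL status code, or a fallback
 * string when the code is not in the table. */
static const char *cl_error_name(int err)
{
    size_t n = sizeof(cl_error_table) / sizeof(cl_error_table[0]);
    for (size_t i = 0; i < n; i++) {
        if (cl_error_table[i].code == err) {
            return cl_error_table[i].name;
        }
    }
    return "unknown OpenCL error";
}
```

An error path could then print something like `fprintf(stderr, "OpenCL error: %s: %d (%s)\n", "clEnqueueNDRangeKernel", err, cl_error_name(err));` so that the reader sees the symbolic name next to the bare number.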
> Perhaps you can try smaller models or larger memory.
Is there a way to do what the 'main' program does, and offload only as much of the processing to the GPU as there is room for? I had to limit it to 4 layers to get 'main' working.
--
Happy hacking
Petter Reinholdtsen
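This thread does not establish whether talk-llama exposes a layer limit. As a back-of-the-envelope aid for choosing one, the sketch below estimates how many layers fit into a given amount of GPU memory. The function name `layers_that_fit` and all parameters are hypothetical. The per-layer size and the reserved amount have to be measured or guessed for the actual model and driver; the code does not derive them.

```c
#include <stdint.h>

/* Hypothetical helper: estimate how many model layers fit in GPU memory.
 *   vram_bytes      total GPU memory
 *   reserved_bytes  memory assumed to be taken by the display, driver,
 *                   scratch buffers and so on (an assumption, not measured)
 *   bytes_per_layer assumed size of one offloaded layer
 *   n_layers        number of layers in the model
 * Returns a value between 0 and n_layers. */
static int layers_that_fit(uint64_t vram_bytes, uint64_t reserved_bytes,
                           uint64_t bytes_per_layer, int n_layers)
{
    if (n_layers <= 0 || bytes_per_layer == 0 || vram_bytes <= reserved_bytes) {
        return 0;
    }
    uint64_t fit = (vram_bytes - reserved_bytes) / bytes_per_layer;
    return fit >= (uint64_t)n_layers ? n_layers : (int)fit;
}
```

With placeholder values chosen purely for illustration (1 GiB of GPU memory, 256 MiB reserved, 160 MiB per layer, 32 layers), `layers_that_fit(1024ull << 20, 256ull << 20, 160ull << 20, 32)` returns 4. These numbers are not measurements of the models mentioned in this thread.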
When trying to run the talk-llama example code with OpenCL enabled using an NVIDIA GeForce GT 755M, I get the following crash:
The build was done using
cmake -Bobj-x86_64-linux-gnu -S. -DWHISPER_CLBLAST=ON -DWHISPER_SDL2=1 -DCMAKE_BUILD_TYPE=Debug
and the models were fetched from https://huggingface.co/NbAiLab/nb-whisper-large/resolve/main/ggml-model.bin and https://huggingface.co/TheBloke/dolphin-2.2.1-mistral-7B-GGUF/resolve/main/dolphin-2.2.1-mistral-7b.Q5_K_M.gguf?download=true .
It is unclear to me how to debug this. Any clues to spare?