thewh1teagle opened this issue 10 months ago
Hi @slaren, is there a way to completely turn off OpenCL at runtime? Thanks!
Currently, there is no way to disable the GPU completely when the project is built with OpenCL support. Will think about fixing this.
In the meantime, does the information from https://github.com/ggerganov/whisper.cpp/issues/888 help in any way?
@ggerganov It doesn't help. Currently I use OpenBLAS, so at least the performance is much better than without it. I'm looking to improve it in my project vibe to get the best possible performance.
@ggerganov I am also trying to turn off GPU use to allow for background processing on the iPhone. Apologies if this is obvious, but is it possible for me to turn off the OpenCL support so that I can turn off the GPU use?
You can easily update `ggml.c` to avoid all GPU calls (CUDA, OpenCL, etc.) if a global flag is set. For example, you can add a `void ggml_gpu_set(bool enable);` call that sets a global boolean flag, and check the flag before each GPU call in `ggml.c`.
This is currently not officially supported in `ggml` because I want to figure out a better API, but as a quick workaround I think this is the only option atm.
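A minimal sketch of what that workaround could look like (the flag name and placement are assumptions, not part of the ggml API):

```c
// Hypothetical workaround sketch, not an official ggml API: a global flag
// that the application can toggle to disable GPU offloading at runtime.
#include <stdbool.h>

static bool g_ggml_gpu_enabled = true;

void ggml_gpu_set(bool enable) {
    g_ggml_gpu_enabled = enable;
}

// Then, inside ggml.c, every GPU dispatch would be guarded by the flag,
// along the lines of (illustrative only):
//
//   #if defined(GGML_USE_CUBLAS)
//       if (g_ggml_gpu_enabled && /* existing can-offload check */) {
//           // ... existing CUDA path ...
//       }
//   #endif
```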
@ggerganov I think that eventually it will be useful to have an `is_available()` function for each GPU backend (`cuda`, `coreml`, etc.)
@ggerganov Can we somehow get an `is_available()` function for each GPU platform, so we can easily decide which one to use? I just added `coreml` support to the vibe app and the performance is incredible (20x faster and even more).
Also, regarding the option to disable the GPU with `use_gpu = false`, do you have any progress / plans for it? I'm eager to add GPU support for Linux and Windows as well.
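For illustration, something along these lines is what I mean (hypothetical names and stubs, not an existing whisper.cpp/ggml API):

```c
// Hypothetical sketch only: these is_available() functions do not exist in
// whisper.cpp/ggml today; the stubs below just illustrate the idea of picking
// a backend at runtime instead of crashing when one is missing.
#include <stdbool.h>
#include <stdio.h>

static bool whisper_cuda_is_available(void)   { return false; } // stub
static bool whisper_coreml_is_available(void) { return true;  } // stub
static bool whisper_opencl_is_available(void) { return false; } // stub

int main(void) {
    if (whisper_cuda_is_available()) {
        printf("using CUDA\n");
    } else if (whisper_coreml_is_available()) {
        printf("using Core ML\n");
    } else if (whisper_opencl_is_available()) {
        printf("using OpenCL (CLBlast)\n");
    } else {
        printf("falling back to CPU\n");
    }
    return 0;
}
```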
Hi, same issue on Linux with a CUDA build: it still seems to init and use the CUDA GPU despite the `-ng` CLI argument:
./cuda/main -m ggml-base.en.bin -f samples/jfk.wav -ng
...
ggml_init_cublas: GGML_CUDA_FORCE_MMQ: no
ggml_init_cublas: CUDA_USE_TENSOR_CORES: yes
ggml_init_cublas: found 1 CUDA devices:
Device 0: Quadro RTX 3000, compute capability 7.5, VMM: yes
...
best
@WilliamTambellini this no longer happens with the CUDA backend after the sync with ggml from yesterday.
Tks @slaren Superb, I will pull and rebuild and retest. Congrats.
Tks @slaren @ggerganov 1.5.4 is already a few months old, from Jan 5th. Would you mind doing a new release? Best
I'll probably make a new one soon, yes
Is there an update regarding this feature?
Also, I can see that `ggml.c` checks whether the CPU has `avx2` / `cuda` etc. at compile time rather than at runtime using `__builtin_cpu_supports("...")`. That results in a crash, for instance, if the CPU doesn't support `avx2`. Instead, we could add assertions for all of that with better error messages, and maybe choose which GPU platform to use (not sure if possible); for instance, I would like to use `CLBlast` by default, but use CUDA if it is available on Windows.
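Something like this is what I mean by a runtime check (a minimal sketch using the GCC/Clang x86 builtin; ggml does not currently do this):

```c
// Minimal sketch of a runtime AVX2 check with a clear error message,
// instead of crashing on an illegal instruction. __builtin_cpu_supports
// is a GCC/Clang builtin available on x86.
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    __builtin_cpu_init(); // initialize CPU feature detection (GCC/Clang)

    if (!__builtin_cpu_supports("avx2")) {
        fprintf(stderr, "error: this build was compiled with AVX2, "
                        "but the current CPU does not support it\n");
        return EXIT_FAILURE;
    }

    printf("AVX2 is supported on this CPU\n");
    return EXIT_SUCCESS;
}
```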
`whisper_context_params.use_gpu = false;` doesn't work. It still tries to use OpenCL and leads to a crash (specifically in my case with OpenCL). I use it in my project vibe, and this option is very important because I want to give users the best possible speed with the GPU, but fall back to the CPU in case of error.
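For context, this is roughly how the flag is being set (a minimal sketch against the whisper.cpp 1.5.x C API; the model path is a placeholder):

```c
// Sketch of disabling the GPU via whisper_context_params.use_gpu; the report
// above is that the OpenCL build ignores this flag and still initializes the GPU.
#include "whisper.h"
#include <stdio.h>

int main(void) {
    struct whisper_context_params cparams = whisper_context_default_params();
    cparams.use_gpu = false; // expected to force CPU-only inference

    struct whisper_context * ctx =
        whisper_init_from_file_with_params("ggml-base.en.bin", cparams);
    if (ctx == NULL) {
        fprintf(stderr, "failed to init whisper context\n");
        return 1;
    }

    // ... run whisper_full() here, then clean up ...
    whisper_free(ctx);
    return 0;
}
```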