Mozilla-Ocho / llamafile

Distribute and run LLMs with a single file.
https://llamafile.ai

Is it possible for llamafile to use Vulkan or OpenCL Acceleration? #438

Open Ff-c109 opened 6 months ago

Ff-c109 commented 6 months ago

I want to use llamafile on Intel devices. If it could use Vulkan or OpenCL, I think I might be able to use an Intel GPU to accelerate it.

I have read the README, and it says llamafile supports AMD and Nvidia, but nothing about OpenCL or Vulkan. I've also noticed that not every AMD GPU has ROCm support. I looked at the documentation for the AMD Radeon Pro W6500: it only mentions OpenCL support, and I found nothing about ROCm. I wonder whether this GPU is supported.

DjagbleyEmmanuel commented 5 months ago

I have an AMD card that doesn't support ROCm, so what should I do?

Ff-c109 commented 5 months ago

I've tried recompiling llama.cpp and successfully got OpenCL support, so AMD cards should work that way. llamafile doesn't support OpenCL, but llama.cpp does. I've tried it on Linux and it works.
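For anyone who wants to reproduce this, here is a rough sketch of that kind of build, assuming a pre-removal llama.cpp tree where the OpenCL backend went through CLBlast; the exact flag and binary names have changed between releases, so check the docs for your checkout.

```sh
# Sketch: historical llama.cpp OpenCL build via CLBlast (backend since removed upstream).
# Assumes the OpenCL headers/ICD loader and CLBlast are installed on the system.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DLLAMA_CLBLAST=ON
cmake --build build --config Release
# Offload layers to the OpenCL device at run time, e.g. (binary was named "main" in older releases):
./build/bin/main -m model.gguf -ngl 33 -p "Hello"
```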

Ff-c109 commented 5 months ago

I'd like to fork this project to add OpenCL support, but that takes time.

Ff-c109 commented 5 months ago

It even works on Intel UHD cards. LoL

metal3d commented 5 months ago

FYI, OpenCL support is now deprecated in llama.cpp (a pity), and they are concentrating only on Vulkan.

The lack of Vulkan support is one of the reasons I don't use llamafile at this time. I don't want CUDA, even if it's efficient, because it is closed source. Mozilla defends open source and should (IMHO) follow what llama.cpp does.

Whatever the solution, Vulkan or CLBlast, one of them should be supported.

Ff-c109 commented 5 months ago

Tell me why!!! I love OpenCL very much

Ff-c109 commented 5 months ago

With OpenCL I can run it on any GPU, or even an FPGA. OpenCL is an open standard, but CUDA is owned by Nvidia. I just don't know why everybody is removing OpenCL support: first PyTorch, then llama.cpp. I'm pretty angry.

Ff-c109 commented 5 months ago

If it was really removed from llama.cpp, why can I still find this? https://github.com/ggerganov/llama.cpp/blob/master/README-sycl.md

metal3d commented 5 months ago

SYCL isn't direct OpenCL. It's a layer that compiles C++ to run on OpenCL devices, and it is backed by Intel. Anyway, I agree that removing OpenCL to impose Vulkan is not right; the two have different goals. OpenCL can run on devices that aren't GPUs, while Vulkan is graphics oriented and only works on GPUs.
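To make that concrete for the original Intel question: llama.cpp's SYCL backend targets Intel GPUs through oneAPI, roughly as sketched below, based on the README-sycl.md linked above; flag names have since shifted (e.g. LLAMA_SYCL vs GGML_SYCL), so treat this as illustrative.

```sh
# Sketch: building llama.cpp's SYCL backend for an Intel GPU (needs the oneAPI Base Toolkit).
source /opt/intel/oneapi/setvars.sh
cmake -B build -DLLAMA_SYCL=ON \
      -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx
cmake --build build --config Release
```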

BBC-Esq commented 3 months ago

Is there any update as to whether llamafile supports the Vulkan backend from llama.cpp?

jart commented 3 months ago

It's not on the roadmap.

lovenemesis commented 2 months ago

It would be great to see this implemented soon. sd.cpp has adopted the upstream change, and it works very well in terms of supporting a wider range of GPUs.
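Until llamafile picks this up, one possible workaround is building upstream llama.cpp with its Vulkan backend. A sketch follows; note the flag has been renamed between LLAMA_VULKAN and GGML_VULKAN across releases, and the CLI binary has also been renamed over time.

```sh
# Sketch: upstream llama.cpp with the Vulkan backend (requires Vulkan drivers and SDK).
cmake -B build -DGGML_VULKAN=ON        # older releases used -DLLAMA_VULKAN=ON
cmake --build build --config Release
# -ngl offloads layers to whatever Vulkan device is detected (AMD, Intel, Nvidia).
./build/bin/llama-cli -m model.gguf -ngl 33 -p "Hello"
```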