Open ggerganov opened 1 year ago
Hey @ggerganov. I've done some digging. Objective-C is a headache, but it's necessary because Apple requires it for using Metal. Unreal Engine is C++, and it uses a C++ wrapper library that avoids Objective-C entirely. This is the library, by naleksiev:
https://github.com/naleksiev/mtlpp/blob/master/LICENSE
Refactoring ggml-metal.m and the related files to use this library would cut out Objective-C, simplify the code base, and squash any bugs related to using Objective-C. It would also likely fix the numerous kernel-loading bugs on Macs with AMD GPUs. This change should let Mac users utilize the GPU on whatever hardware they have; it shouldn't make a difference whether it's M1, M2, or AMD.
The mtlpp library has been tried and tested with Unreal Engine, so it will probably do the heavy lifting without too much pain.
✨✨ Here's an AI-assisted sketch of how you might approach this issue saved by @ggerganov using Copilot Workspace v0.17
Playing with the tech preview of "Copilot Workspaces": https://copilot-workspace.githubnext.com/ggerganov/llama.cpp/issues/3229?shareId=9c38fc11-f7d8-45b7-b1bc-81678a27a9e0
It does not like big files 😢
Create a struct `ggml_metal_locals` and populate it using `GGML_TENSOR_LOCALS`, similar to what we do in `ggml.c`: https://github.com/ggerganov/llama.cpp/blob/3b4bab6a38502d9e68587c2c19f26472480ec4dd/ggml.c#L244-L256
Refactor all kernels to accept a single struct of `ggml_metal_locals` in order to avoid long lists of arguments such as:
https://github.com/ggerganov/llama.cpp/blob/3b4bab6a38502d9e68587c2c19f26472480ec4dd/ggml-metal.m#L753-L782
https://github.com/ggerganov/llama.cpp/blob/3b4bab6a38502d9e68587c2c19f26472480ec4dd/ggml-metal.metal#L29-L61
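For illustration, here is a rough host-side sketch in C of what such a struct and its population could look like. Only the name `ggml_metal_locals` comes from this issue. The field layout, the `tensor_stub` type (a stand-in for the `ne`/`nb` arrays of `struct ggml_tensor`), the `METAL_LOCALS_SET` macro and the `metal_locals_from` helper are all assumptions made up for this example, not the actual ggml code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DIMS 4 /* stand-in for GGML_MAX_DIMS (assumption for this sketch) */

/* Hypothetical stand-in for the shape/stride fields of struct ggml_tensor. */
struct tensor_stub {
    int64_t ne[MAX_DIMS]; /* number of elements per dimension */
    size_t  nb[MAX_DIMS]; /* stride in bytes per dimension    */
};

/* Hypothetical layout: fixed-width fields so that the host struct and a
 * matching struct declared in the .metal source would agree byte for byte. */
struct ggml_metal_locals {
    int64_t  ne00, ne01, ne02, ne03; /* src0 shape   */
    uint64_t nb00, nb01, nb02, nb03; /* src0 strides */
    int64_t  ne10, ne11, ne12, ne13; /* src1 shape   */
    uint64_t nb10, nb11, nb12, nb13; /* src1 strides */
    int64_t  ne0,  ne1,  ne2,  ne3;  /* dst shape    */
    uint64_t nb0,  nb1,  nb2,  nb3;  /* dst strides  */
};

/* Same token-pasting idea as GGML_TENSOR_LOCALS, but assigning struct
 * fields instead of declaring local variables (hypothetical macro). */
#define METAL_LOCALS_SET(out, prefix, pointer, array) \
    do {                                              \
        (out)->prefix##0 = (pointer)->array[0];       \
        (out)->prefix##1 = (pointer)->array[1];       \
        (out)->prefix##2 = (pointer)->array[2];       \
        (out)->prefix##3 = (pointer)->array[3];       \
    } while (0)

/* Collect everything a kernel needs into one struct. src0 and src1 may be
 * NULL (their fields then stay zero); dst is required. */
struct ggml_metal_locals metal_locals_from(const struct tensor_stub *src0,
                                           const struct tensor_stub *src1,
                                           const struct tensor_stub *dst) {
    struct ggml_metal_locals l = {0};
    assert(dst != NULL);
    if (src0) {
        METAL_LOCALS_SET(&l, ne0, src0, ne);
        METAL_LOCALS_SET(&l, nb0, src0, nb);
    }
    if (src1) {
        METAL_LOCALS_SET(&l, ne1, src1, ne);
        METAL_LOCALS_SET(&l, nb1, src1, nb);
    }
    METAL_LOCALS_SET(&l, ne, dst, ne);
    METAL_LOCALS_SET(&l, nb, dst, nb);
    return l;
}
```

With something along these lines, the encoder side could presumably pass the whole struct in a single `setBytes:length:atIndex:` call instead of one call per argument, and each kernel would take one `constant` struct parameter in place of the long `ne00`, `ne01`, ... list. The struct declaration would have to be kept in sync between the host code and the `.metal` source.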