adfaure opened this issue 8 months ago
It seems to me that the version of rocblas in nixpkgs (5.7.1) doesn't support your GPU. As far as I can tell, this is purely a nixpkgs problem, not an upstream ollama or llama.cpp problem.
There's already an upstream issue (nixos/nixpkgs#280927) about updating to version 6, and a related pull request (nixos/nixpkgs#289187), but I'm not sure it's making progress. If it stalls for too long, I may try to update rocmPackages myself when I have enough free time, but for now your GPU will be unusable, unfortunately.
Thank you very much for your insight, as I had trouble identifying the root of the issue on my own. I will try to upgrade the package myself, as I have some spare time these days. :)
Unfortunately, the GPU is not officially supported under ROCm. Maybe an option is the Vulkan backend.
Hi, I've taken a closer look at this issue.
Currently, it seems sensible to wait for Vulkan support to be implemented in Ollama. In the meantime, I've modified the flake.nix and the build-ollama.nix in my fork to include Vulkan support, assuming we can somehow pass this option to Ollama: https://github.com/adfaure/ollama-flake/commit/2edd38fabd013ae86e1eec6acb632c09265aab42
For local testing, I had to add the option "LLAMA_VULKAN=on" at this line.
Have you tried enabling ROCm on a recent version of ollama from nixpkgs? The version in nixpkgs-unstable has been using version 6 of the ROCm libraries, which may be worth trying to see if your GPU works with them. Just update your unstable channel/flake input, and check `ollama --version` to make sure it's 0.1.31.
With nixos-unstable:

```nix
services.ollama = {
  enable = true;
  acceleration = "rocm";
};
```
With nixos-23.11 and nixpkgs-unstable:

```nix
services.ollama = {
  enable = true;
  acceleration = "rocm";
  package = unstable.ollama;
};
```
Also, maybe setting `LLAMA_VULKAN` in Nix could work?
```diff
 goBuild ((lib.optionalAttrs enableRocm {
   ROCM_PATH = rocmPath;
   CLBlast_DIR = "${clblast}/lib/cmake/CLBlast";
 }) // (lib.optionalAttrs enableCuda {
   CUDA_LIB_DIR = "${cudaToolkit}/lib";
   CUDACXX = "${cudaToolkit}/bin/nvcc";
   CUDAToolkit_ROOT = cudaToolkit;
+}) // (lib.optionalAttrs enableVulkan {
+  LLAMA_VULKAN = "on";
 }) // {
   inherit pname version src vendorHash;
```
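For that to evaluate, `enableVulkan` would also have to be declared, and the Vulkan libraries made visible to the build. A rough sketch of what the surrounding derivation might need (parameter and package names are assumptions, not the actual nixpkgs code):

```nix
# Sketch only: thread an `enableVulkan` flag through the package function.
{ lib, vulkan-headers, vulkan-loader, shaderc, enableVulkan ? false, ... }:

{
  # ...existing goBuild arguments as in the diff above...

  # llama.cpp's Vulkan backend needs the headers and loader to compile
  # and link, plus glslc (from shaderc) to build its compute shaders.
  buildInputs = lib.optionals enableVulkan [
    vulkan-headers
    vulkan-loader
    shaderc
  ];
}
```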
Oh, it looks like my iGPU (680M) is now supported (https://github.com/ROCm/ROCm/discussions/2932). I will try it, thank you.
I tried the variable `LLAMA_VULKAN`, but CMake still doesn't build llama.cpp with Vulkan. I believe gen_linux.sh needs to be updated to handle the Vulkan build, but maybe I am missing something.
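Maybe it could be patched from Nix without a full fork, something like this (completely untested, and the substituted text is a guess at what the script actually contains):

```nix
# Untested sketch: inject the Vulkan flag into ollama's generate script so
# it reaches CMake. The --replace strings are assumptions and would need to
# be checked against the vendored llm/generate/gen_linux.sh.
postPatch = ''
  substituteInPlace llm/generate/gen_linux.sh \
    --replace 'CMAKE_DEFS="' 'CMAKE_DEFS="-DLLAMA_VULKAN=on '
'';
```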
I will try soon, and I guess that if it works I can close this issue, since it is specific to ROCm. It might be worth opening a dedicated issue for Vulkan?
According to the linked discussion, you should set `HSA_OVERRIDE_GFX_VERSION` so that ROCm treats the 680M's gfx1035 target as the supported gfx1030:
```nix
services.ollama = {
  enable = true;
  acceleration = "rocm";
  environmentVariables = {
    HSA_OVERRIDE_GFX_VERSION = "10.3.0";
  };
};
```
> It might be worth opening a dedicated issue for Vulkan?
You might consider opening such an issue on the nixpkgs repo. If you do, please ping me in it.
Hello, I don't know if this is the best place for this issue (maybe it belongs in the official ollama repository). My laptop has an AMD Radeon™ 680M iGPU.
I am using the flake to build ollama with ROCm support.
When I try to generate some tokens via the API (I tried with mistral and phi), I get the following trace:
Do you see any obvious issue?
Thank you very much.