abysssol / ollama-flake

A nix flake for https://github.com/ollama/ollama

No such file or directory for GPU arch : gfx1035 #5

Open adfaure opened 8 months ago

adfaure commented 8 months ago

Hello, I'm not sure if this is the best place for this issue (it may belong in the official ollama repository). My laptop has an iGPU, an AMD Radeon™ 680M.

I am using the flake to build ollama with ROCm support:

{
  description = "learning";

  inputs = {
    nixpkgs.url = "github:NixOS/nixpkgs/master";
    ollama.url = "github:abysssol/ollama-flake";
    flake-utils.url = "github:numtide/flake-utils";
  };

  outputs = { self, nixpkgs, ollama, flake-utils }:

  flake-utils.lib.eachDefaultSystem (system:
    let
      pkgs = import nixpkgs {
        inherit system;
        config.allowUnfree = true;
      };
      ollama-rocm = ollama.packages.${system}.rocm;
    in {

      devShell = pkgs.mkShell {
        buildInputs = with pkgs; [
          valgrind
          poetry
          ruff
          stdenv.cc.cc.lib
          ollama-rocm
        ];
        LD_LIBRARY_PATH = "${pkgs.stdenv.cc.cc.lib}/lib";
      };
  });
}

When I try to generate tokens via the API (I tried with mistral and phi), I get the following trace:

> ollama serve                                                                                                        nix-shell-env
time=2024-02-25T20:58:47.533+01:00 level=INFO source=images.go:710 msg="total blobs: 19"
time=2024-02-25T20:58:47.533+01:00 level=INFO source=images.go:717 msg="total unused blobs removed: 0"
time=2024-02-25T20:58:47.534+01:00 level=INFO source=routes.go:1019 msg="Listening on 127.0.0.1:11434 (version 0.1.26)"
time=2024-02-25T20:58:47.534+01:00 level=INFO source=payload_common.go:107 msg="Extracting dynamic libraries..."
time=2024-02-25T20:58:47.799+01:00 level=INFO source=payload_common.go:146 msg="Dynamic LLM libraries [cpu cpu_avx2 rocm cpu_avx]"
time=2024-02-25T20:58:47.799+01:00 level=INFO source=gpu.go:94 msg="Detecting GPU type"
time=2024-02-25T20:58:47.799+01:00 level=INFO source=gpu.go:265 msg="Searching for GPU management library libnvidia-ml.so"
time=2024-02-25T20:58:47.799+01:00 level=INFO source=gpu.go:311 msg="Discovered GPU libraries: []"
time=2024-02-25T20:58:47.799+01:00 level=INFO source=gpu.go:265 msg="Searching for GPU management library librocm_smi64.so"
time=2024-02-25T20:58:47.799+01:00 level=INFO source=gpu.go:311 msg="Discovered GPU libraries: [/nix/store/0x1y6by0mjcm1gn91rdn0bq5bh0f6l1i-rocm-smi-5.7.1/lib/librocm_smi64.so.5.0]"
time=2024-02-25T20:58:47.803+01:00 level=INFO source=gpu.go:109 msg="Radeon GPU detected"
time=2024-02-25T20:58:47.803+01:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
time=2024-02-25T20:58:58.038+01:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
time=2024-02-25T20:58:58.039+01:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
time=2024-02-25T20:58:58.039+01:00 level=INFO source=cpu_common.go:11 msg="CPU has AVX2"
loading library /tmp/nix-shell.YHbFgT/ollama2810121158/rocm/libext_server.so
time=2024-02-25T20:58:58.109+01:00 level=INFO source=dyn_ext_server.go:90 msg="Loading Dynamic llm server: /tmp/nix-shell.YHbFgT/ollama2810121158/rocm/libext_server.so"
time=2024-02-25T20:58:58.109+01:00 level=INFO source=dyn_ext_server.go:150 msg="Initializing llama server"

rocBLAS error: Cannot read /nix/store/ialcylww20hrrzy91agi6ncqrk528fzs-rocblas-5.7.1/lib/rocblas/library/TensileLibrary.dat: No such file or directory for GPU arch : gfx1035
 List of available TensileLibrary Files :
"/nix/store/ialcylww20hrrzy91agi6ncqrk528fzs-rocblas-5.7.1/lib/rocblas/library/TensileLibrary_lazy_gfx940.dat"
"/nix/store/ialcylww20hrrzy91agi6ncqrk528fzs-rocblas-5.7.1/lib/rocblas/library/TensileLibrary_lazy_gfx1100.dat"
"/nix/store/ialcylww20hrrzy91agi6ncqrk528fzs-rocblas-5.7.1/lib/rocblas/library/TensileLibrary_lazy_gfx908.dat"
"/nix/store/ialcylww20hrrzy91agi6ncqrk528fzs-rocblas-5.7.1/lib/rocblas/library/TensileLibrary_lazy_gfx1030.dat"
"/nix/store/ialcylww20hrrzy91agi6ncqrk528fzs-rocblas-5.7.1/lib/rocblas/library/TensileLibrary_lazy_gfx90a.dat"
"/nix/store/ialcylww20hrrzy91agi6ncqrk528fzs-rocblas-5.7.1/lib/rocblas/library/TensileLibrary_lazy_gfx942.dat"
"/nix/store/ialcylww20hrrzy91agi6ncqrk528fzs-rocblas-5.7.1/lib/rocblas/library/TensileLibrary_lazy_gfx906.dat"
"/nix/store/ialcylww20hrrzy91agi6ncqrk528fzs-rocblas-5.7.1/lib/rocblas/library/TensileLibrary_lazy_gfx900.dat"
"/nix/store/ialcylww20hrrzy91agi6ncqrk528fzs-rocblas-5.7.1/lib/rocblas/library/TensileLibrary_lazy_gfx1102.dat"
"/nix/store/ialcylww20hrrzy91agi6ncqrk528fzs-rocblas-5.7.1/lib/rocblas/library/TensileLibrary_lazy_gfx941.dat"
"/nix/store/ialcylww20hrrzy91agi6ncqrk528fzs-rocblas-5.7.1/lib/rocblas/library/TensileLibrary_lazy_gfx1101.dat"
"/nix/store/ialcylww20hrrzy91agi6ncqrk528fzs-rocblas-5.7.1/lib/rocblas/library/TensileLibrary_lazy_gfx803.dat"
[1]    409190 IOT instruction (core dumped)  ollama serve

Do you see any obvious issue?

Thank you very much.

abysssol commented 8 months ago

It seems to me that the version of rocblas in nixpkgs (5.7.1) doesn't support your GPU. As far as I can tell, this is purely a nixpkgs problem, not an upstream ollama or llama.cpp problem.

There's already an upstream issue, nixos/nixpkgs#280927, about updating to version 6, and a related pull request, nixos/nixpkgs#289187, but I'm not sure whether it's making progress. If it stalls for too long, I may try to update rocmPackages myself when I have enough free time, but for now your GPU will be unusable, unfortunately.
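
In the meantime, if you want to experiment yourself, here's an untested sketch of what forcing a gfx1035 build might look like. It assumes the nixpkgs rocblas derivation accepts a gpuTargets argument (I haven't verified that it does, or that Tensile can target gfx1035 at all):

# Untested sketch; gpuTargets is an assumption about the rocblas
# derivation's interface, and building the Tensile kernels from
# source takes a very long time.
let
  pkgs = import <nixpkgs> { };
in
pkgs.rocmPackages.rocblas.override {
  gpuTargets = [ "gfx1035" ];
}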

adfaure commented 8 months ago

Thank you very much for your insight; I had trouble identifying the root of the issue on my own. I will try to upgrade the package myself, as I have some spare time these days. :)

Iron-Bound commented 8 months ago

Unfortunately, the GPU is not officially supported under ROCm. Maybe an option is the Vulkan backend?

adfaure commented 6 months ago

Hi, I've taken a closer look at this issue.

Currently, it seems sensible to wait for Vulkan support to be implemented in Ollama. In the meantime, I've modified flake.nix and build-ollama.nix in my fork to include Vulkan support, assuming we can somehow pass this option through to Ollama: https://github.com/adfaure/ollama-flake/commit/2edd38fabd013ae86e1eec6acb632c09265aab42

For local testing, I had to add the option "LLAMA_VULKAN=on" at this line.

abysssol commented 6 months ago

Have you tried enabling ROCm with a recent version of ollama from nixpkgs? The version in nixpkgs-unstable has been using version 6 of the ROCm libraries, which may be worth trying to see if your GPU works with them. Just update your unstable channel/flake input, and check ollama --version to make sure it's 0.1.31.

With nixos-unstable:

services.ollama = {
  enable = true;
  acceleration = "rocm";
};

With nixos-23.11 and nixpkgs-unstable:

services.ollama = {
  enable = true;
  acceleration = "rocm";
  package = unstable.ollama;
};
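
(Here unstable is assumed to already be in scope; one minimal sketch of defining it, via a channel added under the name unstable:)

# Sketch: assumes the unstable channel was added first with
#   nix-channel --add https://nixos.org/channels/nixpkgs-unstable unstable
{ config, pkgs, ... }:
let
  unstable = import <unstable> { };
in {
  services.ollama = {
    enable = true;
    acceleration = "rocm";
    package = unstable.ollama;
  };
}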

Also, maybe setting LLAMA_VULKAN in nix could work?

 goBuild ((lib.optionalAttrs enableRocm {
   ROCM_PATH = rocmPath;
   CLBlast_DIR = "${clblast}/lib/cmake/CLBlast";
 }) // (lib.optionalAttrs enableCuda {
   CUDA_LIB_DIR = "${cudaToolkit}/lib";
   CUDACXX = "${cudaToolkit}/bin/nvcc";
   CUDAToolkit_ROOT = cudaToolkit;
+}) // (lib.optionalAttrs enableVulkan {
+  LLAMA_VULKAN = "on";
 }) // {
   inherit pname version src vendorHash;
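
(enableVulkan here would be a new, hypothetical argument to build-ollama.nix, mirroring the existing enableRocm/enableCuda flags; the pattern is just lib.optionalAttrs merging conditional attributes, e.g.:)

# Minimal illustration of the lib.optionalAttrs pattern used above;
# enableVulkan is a hypothetical flag, not an existing argument.
let
  lib = (import <nixpkgs> { }).lib;
  enableVulkan = true;
in
{ } // lib.optionalAttrs enableVulkan { LLAMA_VULKAN = "on"; }
# evaluates to { LLAMA_VULKAN = "on"; } when enableVulkan is true,
# and to { } when it is false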

adfaure commented 6 months ago

Oh, it looks like my iGPU (680M) is now supported (https://github.com/ROCm/ROCm/discussions/2932). I will try it, thank you.

I tried the LLAMA_VULKAN variable, but cmake still doesn't build llama.cpp with Vulkan. I believe gen_linux.sh needs to be updated to handle a Vulkan build, but maybe I am missing something.

adfaure commented 6 months ago

I will try that soon, and if it works, I guess I can close this issue, since it is specific to ROCm.

It might be worth opening a dedicated issue for Vulkan?

abysssol commented 6 months ago

According to the linked discussion, you should set HSA_OVERRIDE_GFX_VERSION:

services.ollama = {
  enable = true;
  acceleration = "rocm";
  environmentVariables = {
    HSA_OVERRIDE_GFX_VERSION = "10.3.0";
  };
};
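
If you're testing in a dev shell like the one at the top of this thread rather than through the NixOS module, mkShell exports plain string attributes as environment variables, so a minimal sketch of the same override would be:

# Sketch for the dev-shell setup from the first comment: mkShell turns
# this attribute into an environment variable inside the shell.
devShell = pkgs.mkShell {
  buildInputs = [ ollama-rocm ];
  HSA_OVERRIDE_GFX_VERSION = "10.3.0";
};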

> It might be worth opening a dedicated issue for Vulkan?

You might consider opening such an issue on the nixpkgs repo. If you do, ping me in it.