NixOS / nixpkgs

Nix Packages collection & NixOS
MIT License

`ollama` fails to launch CUDA runner with `libcublas.so` not found #342385

Open magneticflux- opened 2 days ago

magneticflux- commented 2 days ago

Describe the bug

Running ollama fails when launching the cuda_v12 runner:

/tmp/ollama1145927013/runners/cuda_v12/ollama_llama_server: error while loading shared libraries: libcublas.so.12: cannot open shared object file: No such file or directory

Steps To Reproduce

Steps to reproduce the behavior:

  1. Install ollama with nixpkgs.config.cudaSupport = true;
  2. Launch the server
  3. Run a model and observe that it fails
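
For reference, a configuration along these lines should match the setup described in the steps above. This is only a minimal sketch, not my actual configuration: it assumes a NixOS system where the proprietary NVIDIA driver is already configured elsewhere, and that ollama is installed through `environment.systemPackages` (an assumption made for illustration).

```nix
# configuration.nix (sketch; assumes the NVIDIA driver is set up elsewhere)
{ pkgs, ... }:
{
  # Build CUDA-enabled variants of packages, including ollama.
  nixpkgs.config.cudaSupport = true;
  # The CUDA libraries are unfree, so unfree packages must be allowed.
  nixpkgs.config.allowUnfree = true;

  # Assumption: ollama is installed system-wide rather than per user.
  environment.systemPackages = [ pkgs.ollama ];
}
```

With that in place, starting the server with `ollama serve` and then running any model with `ollama run` should hit the error above.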

Expected behavior

Either /run/opengl-driver/lib should contain libcublas.so.12, or the ollama wrapper should add the directory containing libcublas.so to LD_LIBRARY_PATH.

Additional context

When starting, ollama also prints this warning: level=WARN source=gpu.go:669 msg="unable to locate gpu dependency libraries".

I have 2x Quadro P5000s and use the closed-source NVIDIA drivers, since the open ones don't support them.

Notify maintainers

@abysssol @dit7ya @elohmeier @RoyDubnium

Metadata


❯ nix-shell -p nix-info --run "nix-info -m"
 - system: `"x86_64-linux"`
 - host os: `Linux 6.10.9-xanmod1, NixOS, 24.11 (Vicuna), 24.11.20240916.dirty`
 - multi-user?: `yes`
 - sandbox: `yes`
 - version: `nix-env (Lix, like Nix) 2.92.0-dev-pre20240813-f9a3bf6
System type: x86_64-linux
Additional system types: i686-linux, x86_64-v1-linux, x86_64-v2-linux, x86_64-v3-linux, x86_64-v4-linux
Features: gc, signed-caches
System configuration file: /etc/nix/nix.conf
User configuration files: /home/mitchell/.config/nix/nix.conf:/etc/xdg/nix/nix.conf:/home/mitchell/.nix-profile/etc/xdg/nix/nix.conf:/nix/profile/etc/xdg/nix/nix.conf:/home/mitchell/.local/state/nix/profile/etc/xdg/nix/nix.conf:/etc/profiles/per-user/mitchell/etc/xdg/nix/nix.conf:/nix/var/nix/profiles/default/etc/xdg/nix/nix.conf:/run/current-system/sw/etc/xdg/nix/nix.conf
Store directory: /nix/store
State directory: /nix/var/nix
Data directory: /nix/store/3g6dv1l653bmhahjy7s2kbfdngbdv95m-lix-2.92.0-dev-pre20240813-f9a3bf6/share`
 - channels(root): `"nixos"`
 - nixpkgs: `/nix/store/bvr0ffimf1a2fwq0cgnnvin94n8xwb3q-ix4ysi12wciad08x6vskj6gq3r7i266a-source`


adamcstephens commented 2 days ago

I'm not actually a maintainer of ollama, but there are three others who should have received the ping.

That said, I wonder if this is fixed by https://github.com/NixOS/nixpkgs/pull/342127

magneticflux- commented 2 days ago

Sorry for the ping, I saw your PR and thought the same thing. I've tried with and without the changes from that PR and it doesn't fix the issue.

magneticflux- commented 2 days ago

This awful hack fixes the issue for me:

diff --git a/pkgs/by-name/ol/ollama/package.nix b/pkgs/by-name/ol/ollama/package.nix
index c1451d42faae..5f357e6035f9 100644
--- a/pkgs/by-name/ol/ollama/package.nix
+++ b/pkgs/by-name/ol/ollama/package.nix
@@ -102,7 +102,7 @@ let
       # these llama-cpp binaries are unaffected by the ollama binary's DT_RUNPATH
       # LD_LIBRARY_PATH is temporarily required to use the gpu
       # until these llama-cpp binaries can have their runpath patched
-      "--suffix LD_LIBRARY_PATH : '${addDriverRunpath.driverLink}/lib'"
+      "--suffix LD_LIBRARY_PATH : '${addDriverRunpath.driverLink}/lib:${lib.getLib cudaPackages.libcublas}/lib'"
     ]
     ++ lib.optionals enableRocm [
       "--suffix LD_LIBRARY_PATH : '${rocmPath}/lib'"
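
The same idea can probably be expressed as an overlay, without patching nixpkgs itself. This is an untested sketch: it assumes the package's `postFixup` is where the `ollama` binary gets wrapped and that `makeWrapper` is available during the build, and it simply wraps the binary a second time to append the libcublas directory.

```nix
# Overlay sketch (hypothetical and untested).
final: prev: {
  ollama = prev.ollama.overrideAttrs (old: {
    # Re-wrap the installed binary after the package's own fixup steps,
    # appending the libcublas library directory to LD_LIBRARY_PATH.
    postFixup = (old.postFixup or "") + ''
      wrapProgram "$out/bin/ollama" \
        --suffix LD_LIBRARY_PATH : "${prev.lib.getLib final.cudaPackages.libcublas}/lib"
    '';
  });
}
```

It would be added to `nixpkgs.overlays`. Like the diff, this only works around the missing library path and does not patch the runpath of the runner binaries.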

adamcstephens commented 1 day ago

I wonder if what my change addresses could also be resolved by adding to LD_LIBRARY_PATH. I'll try to do some testing in the next couple of days.