Jul 14 01:50:26 Nix ollama[1087239]: time=2024-07-14T01:50:26.372+03:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 cuda_v12]"
Jul 14 01:50:26 Nix ollama[1087239]: time=2024-07-14T01:50:26.473+03:00 level=INFO source=types.go:98 msg="inference compute" id=GPU-0a730a3d-f915-32f2-19e0-bef38f6809b4 library=cuda compute=8.6 driver=12.5 name="NVIDIA GeForce RTX 3090" total="23.8 GiB" available="22.3 GiB"
Try changing your settings to something like this:
environmentVariables = {
  OLLAMA_LLM_LIBRARY = "cuda";
  LD_LIBRARY_PATH = "/run/opengl-driver/lib";
};
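If it still falls back to CPU after that, it may also be worth sanity-checking that the driver libraries actually exist at that path (assuming a standard NixOS nvidia setup):
ls /run/opengl-driver/lib | grep -i 'nvidia\|cuda'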
@VeilSilence are you on unstable or nixos-24?
Unstable.
@VeilSilence just switched my configs to use unstable, still the same.. :(
mind sharing your configs?
Maybe you should try setting services.xserver.videoDrivers = ["nvidia"]? It seems to influence services.xserver.drivers, which could potentially be read by other NixOS modules even if you don't use xserver.
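For example, something like this (a rough sketch for 24.05; hardware.opengl has since been renamed to hardware.graphics on unstable):
services.xserver.videoDrivers = [ "nvidia" ];
hardware.opengl.enable = true;
hardware.nvidia.modesetting.enable = true;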
Have you tried a minimal ollama config with just the following? Maybe host or one of the env vars is actually interfering?
services.ollama = {
  enable = true;
  acceleration = "cuda";
};
Also, if you just want to use unstable ollama, I would recommend the following on a base of stable nixos-24.05, where unstable is imported from a separate nixpkgs-unstable flake input (flake sketch below):
services.ollama = {
  enable = true;
  acceleration = "cuda";
  package = unstable.ollama;
};
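For reference, a minimal sketch of how that unstable argument might be wired up in flake.nix (names like nixpkgs-unstable, unstable, and myhost are placeholders, and passing it via specialArgs is just one way to do it):
# flake.nix (sketch)
inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";
inputs.nixpkgs-unstable.url = "github:NixOS/nixpkgs/nixos-unstable";

outputs = { self, nixpkgs, nixpkgs-unstable, ... }: {
  nixosConfigurations.myhost = nixpkgs.lib.nixosSystem {
    system = "x86_64-linux";
    # expose `unstable` as an extra module argument
    specialArgs.unstable = import nixpkgs-unstable {
      system = "x86_64-linux";
      config.allowUnfree = true; # cuda packages are unfree
    };
    modules = [ ./configuration.nix ];
  };
};
Your configuration.nix then takes unstable as a module argument, i.e. { unstable, ... }: { ... }.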
Also, are you aware that your enableNetowrking option is misspelled, with the wo swapped to ow? As is homeMangerVersion, with the nag dropping the a to ng.
@abysssol thank you for catching those (embarrassing). I figured out the issue... I never removed the old GPU I had in this machine (just removed the power from it). Looks like properly taking it out fixed my issues.
@asosnovsky @abysssol I have an RTX 3060 as well, and Ollama isn't using my GPU even though it is being detected! I don't have any other GPUs, and the iGPU of my CPU (Intel i5-12600) is disabled in the BIOS.
My config:
ollama = {
  enable = true;
  acceleration = "cuda";
  environmentVariables = {
    OLLAMA_LLM_LIBRARY = "cuda";
    LD_LIBRARY_PATH = "/run/opengl-driver/lib";
  };
};
I am getting the same 100% CPU usage from Ollama as at the top of this issue.
Journalctl logs:
Jul 10 09:43:05 nixos ollama[1989]: time=2024-07-10T09:43:05.270+10:00 level=INFO source=routes.go:1111 msg="Listening on 127.0.0.1:11434 (version 0.1.47)"
Jul 10 09:43:05 nixos ollama[1989]: time=2024-07-10T09:43:05.271+10:00 level=INFO source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama3450513807/runners
Jul 10 09:43:09 nixos ollama[1989]: time=2024-07-10T09:43:09.081+10:00 level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cuda_v12 cpu cpu_avx cpu_avx2]"
Jul 10 09:43:09 nixos ollama[1989]: time=2024-07-10T09:43:09.148+10:00 level=INFO source=types.go:98 msg="inference compute" id=GPU-6814f4a1-a623-257e-fdcf-1da7dbff1e59 library=cuda compute=8.6 driver=12.2 name="NVIDIA GeForce RTX 3060" total="11.8 GiB" available="11.6 GiB"
Honestly, after I did another nix flake update about a week ago, this stopped working for me again. I think that because ollama is still niche, the developer working on this may not have had the proper time to support it well. I managed to get this working properly by doing an nvidia passthrough to docker (which is more widely used and supported) and then spinning up ollama with the official docker image.
See my container config https://github.com/asosnovsky/nixos-setup/blob/753f21b3fe95b5d0259c37a5186900fd19bfa5c1/hosts/hl-bigbox1.nix#L18
And the docker nvidia passthrough https://github.com/asosnovsky/nixos-setup/blob/753f21b3fe95b5d0259c37a5186900fd19bfa5c1/hosts/hl-bigbox1.nix#L6
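For anyone who doesn't want to dig through those links, the rough shape of that setup is something like the following (a sketch, assuming docker as the container backend; virtualisation.docker.enableNvidia is the passthrough option on 24.05):
# enable docker with nvidia GPU passthrough
virtualisation.docker.enable = true;
virtualisation.docker.enableNvidia = true;

# run the official ollama image as a declarative container
virtualisation.oci-containers = {
  backend = "docker";
  containers.ollama = {
    image = "ollama/ollama:latest";
    ports = [ "11434:11434" ];
    volumes = [ "ollama:/root/.ollama" ]; # persist downloaded models
    extraOptions = [ "--gpus=all" ];
  };
};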
Thanks @asosnovsky
I'll give that docker container setup a try!
Edit:
Works perfectly, thank you. I just had to reinstall my models with docker exec ollama ollama pull my-model
You can install the ollama CLI via pkgs.ollama and set the OLLAMA_HOST env var to point at the container. That way you can manage it with the standard CLI tool (or any GUI).
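For example (a sketch; 127.0.0.1:11434 assumes the default port mapping from the container setup above):
environment.systemPackages = [ pkgs.ollama ];
environment.variables.OLLAMA_HOST = "127.0.0.1:11434";
After that, a plain ollama pull or ollama run on the host talks to the containerized server.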
Describe the bug
Bought this GPU specifically because it was listed at the top of the recommended hardware on ollama. Tried to enable ollama as well as the overall nvidia support, but nothing seems to be picking up the device.
Here is my config: https://github.com/asosnovsky/nixos-setup/blob/main/hosts/hl-bigbox1.nix
Here is my device info
Here is my nix info
Steps To Reproduce
Screenshots
Ollama serve logs from systemctl, and while running ollama run llama3:8b
Metadata
"x86_64-linux"
Linux 6.9.6, NixOS, 24.05 (Uakari), 24.05.20240624.fc07dc3
yes
yes
nix-env (Nix) 2.18.2
/nix/store/r5clili0iqprbn4dnngkywsgxm51a5cw-source