[Open] bluenevus opened this issue 5 months ago
Hi, are you running the NVIDIA Container Toolkit from https://github.com/NVIDIA/nvidia-container-toolkit ?
When using Kairos I ran into https://github.com/allenporter/k8s-gitops/issues/1410 and had to work around it by applying https://github.com/allenporter/k8s-gitops/blob/3c818e6811902ecfb9f070ccb74760792aaf9033/bootstrap/kairos/nvidia/100_nvidia.yaml#L2
I'm definitely running the toolkit. It's working for other things like Ollama; I just can't get this to work.
OK -- this is the HelmRelease I am currently using: https://github.com/allenporter/k8s-gitops/blob/main/kubernetes/ml/prod/llama-cublas/release.yaml
To run with a GPU you need to use:
docker run --gpus all -it
When I try to run your ghcr.io/allenporter/llama-cpp-server-cuda:main image that way, I get this error:
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.
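That libnvidia-ml.so.1 failure usually means the NVIDIA driver libraries aren't visible to the container runtime hook, since libnvidia-ml.so.1 ships with the driver itself, not with the toolkit. A rough sketch of host-side checks that often turn up the cause (the CUDA image tag is illustrative; pick one matching your installed driver):

```shell
# Is the driver's NVML library registered on the host?
ldconfig -p | grep libnvidia-ml

# Re-register the NVIDIA runtime with Docker and restart the daemon
# (writes the runtime entry into /etc/docker/daemon.json).
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker

# Sanity-check GPU passthrough with a minimal CUDA base image.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```

If the last command prints the nvidia-smi table, the toolkit wiring is fine and the problem is specific to the llama-cpp-server-cuda image or how it is launched.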