allenporter / llama-cpp-server

Docker images for easier running of llama-cpp-python server
Apache License 2.0

nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: #85

Open bluenevus opened 5 months ago

bluenevus commented 5 months ago

When I try to use your llama-cpp-server-cuda:main image, I get this error:

ghcr.io/allenporter/llama-cpp-server-cuda:main docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.

allenporter commented 5 months ago

Hi, are you running the NVIDIA Container Toolkit from https://github.com/NVIDIA/nvidia-container-toolkit ?

When using Kairos, I had to do this: https://github.com/allenporter/k8s-gitops/issues/1410, using this config: https://github.com/allenporter/k8s-gitops/blob/3c818e6811902ecfb9f070ccb74760792aaf9033/bootstrap/kairos/nvidia/100_nvidia.yaml#L2
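The error says nvidia-container-cli could not load libnvidia-ml.so.1, the NVML library that ships with the NVIDIA driver. A quick host-side check is to ask the dynamic linker whether it can resolve that library. This is only a rough sketch: the check_lib helper is hypothetical (not part of any NVIDIA tool), and it assumes a glibc-based Linux host where ldconfig -p is available.

```shell
#!/bin/sh
# Hypothetical helper: report whether the host linker cache lists a library.
# Prints "found" or "missing". Assumes glibc's ldconfig (supports -p).
PATH="$PATH:/sbin:/usr/sbin"   # ldconfig is often outside a normal user's PATH

check_lib() {
    lib="$1"
    if command -v ldconfig >/dev/null 2>&1 \
        && ldconfig -p 2>/dev/null | grep -q -F "$lib"; then
        echo "found"
    else
        echo "missing"
    fi
}

# The library named in the error message above.
echo "libnvidia-ml.so.1: $(check_lib libnvidia-ml.so.1)"
```

If this prints "missing", the driver libraries are probably not visible to the host linker, which would be consistent with the error. A "found" result does not rule out other causes, such as the container runtime configuration.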

bluenevus commented 5 months ago

Definitely running the toolkit. It's working for other things like ollama; I just can't get this image to work.

allenporter commented 5 months ago

OK -- this is the HelmRelease I am currently using: https://github.com/allenporter/k8s-gitops/blob/main/kubernetes/ml/prod/llama-cublas/release.yaml

kansoftware commented 1 week ago

To run with the GPU, you need to use: docker run --gpus all -it
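Spelled out against the image from this issue, the suggestion might look like the sketch below. The gpu_run_cmd helper is hypothetical and only prints the command instead of running it. The thread does not say which extra options (ports, model configuration) the image needs, so none are shown.

```shell
#!/bin/sh
# Image name taken from the original report in this issue.
IMAGE="ghcr.io/allenporter/llama-cpp-server-cuda:main"

# Hypothetical helper: build the suggested command line for a given image.
# --gpus all asks Docker to expose all host GPUs to the container.
gpu_run_cmd() {
    echo "docker run --gpus all -it $1"
}

# Print the command so it can be reviewed before running it by hand.
gpu_run_cmd "$IMAGE"
```

This prints docker run --gpus all -it ghcr.io/allenporter/llama-cpp-server-cuda:main, which can then be pasted into a shell on a host where Docker and the NVIDIA Container Toolkit are installed.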