nomic-ai / gpt4all

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
https://nomic.ai/gpt4all
MIT License

Python Gpt4all, Docker Linux Container with Cuda 12, complains about missing Cuda 11 and uses CPU only #3033

Open · Crischan opened 1 week ago

Crischan commented 1 week ago

Bug Report

Hi, using a Docker container with CUDA 12 on Ubuntu 22.04, the NVIDIA GeForce 3060 works with LangChain (e.g. when running a local model), but the LangChain GPT4AllEmbeddings functions raise a warning and fall back to CPU only:

Failed to load libllamamodel-mainline-cuda.so: dlopen: libcudart.so.11.0: cannot open shared object file: No such file or directory
Failed to load libllamamodel-mainline-cuda-avxonly.so: dlopen: libcudart.so.11.0: cannot open shared object file: No such file or directory
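
For diagnosis, the failing lookup can be reproduced outside of GPT4All with a small ctypes sketch that shows which CUDA runtime sonames the dynamic loader can resolve inside the container (libcudart.so.12 is an assumed name for the CUDA 12 runtime):

from ctypes import CDLL

# Probe the soname from the error above plus the assumed CUDA 12 runtime
# soname, and report which ones the dynamic loader can resolve.
for soname in ("libcudart.so.11.0", "libcudart.so.12"):
    try:
        CDLL(soname)
        print(f"{soname}: found")
    except OSError as exc:
        print(f"{soname}: not found ({exc})")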

What is the typical Linux setup for CUDA 12, and which packages need to be present so that GPT4All recognizes the CUDA 12 functionality instead of searching for CUDA 11?

Thank you.

Example Code

from langchain_chroma import Chroma
from langchain_community.embeddings import GPT4AllEmbeddings
...
model_name = "nomic-embed-text-v1.5.f16.gguf"  # or "all-MiniLM-L6-v2.gguf2.f16.gguf"
gpt4all_kwargs = {"allow_download": True}
embeddings = GPT4AllEmbeddings(model_name=model_name, gpt4all_kwargs=gpt4all_kwargs)
...
vectorstore = Chroma(
    collection_name="rag-embeddings",
    persist_directory=persist_directory,
    embedding_function=embeddings,
)
vectorstore.reset_collection()
...
vectorstore.add_documents(documents=[some_documents], ids=[some_ids])

Steps to Reproduce

  1. Docker container, image nvidia/cuda:12.6.1-devel-ubuntu22.04, with GPU enabled and working
  2. Use the GPT4All Python / LangChain methods as stated above (a quick GPU visibility check is sketched below)
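
To confirm that the GPT4All backend itself can see the GPU, a quick check is sketched below; it assumes your installed gpt4all version exposes the list_gpus helper on the GPT4All class:

from gpt4all import GPT4All

# Assumption: recent gpt4all Python bindings provide GPT4All.list_gpus(),
# which returns the GPU devices the backend can use; an empty result or an
# error here matches the CPU-only fallback described above.
print(GPT4All.list_gpus())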

Expected Behavior

There should be no warnings; CUDA 12 should be recognized and used by GPT4All.

Your Environment

mbarbe commented 16 hours ago

In the Dockerfile, after installing CUDA and the NVIDIA driver, try installing the CUDA runtime libraries via the gpt4all extra: pip install "gpt4all[cuda]" (a minimal Dockerfile sketch is below).
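
A minimal Dockerfile sketch of that suggestion, assuming the base image from the reproduction steps (the [cuda] extra pulls in the NVIDIA runtime wheels that the prebuilt GPT4All backend links against):

FROM nvidia/cuda:12.6.1-devel-ubuntu22.04
# Install Python in case the base image does not ship it.
RUN apt-get update && apt-get install -y python3 python3-pip
# The [cuda] extra installs the CUDA runtime libraries the prebuilt
# gpt4all backend was linked against, resolving the libcudart lookup.
RUN pip3 install "gpt4all[cuda]"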

Also, when loading the model, specify the device, something like this:

from gpt4all import GPT4All
model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf", device="cuda")
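
For the embeddings path from the original report, the same idea would be forwarding the device through gpt4all_kwargs; this assumes GPT4AllEmbeddings hands those kwargs to Embed4All and that Embed4All accepts a device argument like GPT4All does:

from langchain_community.embeddings import GPT4AllEmbeddings

# Assumption: gpt4all_kwargs is forwarded to Embed4All, which accepts
# a device argument the same way GPT4All does.
embeddings = GPT4AllEmbeddings(
    model_name="nomic-embed-text-v1.5.f16.gguf",
    gpt4all_kwargs={"allow_download": True, "device": "cuda"},
)
print(len(embeddings.embed_query("hello")))  # should run on the GPU backend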