Crischan opened 1 week ago
In the Dockerfile, after installing CUDA and the NVIDIA driver, try installing the CUDA runtime libraries that ship with gpt4all:

```
pip install "gpt4all[cuda]"
```

Also, when loading the model, specify the device, something like this:

```python
from gpt4all import GPT4All

model = GPT4All("Meta-Llama-3-8B-Instruct.Q4_0.gguf", device="cuda")
```
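As a quick sanity check that the CUDA backend loaded at all, the gpt4all Python bindings also expose a GPU listing helper. A minimal sketch, assuming a recent gpt4all release where `GPT4All.list_gpus()` is available:

```python
from gpt4all import GPT4All

# If the CUDA runtime libraries were found, the available GPU devices
# (e.g. the GeForce 3060) should be listed here; an empty list or an
# error suggests the CUDA backend failed to load.
print(GPT4All.list_gpus())
```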
Bug Report
Hi, in a Docker container with CUDA 12 on Ubuntu 22.04, the NVIDIA GeForce 3060 works with LangChain (e.g. when using a local model), but the LangChain GPT4All functions from GPT4AllEmbeddings raise a warning and fall back to CPU only:
```
Failed to load libllamamodel-mainline-cuda.so: dlopen: libcudart.so.11.0: cannot open shared object file: No such file or directory
Failed to load libllamamodel-mainline-cuda-avxonly.so: dlopen: libcudart.so.11.0: cannot open shared object file: No such file or directory
```
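The dlopen failure means the loader cannot resolve the CUDA 11 runtime soname. One quick way to check which libcudart versions are resolvable inside the container (a minimal stdlib-only sketch; the sonames below are the standard CUDA runtime names):

```python
import ctypes

# ctypes.CDLL uses dlopen under the hood, so this mirrors what gpt4all's
# backend loader does when it looks for the CUDA runtime.
for soname in ("libcudart.so.12", "libcudart.so.11.0"):
    try:
        ctypes.CDLL(soname)
        print(f"{soname}: found")
    except OSError as err:
        print(f"{soname}: not found ({err})")
```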
What is the typical Linux setup for CUDA 12, and/or which packages need to be present so that the CUDA 12 functionality is recognized by GPT4All, rather than it searching for CUDA 11?
Thank you.
Example Code
```python
from langchain_chroma import Chroma
from langchain_community.embeddings import GPT4AllEmbeddings

...
model_name = "nomic-embed-text-v1.5.f16.gguf"  # "all-MiniLM-L6-v2.gguf2.f16.gguf"
gpt4all_kwargs = {"allow_download": "True"}
embeddings = GPT4AllEmbeddings(model_name=model_name, gpt4all_kwargs=gpt4all_kwargs)

...
vectorstore = Chroma(
    collection_name="rag-embeddings",
    persist_directory=persist_directory,
    embedding_function=embeddings,
)
vectorstore.reset_collection()

...
vectorstore.add_documents(documents=[some_documents], ids=[some_ids])
```
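If the CUDA libraries are in place, the device can also be requested through the LangChain wrapper. A sketch assuming a langchain_community version in which GPT4AllEmbeddings accepts a `device` field (forwarded to gpt4all's Embed4All):

```python
from langchain_community.embeddings import GPT4AllEmbeddings

# `device="cuda"` is forwarded to the underlying Embed4All; if the CUDA 12
# runtime is resolvable, this should avoid the silent CPU fallback.
embeddings = GPT4AllEmbeddings(
    model_name="nomic-embed-text-v1.5.f16.gguf",
    device="cuda",
    gpt4all_kwargs={"allow_download": "True"},
)
```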
Steps to Reproduce
Expected Behavior
There should be no warnings; CUDA 12 should be recognized and used by gpt4all.
Your Environment