nomic-ai / gpt4all

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
https://nomic.ai/gpt4all
MIT License

CUDA 12.0 error while trying to run in CPU #2805


mohammad-qloo commented 2 months ago

I tried to run on the CPU but I am getting a CUDA error.

Bug Report

import gpt4all

llama_8b = gpt4all.GPT4All(model_name="Meta-Llama-3-8B-Instruct.Q4_0.gguf",
                           model_path="/repository/models/mohammad/llm_models/rag/",
                           device="cpu",
                           allow_download=True)

Running this on Linux, I get the following error:

OSError: /home/mohammad/.virtualenv/rag_env/lib/python3.8/site-packages/nvidia/cuda_runtime/lib/libcudart.so.12: cannot open shared object file: No such file or directory

It seems _pyllmodel.py has this block of code that tries to import the CUDA 12 libraries. Although I have CUDA 11.8 on the system, I want to run on the CPU due to insufficient memory.

if platform.system() in ('Linux', 'Windows'):
    try:
        from nvidia import cuda_runtime, cublas
    except ImportError:
        pass  # CUDA is optional
    else:
        if platform.system() == 'Linux':
            cudalib   = 'lib/libcudart.so.12'
            cublaslib = 'lib/libcublas.so.12'
        else:  # Windows
            cudalib   = r'bin\cudart64_12.dll'
            cublaslib = r'bin\cublas64_12.dll'
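The optional-import idea in that excerpt can be sketched as a standalone probe. This is a hypothetical illustration (not the bindings' actual code path): it tries to load the CUDA 12 runtime by name and silently falls back to CPU when the library is absent, which is the behavior the reporter expected with device="cpu":

```python
import ctypes
import platform

def cuda12_runtime_available() -> bool:
    """Return True if the CUDA 12 runtime library can be loaded.

    Mirrors the "CUDA is optional" pattern from _pyllmodel.py: a missing
    runtime should mean falling back to CPU, not a hard failure.
    """
    if platform.system() == "Linux":
        libname = "libcudart.so.12"
    elif platform.system() == "Windows":
        libname = "cudart64_12.dll"
    else:
        return False  # macOS builds do not use CUDA
    try:
        ctypes.CDLL(libname)
    except OSError:
        return False  # library not found; CPU-only use still works
    return True

# Pick a device the way a CPU-tolerant loader might:
device = "cuda" if cuda12_runtime_available() else "cpu"
print(device)
```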


cosmic-snow commented 2 months ago

I haven't tried to reproduce this, but there should be a new release of the Python bindings soon. In fact, it seems like it's only being held up by an issue with CI. Oh and also some PyPI limitations.

Can you try it again once that is available?

Also see PR #2802 and note specifically:

  • Also search for CUDA 11 installed with pip at runtime since we now build against CUDA 11.8 anyway

cebtenzzre commented 2 months ago

For now, it should be sufficient to pip install nvidia-cublas-cu12 nvidia-cuda-runtime-cu12, as long as your GPU driver is somewhat recent (at least 525.60.13 if we provide binary support for your GPU architecture, or 555.58 if we don't).
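After installing those wheels, a quick check (my own sketch, not part of gpt4all) can confirm the modules that _pyllmodel.py imports with `from nvidia import cuda_runtime, cublas` are now resolvable:

```python
import importlib.util

def has_module(name: str) -> bool:
    """Return True if `name` is importable, without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        return False  # parent package (e.g. `nvidia`) is not installed

# These are the submodules provided by the nvidia-cuda-runtime-cu12
# and nvidia-cublas-cu12 wheels mentioned above.
for mod in ("nvidia.cuda_runtime", "nvidia.cublas"):
    print(mod, "ok" if has_module(mod) else "missing")
```

If either prints "missing", the OSError about libcudart.so.12 from the original report would still occur on import.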

Support for CUDA 11 will be available in the next Python release (possibly 2.8.1).