containers / podman-desktop-extension-ai-lab

Work with LLMs on a local environment using containers
https://podman-desktop.io/extensions/ai-lab
Apache License 2.0

x86 cuda model_service image does not run #55

Closed: lstocchi closed this 3 weeks ago

lstocchi commented 10 months ago

When running the x86 model_service image, you face this error:

Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.9/site-packages/llama_cpp/llama_cpp.py", line 74, in _load_shared_library
    return ctypes.CDLL(str(_lib_path), **cdll_args)
  File "/usr/lib64/python3.9/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libcuda.so.1: cannot open shared object file: No such file or directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/locallm/chat_service.py", line 4, in <module>
    from llama_cpp import Llama
  File "/opt/app-root/lib64/python3.9/site-packages/llama_cpp/__init__.py", line 1, in <module>
    from .llama_cpp import *
  File "/opt/app-root/lib64/python3.9/site-packages/llama_cpp/llama_cpp.py", line 87, in <module>
    _lib = _load_shared_library(_lib_base_name)
  File "/opt/app-root/lib64/python3.9/site-packages/llama_cpp/llama_cpp.py", line 76, in _load_shared_library
    raise RuntimeError(f"Failed to load shared library '{_lib_path}': {e}")
RuntimeError: Failed to load shared library '/opt/app-root/lib64/python3.9/site-packages/llama_cpp/libllama.so': libcuda.so.1: cannot open shared object file: No such file or directory
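
For context, the failure can be reproduced outside llama_cpp with a short ctypes probe (a sketch assuming only a standard Python install; it mirrors the dlopen call in the traceback):

import ctypes

# libcuda.so.1 is the NVIDIA driver library; the CUDA-enabled llama.cpp
# build links against it, but it only exists on hosts where the
# proprietary NVIDIA driver is installed.
try:
    ctypes.CDLL("libcuda.so.1")
    print("libcuda.so.1 found; the CUDA image can load")
except OSError as exc:
    print(f"libcuda.so.1 missing: {exc}")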
lstocchi commented 10 months ago

The problem seems to be that I don't have an NVIDIA GPU, and the Intel one I have is not supported by CUDA. If I disable CUDA by updating the Containerfile

# Disable the cuBLAS (CUDA) backend so llama-cpp-python does not link libcuda
ENV CMAKE_ARGS="-DLLAMA_CUBLAS=off"
ENV FORCE_CMAKE=0

it works, but it is extremely slow. I guess we need a different base image depending on the GPU the user is using. A minimal sketch of that idea follows below.
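
One way to pick the variant at launch time (a sketch only; the model_service:cuda and model_service:cpu tags are hypothetical placeholders, not images this repo publishes) is to probe for a working NVIDIA driver via nvidia-smi and fall back to a CPU-only build otherwise:

import shutil
import subprocess

def pick_model_service_image() -> str:
    # Only select the CUDA variant when nvidia-smi exists AND runs
    # successfully, i.e. the NVIDIA driver (and libcuda.so.1) is present.
    if shutil.which("nvidia-smi"):
        try:
            subprocess.run(["nvidia-smi"], check=True, capture_output=True)
            return "model_service:cuda"  # hypothetical CUDA-enabled tag
        except (subprocess.CalledProcessError, OSError):
            pass
    return "model_service:cpu"  # hypothetical CPU-only tag

print(pick_model_service_image())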

slemeur commented 3 weeks ago

Closing as obsolete.