abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

BUG: import error or execution error (NULL pointer access) for the latest prebuilt version `v0.2.81` #1571

Open · ChengjieLi28 opened this issue 4 months ago

ChengjieLi28 commented 4 months ago

For the latest version v0.2.81:

  1. If I install it via the prebuilt channel:
    pip install -U llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu

    An import error happens:

    from llama_cpp import Llama

    Error:

    
    In [1]: from llama_cpp import Llama
    ---------------------------------------------------------------------------
    OSError                                   Traceback (most recent call last)
    File ~/miniconda3/lib/python3.9/site-packages/llama_cpp/llama_cpp.py:75, in _load_shared_library(lib_base_name)
     74 try:
    ---> 75     return ctypes.CDLL(str(_lib_path), **cdll_args)  # type: ignore
     76 except Exception as e:

    File ~/miniconda3/lib/python3.9/ctypes/__init__.py:374, in CDLL.__init__(self, name, mode, handle, use_errno, use_last_error, winmode)
        373 if handle is None:
    --> 374     self._handle = _dlopen(self._name, mode)
        375 else:

    OSError: libc.musl-x86_64.so.1: cannot open shared object file: No such file or directory

    During handling of the above exception, another exception occurred:

    RuntimeError                              Traceback (most recent call last)
    Cell In[1], line 1
    ----> 1 from llama_cpp import Llama

    File ~/miniconda3/lib/python3.9/site-packages/llama_cpp/__init__.py:1
    ----> 1 from .llama_cpp import *
          2 from .llama import *
          4 __version__ = "0.2.81"

    File ~/miniconda3/lib/python3.9/site-packages/llama_cpp/llama_cpp.py:88
         85 _lib_base_name = "llama"
         87 # Load the library
    ---> 88 _lib = _load_shared_library(_lib_base_name)
         91 # ctypes sane type hint helpers
         92 #
         93 # - Generic Pointer and Array types (...)
         96 # NOTE: Only use these for static type checking not for runtime checks
         97 # no good will come of that
         99 if TYPE_CHECKING:

    File ~/miniconda3/lib/python3.9/site-packages/llama_cpp/llama_cpp.py:77, in _load_shared_library(lib_base_name)
         75     return ctypes.CDLL(str(_lib_path), **cdll_args)  # type: ignore
         76 except Exception as e:
    ---> 77     raise RuntimeError(f"Failed to load shared library '{_lib_path}': {e}")
         79 raise FileNotFoundError(
         80     f"Shared library with base name '{lib_base_name}' not found"
         81 )

    RuntimeError: Failed to load shared library '/home/lichengjie/miniconda3/lib/python3.9/site-packages/llama_cpp/lib/libllama.so': libc.musl-x86_64.so.1: cannot open shared object file: No such file or directory

    (A quick way to confirm this libc mismatch is sketched just after this list.)


  2. If I install the prebuilt version in `Docker`, the import succeeds; however, a `NULL pointer access` error happens when executing inference.
    The error stack:

File "/opt/conda/lib/python3.11/site-packages/llama_cpp/llama.py", line 1132, in _create_completion for token in self.generate( ^^^^^^^^^^^^^^^^^ File "/opt/conda/lib/python3.11/site-packages/llama_cpp/llama.py", line 740, in generate self.eval(tokens) File "/opt/conda/lib/python3.11/site-packages/llama_cpp/llama.py", line 590, in eval logits = np.ctypeslib.as_array(self._ctx.get_logits(), shape=(rows * cols, )) ^^^^^^^^^^^^^^^^^ File "/opt/conda/lib/python3.11/site-packages/numpy/ctypeslib.py", line 522, in as_array obj = ctypes.cast(obj, p_arr_type).contents ^^^^^^^^^^^^^^^^^ ValueError: [address=0.0.0.0:37293, pid=88] NULL pointer access



I tried the prebuilt `v0.2.77`, and everything works.
Could you please provide some solutions for these issues? Thank you.
ChengjieLi28 commented 4 months ago

Update: when I use the prebuilt v0.2.82, I still get the NULL pointer access error...

ChengjieLi28 commented 4 months ago

Update: @abetlen Hi, I have figured out the cause of the NULL pointer access issue. If you load the model with the `embedding=True` option:

    llm = Llama(model_path=<model_path>, verbose=True, embedding=True)

Then, when you run a generation task, the NULL pointer access error happens.
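
A minimal reproduction sketch (the model path is a placeholder; any completion model shows the same behavior):

    from llama_cpp import Llama

    # Placeholder GGUF path -- substitute a local model.
    llm = Llama(model_path="./model.gguf", verbose=True, embedding=True)

    # Embeddings still work in this mode:
    llm.create_embedding("hello world")

    # ...but any generation call crashes with "NULL pointer access",
    # since the context was created in embeddings mode.
    llm("Q: What is the capital of France? A:", max_tokens=16)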

Is this parameter still supported? If it is no longer supported, please remove it from the documentation: https://llama-cpp-python.readthedocs.io/en/latest/#embeddings. Nowadays, few people use LLMs for embedding tasks.

vultj commented 4 months ago

I believe I am running into this error as well on v0.2.82:

2024-07-12 19:13:01.785 | ERROR    | __main__:main:175 - NULL pointer access
Traceback (most recent call last):

  File "/app/main.py", line 547, in <module>
    main()
    └ <function main at 0xfffd086cb910>

> File "/app/main.py", line 168, in main
    inf = infer_queue(qargs)
          │           └ <__main__.QueueArgs object at 0xfffc2fc1f640>
          └ <function infer_queue at 0xfffc2fc37d90>

  File "/app/main.py", line 344, in infer_queue
    llm_output = llm(
                 └ <llama_cpp.llama.Llama object at 0xfffc2fc1f340>

  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama.py", line 1674, in __call__
    return self.create_completion(
           │    └ <function Llama.create_completion at 0xfffc305c0430>
           └ <llama_cpp.llama.Llama object at 0xfffc2fc1f340>
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama.py", line 1607, in create_completion
    completion: Completion = next(completion_or_chunks)  # type: ignore
                                  └ <generator object Llama._create_completion at 0xfffc2fc12570>
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama.py", line 1132, in _create_completion
    for token in self.generate(
                 │    └ <function Llama.generate at 0xfffc305c01f0>
                 └ <llama_cpp.llama.Llama object at 0xfffc2fc1f340>
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama.py", line 740, in generate
    self.eval(tokens)
    │    │    └ [1, 28705, 13, 28789, 28766, 6574, 28766, 28767, 13, 2707, 1504, 6031, 395, 1656, 28725, 3116, 28725, 304, 5307, 28723, 1992,...
    │    └ <function Llama.eval at 0xfffc305c00d0>
    └ <llama_cpp.llama.Llama object at 0xfffc2fc1f340>
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama.py", line 590, in eval
    logits = np.ctypeslib.as_array(self._ctx.get_logits(), shape=(rows * cols, ))
             │  │         │        │    │    │                    │      └ 32000
             │  │         │        │    │    │                    └ 1
             │  │         │        │    │    └ <function _LlamaContext.get_logits at 0xfffc305a9cf0>
             │  │         │        │    └ <llama_cpp._internals._LlamaContext object at 0xfffc2fc1ecb0>
             │  │         │        └ <llama_cpp.llama.Llama object at 0xfffc2fc1f340>
             │  │         └ <function as_array at 0xfffc2fc4d750>
             │  └ <module 'numpy.ctypeslib' from '/usr/local/lib/python3.10/dist-packages/numpy/ctypeslib.py'>
             └ <module 'numpy' from '/usr/local/lib/python3.10/dist-packages/numpy/__init__.py'>
  File "/usr/local/lib/python3.10/dist-packages/numpy/ctypeslib.py", line 522, in as_array
    obj = ctypes.cast(obj, p_arr_type).contents
          │      │    │    └ <class 'numpy.ctypeslib.LP_c_float_Array_32000'>
          │      │    └ <llama_cpp.llama_cpp.LP_c_float object at 0xfffc2fc30fc0>
          │      └ <function cast at 0xfffd07f90790>
          └ <module 'ctypes' from '/usr/lib/python3.10/ctypes/__init__.py'>

ValueError: NULL pointer access
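
The last two frames show the failure mode: `self._ctx.get_logits()` returns a NULL ctypes pointer, and `np.ctypeslib.as_array` dereferences it. A defensive guard at that spot might look like this (a sketch of where a check could go, not the library's actual code):

    import numpy as np

    def logits_as_array(ctx, rows: int, cols: int) -> np.ndarray:
        """Hypothetical wrapper around the call that crashes above."""
        logits_ptr = ctx.get_logits()  # ctypes POINTER(c_float); NULL is falsy
        if not logits_ptr:
            raise RuntimeError(
                "llama_get_logits returned NULL: no logits buffer was "
                "allocated (e.g. the context was created with embedding=True)"
            )
        return np.ctypeslib.as_array(logits_ptr, shape=(rows * cols,))
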
AdjectiveAllison commented 4 months ago

Hey, I was able to pinpoint where this bug was introduced with git bisect.

First commit that introduced it was: https://github.com/abetlen/llama-cpp-python/commit/5beec1a1fd28acf6e32d45a6b990ef6389d288ed

Conveniently, the only changes were to llama.cpp, so I then bisected llama.cpp itself and found the culprit: https://github.com/ggerganov/llama.cpp/commit/80ea089d771f0c2d97afa8bead80ded412f600d7

I'm a little out of my depth with this stuff, but luckily, again, the change is only a couple hundred lines.

The parts that look relevant to me are:

  1. Changes in logit handling:

    -    const bool has_logits = cparams.causal_attn;
    -    const bool has_embd   = cparams.embeddings && (hparams.causal_attn || cparams.pooling_type == LLAMA_POOLING_TYPE_NONE);
    +    const bool has_logits = !cparams.embeddings;
    +    const bool has_embd   =  cparams.embeddings && (cparams.pooling_type == LLAMA_POOLING_TYPE_NONE);
  2. New pooling behavior:

    +    // add on pooling layer
    +    if (lctx.cparams.embeddings) {
    +        result = llm.append_pooling(result);
    +    }
  3. Changes in result tensor selection:

    -            res = nullptr; // do not extract logits for embedding models such as BERT
    -
    -            // token or sequence embeddings
    -            embd = gf->nodes[gf->n_nodes - 1];
    -
    -            GGML_ASSERT(strcmp(embd->name, "result_embd") == 0 || strcmp(embd->name, "result_embd_pooled") == 0);
    +            res = nullptr; // do not extract logits for embedding case
    +            embd = gf->nodes[gf->n_nodes - 1];
    +            if (strcmp(embd->name, "result_embd_pooled") != 0) {
    +                embd = gf->nodes[gf->n_nodes - 2];
    +            }
    +            GGML_ASSERT(strcmp(embd->name, "result_embd_pooled") == 0 && "missing embeddings tensor");
  4. New function to set embedding mode:

    +void llama_set_embeddings(struct llama_context * ctx, bool embeddings) {
    +    ctx->cparams.embeddings = embeddings;
    +}

I don't see `llama_set_embeddings` being called anywhere in this repo, so I think the fix will involve updating the bindings to call it. Checking whether logits are available before trying to access them in embedding mode, and handling the new pooling behavior, might be needed as well. A rough sketch of what the Python side could look like follows.
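
For illustration only, a hypothetical sketch of exposing the new function through the ctypes layer (the `_lib` handle and the `llama_context_p` type mirror the `llama_cpp.llama_cpp` module internals visible in the tracebacks above; none of this is the actual fix):

    import ctypes

    from llama_cpp import llama_cpp as _lc  # module that loads the shared library

    # Hypothetical ctypes declaration for the new C API function added by the
    # llama.cpp commit linked above.
    _lc._lib.llama_set_embeddings.argtypes = [_lc.llama_context_p, ctypes.c_bool]
    _lc._lib.llama_set_embeddings.restype = None

    def set_embeddings(ctx, enabled: bool) -> None:
        """Toggle embeddings mode on an existing llama_context.

        After the upstream change, logits are only produced when embeddings
        mode is off (has_logits = !cparams.embeddings), so a generation path
        would call set_embeddings(ctx, False) before eval(), and an embedding
        path would call set_embeddings(ctx, True) before extracting embeddings.
        """
        _lc._lib.llama_set_embeddings(ctx, enabled)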

@abetlen -- I think this is likely enough for you to know exactly what needs to change! Thanks for your effort in maintaining the library!

vultj commented 4 months ago

This bug appears to still be present in v0.2.83 as well.

dewijones92 commented 3 months ago

I would love to use the embedding feature but I am encountering the same issue :(

RoryMB commented 3 months ago

+1, I have the same issue while making embeddings for vector databases. It was working fine up to v0.2.77.

vultj commented 2 months ago

Tested on v0.2.90, same error:

Traceback (most recent call last):
  File "/app/main.py", line 547, in <module>
    main()
  File "/app/main.py", line 141, in main
    inf = infer(args)
  File "/app/main.py", line 453, in infer
    for i in llm_output:
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama.py", line 1216, in _create_completion
    for token in self.generate(
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama.py", line 808, in generate
    self.eval(tokens)
  File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama.py", line 655, in eval
    logits = np.ctypeslib.as_array(
  File "/usr/local/lib/python3.10/dist-packages/numpy/ctypeslib.py", line 538, in as_array
    obj = ctypes.cast(obj, p_arr_type).contents
ValueError: NULL pointer access

This error is thrown for the zephyr-7b q5 model, and it has been thrown on every version from v0.2.79 through v0.2.90. @abetlen
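
Until this is fixed, one workaround (a sketch, not an official recommendation; the model path is a placeholder) is to keep generation and embeddings in separate Llama instances, so the generation context is never created with embedding=True:

    from llama_cpp import Llama

    MODEL = "./zephyr-7b.Q5_K_M.gguf"  # placeholder path

    # Generation instance: embedding stays at its default (False), so the
    # logits buffer is allocated and eval() does not hit the NULL pointer.
    gen_llm = Llama(model_path=MODEL)

    # Embedding instance: created separately with embedding=True.
    emb_llm = Llama(model_path=MODEL, embedding=True)

    completion = gen_llm("Hello", max_tokens=8)
    vector = emb_llm.create_embedding("Hello")

The obvious cost is that the model weights are loaded twice.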