abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

LLAMA_ASSERT: ....../llama-cpp-python/vendor/llama.cpp/llama.cpp:1800: !!kv_self.ctx #586


jiapei100 commented 1 year ago
  1. Using the command `CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 python setup.py bdist_wheel`, I can build a wheel and install it; the installed package looks like this:
llama_cpp:
total 3.8M
-rwxrwxr-x 1 lvision lvision   46 Aug  7 22:59 __init__.py
-rwxrwxr-x 1 lvision lvision 3.7M Aug  7 22:59 libllama.so
-rwxrwxr-x 1 lvision lvision  43K Aug  7 22:59 llama_cpp.py
-rwxrwxr-x 1 lvision lvision  62K Aug  7 22:59 llama.py
-rwxrwxr-x 1 lvision lvision 2.1K Aug  7 22:59 llama_types.py
drwxrwxr-x 2 lvision lvision 4.0K Aug  7 22:59 __pycache__
drwxrwxr-x 3 lvision lvision 4.0K Aug  7 22:59 server

llama_cpp_python-0.1.77.dist-info:
total 36K
-rw-rw-r-- 1 lvision lvision  321 Aug  7 22:59 direct_url.json
-rw-rw-r-- 1 lvision lvision    4 Aug  7 22:59 INSTALLER
-rwxrwxr-x 1 lvision lvision 1.1K Aug  7 22:59 LICENSE.md
-rwxrwxr-x 1 lvision lvision 9.7K Aug  7 22:59 METADATA
-rw-rw-r-- 1 lvision lvision 2.2K Aug  7 22:59 RECORD
-rw-rw-r-- 1 lvision lvision    0 Aug  7 22:59 REQUESTED
-rwxrwxr-x 1 lvision lvision   10 Aug  7 22:59 top_level.txt
-rwxrwxr-x 1 lvision lvision   99 Aug  7 22:59 WHEEL
  2. Using the default command `CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install -e .`, only the following gets installed, without the llama_cpp package itself (see the import check after this list):
llama_cpp_python-0.1.77.dist-info:
total 36K
-rw-rw-r-- 1 lvision lvision  104 Aug  7 23:02 direct_url.json
-rw-rw-r-- 1 lvision lvision    4 Aug  7 23:02 INSTALLER
-rwxrwxr-x 1 lvision lvision 1.1K Aug  7 23:02 LICENSE.md
-rw-rw-r-- 1 lvision lvision 9.7K Aug  7 23:02 METADATA
-rw-rw-r-- 1 lvision lvision 1022 Aug  7 23:02 RECORD
-rw-rw-r-- 1 lvision lvision    0 Aug  7 23:02 REQUESTED
-rw-rw-r-- 1 lvision lvision   10 Aug  7 23:02 top_level.txt
-rw-rw-r-- 1 lvision lvision   86 Aug  7 23:02 WHEEL
  3. When I try to run localGPT, I get:
➜  localGPT git:(main) ✗ python run_localGPT.py --device_type cuda
......
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 | 

Enter a query: Hello... How are you?
LLAMA_ASSERT: ....../llama-cpp-python/vendor/llama.cpp/llama.cpp:1800: !!kv_self.ctx
[3]    398690 IOT instruction (core dumped)  python run_localGPT.py --device_type cuda
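
Here is the import check mentioned in point 2 (my own snippet, not part of any tool): it just confirms which llama_cpp actually gets imported and whether the compiled library sits next to it.

```python
# Check which llama_cpp gets imported and whether the compiled
# libllama.so sits next to it (it should, for a working install).
import os
import llama_cpp

pkg_dir = os.path.dirname(llama_cpp.__file__)
print("llama_cpp imported from:", pkg_dir)
print("libllama.so present:", os.path.exists(os.path.join(pkg_dir, "libllama.so")))
```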

Unbelievable: it still resorts to the llama.cpp checkout under the vendor folder it was built from, and aborts with LLAMA_ASSERT: ....../llama-cpp-python/vendor/llama.cpp/llama.cpp:1800: !!kv_self.ctx.
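
In case someone wants to reproduce without localGPT, this is roughly what I believe it does under the hood (a minimal sketch; the model path and parameters are placeholders, not localGPT's actual values):

```python
# Minimal repro sketch: load a GGML model and run one completion.
# model_path is a placeholder; use whatever model localGPT loads.
from llama_cpp import Llama

llm = Llama(model_path="/path/to/model.ggmlv3.q4_0.bin", n_ctx=2048)
out = llm("Hello... How are you?", max_tokens=32)
print(out["choices"][0]["text"])
```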

Okay... Can anybody please tell me how to build llama-cpp-python from source and install it successfully in Release mode?
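
Ideally something along the lines of `CMAKE_ARGS="-DLLAMA_CUBLAS=on -DCMAKE_BUILD_TYPE=Release" FORCE_CMAKE=1 pip install . --force-reinstall --no-cache-dir`; whether the scikit-build setup actually honors CMAKE_BUILD_TYPE passed this way is an assumption on my part.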

c0sogi commented 1 year ago

I got the same error. You should downgrade the vendored llama.cpp to commit 41c674161fb2459bdf7806d1eebead15bc5d046e.
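
That is, something like `cd vendor/llama.cpp && git checkout 41c674161fb2459bdf7806d1eebead15bc5d046e` inside your llama-cpp-python checkout, then rebuild and reinstall the wheel.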

jiapei100 commented 1 year ago

@c0sogi

Tried it... not working for me.

➜  llama.cpp git:(master-41c6741)    

localGPT still fails with:

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 | 

Enter a query: Hi, how are you today?
ggml_new_tensor_impl: not enough space in the context's memory pool (needed 13745376, available 12582912)
[2]    1051032 segmentation fault (core dumped)  python run_localGPT.py --device_type cuda
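
The only thing I can think of trying next is shrinking the context-related buffers, e.g. passing a smaller `n_ctx` (and maybe `n_batch`) to `Llama(...)`, since the error points at the context's memory pool; that is purely a guess on my part.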