abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io

Reinstall for gemma 2 #1559

Closed. etemiz closed this issue 3 months ago.

etemiz commented 3 months ago

Expected Behavior

Run without issue

Current Behavior

raise RuntimeError(f"Failed to load shared library '{_lib_path}': {e}")

RuntimeError: Failed to load shared library '...........lib/python3.10/site-packages/llama_cpp/libllama.so': libggml.so: cannot open shared object file: No such file or directory

Environment and Context

Ubuntu 22.04

Python 3.10.12
GNU Make 4.3
g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

Steps to Reproduce

I clone llama-cpp-python and install it from source:

git clone --recurse-submodules https://github.com/abetlen/llama-cpp-python.git
cd llama-cpp-python
pip install --upgrade pip
cd vendor/llama.cpp
git checkout b3262
cd ../..
pip install -e .

It installs llama-cpp-python, but my script then fails with the error above (Failed to load shared library).

make clean
CMAKE_ARGS="-DLLAVA_BUILD=off -DGGML_HIPBLAS=on" python -m pip install --force-reinstall --no-cache-dir .

does not work either.

I tried several other things, such as running make manually inside llama.cpp.

There is a libggml.so at venv/lib/python3.10/site-packages/lib/libggml.so, but not at venv/lib/python3.10/site-packages/llama_cpp/libggml.so; copying the file there still doesn't fix it.

Meanwhile, llama.cpp itself builds fine on this machine, and llama-cli -m gemma2......gguf runs without problems.
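
The loader failure can be reproduced and located directly from Python. A diagnostic sketch, where the venv path is an assumption to adjust for your setup:

import ctypes, pathlib

# List every copy of libggml.so under site-packages, then attempt the same
# dlopen that llama_cpp performs at import time to see the raw loader error.
site = pathlib.Path("venv/lib/python3.10/site-packages")  # adjust to your venv
print([str(p) for p in site.rglob("libggml.so")])

try:
    ctypes.CDLL(str(site / "llama_cpp" / "libllama.so"))
except OSError as e:
    print(e)  # e.g. "libggml.so: cannot open shared object file: ..."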

fat-tire commented 3 months ago

Quick-but-not-best solution:

To find where the library actually ended up (assuming it's somewhere in your home directory), you could try something like:

$ find ~ | grep libggml.so

Once you know where the library is, note the path, then set:

export LD_LIBRARY_PATH=<path_to_library_directory>:$LD_LIBRARY_PATH

Then run your .py program.

In my case it was hiding at

/home/accountname/.local/lib/python3.10/site-packages/lib
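
Since LD_LIBRARY_PATH is read by the dynamic loader when the process starts, another option is to pre-load the dependency from inside the script itself. A sketch, using the path found above as an example:

import ctypes

# Pre-load libggml.so with RTLD_GLOBAL so the already-loaded copy satisfies
# libllama.so's dependency when llama_cpp dlopens it at import time.
ctypes.CDLL(
    "/home/accountname/.local/lib/python3.10/site-packages/lib/libggml.so",
    mode=ctypes.RTLD_GLOBAL,
)

from llama_cpp import Llama  # should now import without the libggml.so error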

etemiz commented 3 months ago

Thank you. That worked well.

werruww commented 3 months ago

How do I run Gemma 2 9B with llama-cpp-python? It runs in Ollama and LM Studio, but not with the following:

from llama_cpp import Llama

llm = Llama(
    model_path="./gemma-2-9b-it.Q4_K.gguf",  # path to the GGUF file
    n_ctx=4096,      # max sequence length to use; longer sequences require much more resources
    n_threads=4,     # number of CPU threads to use, tailor to your system
    n_gpu_layers=0,  # number of layers to offload to GPU; set to 0 if no GPU acceleration is available
)

prompt = "write the python code to create text file"

# Simple inference example
output = llm(
    f"<|user|>\n{prompt}<|end|>\n<|assistant|>",
    max_tokens=256,    # generate up to 256 tokens
    stop=["<|end|>"],
    echo=True,         # whether to echo the prompt
)

print(output['choices'][0]['text'])
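
One likely culprit in the snippet above: <|user|> and <|end|> are Phi-3 prompt markers, not Gemma's; Gemma 2 uses <start_of_turn>/<end_of_turn> turns. A sketch that sidesteps hand-written prompt strings, assuming a llama-cpp-python version recent enough to read the chat template from the GGUF metadata:

from llama_cpp import Llama

llm = Llama(model_path="./gemma-2-9b-it.Q4_K.gguf", n_ctx=4096, n_threads=4)

# create_chat_completion formats the prompt with the model's own chat
# template, so the Gemma 2 turn markers are applied automatically.
output = llm.create_chat_completion(
    messages=[{"role": "user", "content": "write the python code to create text file"}],
    max_tokens=256,
)
print(output["choices"][0]["message"]["content"])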

abetlen commented 3 months ago

After the recent llama.cpp refactor I also had to update the CMake build a little. As of version 0.2.80 the build should work correctly, and Gemma 2 is supported.
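
A quick way to confirm you are on a new enough build, since the package exposes its version string:

import llama_cpp
print(llama_cpp.__version__)  # should be 0.2.80 or later for Gemma 2 support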

fat-tire commented 3 months ago

Can confirm. It works for me.