abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

Weird output #761

Open ClaudiuCreanga opened 1 year ago

ClaudiuCreanga commented 1 year ago

Prerequisites

Please answer the following questions for yourself before submitting an issue.

Expected Behavior

Please provide a detailed written description of what you were trying to do, and what you expected llama-cpp-python to do.

from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b-chat.gguf.bin"
)

output = llm("Q: Name the planets in the solar system? A: ", max_tokens=32, stop=["Q:", "\n"], echo=True)
print(output)

I should get: The planets in our solar system are: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune.

Current Behavior

Running the same code, I instead get: 'choices': [{'text': 'Q: Name the planets in the solar system? A: 1. Unterscheidung between celestial body and planet. Celestial body refers to any object that is in orbit around the Sun , including dwarf plan', 'index': 0, 'logprobs': None, 'finish_reason': 'length'}]

Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

Mac M1, macOS 13.5.1 (22G90)

$ python3 --version: Python 3.9.6
$ make --version: GNU Make 3.81
$ g++ --version: Apple clang version 14.0.3 

Failure Information (for bugs)

Please help provide information about the failure if this is a bug. If it is not a bug, please remove the rest of this template.

simonw commented 1 year ago

I think this is because you are using the Llama 2 Chat model, which expects a specific prompt format.

Try switching to the non-chat model.
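For reference, the Llama 2 Chat models expect the user message to be wrapped in [INST] ... [/INST] tags, with an optional system prompt inside <<SYS>> markers. A minimal sketch (the system prompt text here is just illustrative):

from llama_cpp import Llama

llm = Llama(model_path="./models/llama-2-7b-chat.gguf.bin")

# Llama 2 Chat prompt template: optional system prompt inside <<SYS>> markers,
# user message wrapped in [INST] ... [/INST]
prompt = (
    "[INST] <<SYS>>\n"
    "You are a helpful assistant.\n"
    "<</SYS>>\n\n"
    "Name the planets in the solar system. [/INST]"
)

output = llm(prompt, max_tokens=64)
print(output["choices"][0]["text"])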

m-from-space commented 1 year ago

I guess it's not about the model. It's about llama-cpp-python version 0.2.7. Switching back to version 0.2.6 will probably fix the issue. Try it out and tell us!

pip install llama-cpp-python==0.2.6 --upgrade --force-reinstall --no-cache-dir
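If you are not sure which version is currently installed, a quick way to check from Python (standard library only):

import importlib.metadata

print(importlib.metadata.version("llama-cpp-python"))  # e.g. 0.2.7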

LorenzoBoccaccia commented 1 year ago

Same issue here: 0.2.6 works fine, 0.2.7 gives broken output. I guess it's not a llama-cpp-python issue per se; it might very well be the underlying llama.cpp revision bump.

Josh-XT commented 1 year ago

Try this:

from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b-chat.gguf.bin",
    # 0 should tell llama.cpp to use the RoPE frequency values stored in the
    # model's GGUF metadata instead of overriding them with library defaults
    rope_freq_base=0,
    rope_freq_scale=0,
)

output = llm("Q: Name the planets in the solar system? A: ", max_tokens=32, stop=["Q:", "\n"], echo=True)
print(output)

desperadoduck commented 11 months ago

thank you, the rope parameters seem to help here!
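For anyone landing on this later, here is a sketch combining the two suggestions from this thread: chat-style prompting plus leaving the RoPE parameters to the model metadata. It uses the higher-level create_chat_completion API, which applies a chat template for you; exact behavior and defaults differ between llama-cpp-python versions, so treat this as a starting point rather than the canonical fix:

from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b-chat.gguf.bin",
    # 0 = take the RoPE frequency values from the GGUF metadata
    rope_freq_base=0,
    rope_freq_scale=0,
)

output = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Name the planets in the solar system."},
    ],
    max_tokens=64,
)
print(output["choices"][0]["message"]["content"])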