abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

Weird output #761

Open ClaudiuCreanga opened 1 year ago

ClaudiuCreanga commented 1 year ago

Prerequisites

Please answer the following questions for yourself before submitting an issue.

Expected Behavior

Please provide a detailed written description of what you were trying to do, and what you expected llama-cpp-python to do.

from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b-chat.gguf.bin"
)

output = llm("Q: Name the planets in the solar system? A: ", max_tokens=32, stop=["Q:", "\n"], echo=True)
print(output)

I should get: The planets in our solar system are: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, and Neptune.

Current Behavior

Running the same code, I instead get: 'choices': [{'text': 'Q: Name the planets in the solar system? A: 1. Unterscheidung between celestial body and planet. Celestial body refers to any object that is in orbit around the Sun , including dwarf plan', 'index': 0, 'logprobs': None, 'finish_reason': 'length'}]

Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

Mac M1, macOS 13.5.1 (22G90)

$ python3 --version: Python 3.9.6
$ make --version: GNU Make 3.81
$ g++ --version: Apple clang version 14.0.3 

Failure Information (for bugs)

Please help provide information about the failure if this is a bug. If it is not a bug, please remove the rest of this template.

simonw commented 1 year ago

I think this is because you are using the Llama 2 Chat model, which expects a specific prompt format.

Try switching to the non-chat model.
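For reference, the Llama 2 Chat models expect the user message to be wrapped in [INST] ... [/INST] tags, with an optional system prompt inside <<SYS>> markers. A minimal sketch (the system prompt text here is just illustrative):

from llama_cpp import Llama

llm = Llama(model_path="./models/llama-2-7b-chat.gguf.bin")

# Llama 2 Chat prompt template: optional system prompt inside <<SYS>> markers,
# user message wrapped in [INST] ... [/INST]
prompt = (
    "[INST] <<SYS>>\n"
    "You are a helpful assistant.\n"
    "<</SYS>>\n\n"
    "Name the planets in the solar system. [/INST]"
)

output = llm(prompt, max_tokens=64)
print(output["choices"][0]["text"])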

m-from-space commented 1 year ago

I guess it's not about the model. It's about llama-cpp-python version 0.2.7. Switching back to version 0.2.6 will probably fix the issue. Try it out and tell us!

pip install llama-cpp-python==0.2.6 --upgrade --force-reinstall --no-cache-dir
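If you are not sure which version is currently installed, a quick way to check from Python (standard library only):

import importlib.metadata

print(importlib.metadata.version("llama-cpp-python"))  # e.g. 0.2.7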

LorenzoBoccaccia commented 1 year ago

Same issue here: 0.2.6 works fine, 0.2.7 gives broken output. I guess it's not a llama-cpp-python issue per se; it might very well be the underlying llama.cpp revision bump.

Josh-XT commented 1 year ago

Try this:

from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b-chat.gguf.bin",
    # 0 should tell llama.cpp to use the RoPE frequency values stored in the
    # model's GGUF metadata instead of overriding them with library defaults
    rope_freq_base=0,
    rope_freq_scale=0,
)

output = llm("Q: Name the planets in the solar system? A: ", max_tokens=32, stop=["Q:", "\n"], echo=True)
print(output)

desperadoduck commented 11 months ago

thank you, the rope parameters seem to help here!
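For anyone landing on this later, here is a sketch combining the two suggestions from this thread: chat-style prompting plus leaving the RoPE parameters to the model metadata. It uses the higher-level create_chat_completion API, which applies a chat template for you; exact behavior and defaults differ between llama-cpp-python versions, so treat this as a starting point rather than the canonical fix:

from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-2-7b-chat.gguf.bin",
    # 0 = take the RoPE frequency values from the GGUF metadata
    rope_freq_base=0,
    rope_freq_scale=0,
)

output = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Name the planets in the solar system."},
    ],
    max_tokens=64,
)
print(output["choices"][0]["message"]["content"])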