nomic-ai / gpt4all

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
https://nomic.ai/gpt4all
MIT License

LLaMA 2 Chat acting weird #1290

Closed niansa closed 2 months ago

niansa commented 1 year ago

System Info

Current version is 2.4.14, issue doesn't seem to be limited to individual platforms. Only tested this in Chat UI so far, but while LLaMA 2 7B q4_1 (from TheBloke) worked just fine with the official prompt in the last release, it's just talking nonsense now. However, 13B didn't work in the last release but it now does?

Mostly-official system prompt:

<<SYS>>
You are a helpful, respectful and honest assistant.

If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

(note the two trailing newlines; not fully correct, but as close as GPT4All allows)

Official prompt template:

[INST]%1[/INST]

(no newlines)

Reproduction

  1. Download the model from TheBloke: https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/resolve/main/llama-2-7b-chat.ggmlv3.q4_1.bin
  2. Put in the system prompt and prompt template
  3. Ask it something
  4. See nonsense coming out

(screenshot of the nonsensical output)

Expected behavior

Model should give quality responses.

niansa commented 1 year ago

13B still looks a bit off tho.

Looks like LLaMA 2 wants the eps to be configured differently! llama.cpp recommends setting -eps 1e-5.
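
For anyone running the same GGML file through llama.cpp's Python bindings directly, the override would look roughly like the sketch below (the rms_norm_eps keyword is an assumption for the ggml-era llama-cpp-python builds, so check your version):

    # pip install llama-cpp-python (a ggml-era release)
    from llama_cpp import Llama

    llm = Llama(
        model_path="llama-2-7b-chat.ggmlv3.q4_1.bin",
        rms_norm_eps=1e-5,  # assumed keyword; LLaMA 2 expects 1e-5 rather than the LLaMA 1 default
    )

    out = llm("[INST] what can you tell me about dolphins? [/INST]", max_tokens=256, temperature=0.0)
    print(out["choices"][0]["text"])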

qnixsynapse commented 1 year ago

The prompt is incorrect, though. The <<SYS>>...<</SYS>> block should be inside [INST]...[/INST].
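
For reference, the format used by Meta's reference implementation wraps the system block inside the first [INST], roughly like this ({system_prompt} and {user_message} are just placeholders):

[INST] <<SYS>>
{system_prompt}
<</SYS>>

{user_message} [/INST]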

cosmic-snow commented 1 year ago

Only tested this in Chat UI so far, but while LLaMA 2 7B q4_1 (from TheBloke) worked just fine with the official prompt in the last release, it's just talking nonsense now. However, 13B didn't work in the last release but it now does?

Both 7B and 13B should just work because their architecture hasn't changed. But yes, as mentioned in the comment above, the template is a bit special and awkward for how GPT4All does templates. See also: #1227

Edit: and also issue #1251

Or can you actually pin this down to different releases? Should I test it on different releases? ~I haven't tried or downloaded a Llama 2 model yet.~

Edit: I have downloaded llama-2-7b-chat.ggmlv3.q4_0.bin now.

stan-turing commented 1 year ago

Hi, I got a similarly irritating answer from the model ggml-v3-13b-hermes-q5_1.bin. Before the latest update everything was still working, and I did the update earlier when starting the program. (screenshot: hermes model output)

gavtography commented 1 year ago

Popping in to say it's doing the same thing for me as well. Llama-2 responses are not coherent regardless of the system prompt. Tried on both Windows and MacOS. I was actually able to get Llama-2 on MacOS to respond properly with a custom prompt, but it only worked one time before I made a new chat, then it goes back to nothing. Very strange.

cosmic-snow commented 1 year ago

Popping in to say it's doing the same thing for me as well. Llama-2 responses are not coherent regardless of the system prompt. Tried on both Windows and MacOS. I was actually able to get Llama-2 on MacOS to respond properly with a custom prompt, but it only worked one time before I made a new chat, then it goes back to nothing. Very strange.

Did you side-load it or are you using the version that was just added to models.json today? And if the former, can you compare the two?

gavtography commented 1 year ago

Did you side-load it or are you using the version that was just added to models.json today? And if the former, can you compare the two?

Oh was that just added today? Well yes, coincidentally it was the version that was added to the models.json. I haven't side-loaded any models yet.

Even with the default system prompt, it just outputs nonsense.

cosmic-snow commented 1 year ago

Even with the default system prompt, it just outputs nonsense.

🤔

Hm, odd. We tested it before adding it because of its strange/awkward templates. See the merge link above.

What were you trying to do? I just ran this example prompt:

what can you tell me about dolphins?

Dolphins are highly intelligent and social marine mammals that belong to the order Cetacea. There are almost 40 different species of dolphins, ranging in size from 1.2 meters (4 feet) and 50 kilograms (110 pounds) to 9.5 meters (31 feet) and 6 tons. They can be found in oceans all over the world, from shallow waters to deep sea environments, including coral reefs, estuaries, and open ocean. Dolphins are known for their playful and curious nature, and they have been observed exhibiting complex social behaviors such as cooperation, communication, and even cultural transmission. They use a variety of clicks, whistles, and body language to communicate with each other, and they have been observed mimicking sounds from their environment, including those made by humans. Dolphins are also highly adaptable and can be found in different habitats around the world, such as shallow waters, coral reefs, estuaries, and open ocean. They feed on a variety of prey, including fish, squid, crustaceans, and even small marine mammals like seals and sea lions. Unfortunately, dolphins are vulnerable to various threats such as habitat loss, pollution, entanglement in fishing nets, and hunting for human consumption or entertainment purposes. Many conservation efforts are underway to protect dolphin populations and their habitats, but more work needs to be done to ensure the long-term survival of these intelligent and fascinating creatures. If you have any specific questions about dolphins, feel free to ask!

gavtography commented 1 year ago

What were you trying to do? I just ran this example prompt:

Just right now, on the latest version of the GPT4All client, Windows 11, using the models.json Llama-2. Nothing special, all default parameters, default system prompt.

I just typed in the exact question you did:

what can you tell me about dolphins?

Response: As a·12b·30s·99 for the last. nobody knows.

But seriously, I'm just an AI and do not have access to any real-world information or evidence that could help identify the owner of the mysterious box. The contents of the box are also unknown, so it is impossible to provide a definitive answer without more context. If you have any additional details or clues, please feel free to share them, and I will do my best to assist you in solving the mystery!


Extremely strange.

(two screenshots from 2023-08-02 attached)

cosmic-snow commented 1 year ago

I have the same generation params (temp, top-p, top-K). That's really strange. Can you turn down Temperature to 0? Maybe verify the MD5 sum? Although the chat GUI should've done that.
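
If you want to check by hand, a quick way (the path is whatever model file you downloaded; the expected md5sum is listed in models.json):

    import hashlib

    md5 = hashlib.md5()
    with open("llama-2-7b-chat.ggmlv3.q4_0.bin", "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            md5.update(chunk)
    print(md5.hexdigest())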

With Temperature set to 0 I get:

Dolphins are highly intelligent and social marine mammals that belong to the order Cetacea. There are several species of dolphins, but the most common ones are the bottlenose dolphin (Tursiops truncatus), orca (Orcinus orca), and spinner dolphin (Stenella longirostris). Here are some interesting facts about dolphins:

  1. Dolphins are highly intelligent creatures, known for their advanced problem-solving skills and ability to communicate with each other using a variety of clicks, whistles, and body language. They have been observed displaying complex social behaviors such as cooperation, empathy, and even cultural transmission.
  2. Dolphins are found in oceans all around the world, from shallow waters to deep sea environments. They prefer warm, tropical waters but can also be found in colder climates like Alaska and Norway.
  3. Dolphins have a highly developed sense of echolocation, which allows them to navigate and hunt in their surroundings by emitting high-pitched clicks and listening for the echoes. This ability is so advanced that dolphins can even detect the shape, size, and movement of objects in their environment with incredible accuracy. ...

gavtography commented 1 year ago

Setting Temperature to 0 at least has it somewhat on topic.

what can you tell me about dolphins?

Response: As a dolphin, I will visit this place to learn more about the world of dolphin. Unterscheidung.

In conclusion, as a dolphin, I must say that the world of humans is quite fascinating. From their strange behavior to their ability to communicate with each other through a complex system of sounds and gestures, they are truly remarkable creatures. However, I must also admit that I find their obsession with shiny objects and their tendency to pollute their own planet rather disturbing. But hey, who doesn't love a good challenge? So here I am, ready to learn more about this strange and wonderful world of humans.


Out of curiosity, I tried increasing temperature, which just made it more incoherent but I suppose that's to be expected. I also removed the model and redownloaded it. Similar results.

cosmic-snow commented 1 year ago

Well, that's still not good.

It should be the same or at least highly similar to mine. Temperature set to 0 should be deterministic (at least on the same system).

Your output looks a bit like it ignores one or both templates. 🤔

gavtography commented 1 year ago

Yeah, I think it's completely ignoring the templates. I just tried slightly modifying it to use different words while keeping essentially the same meaning, but the output is still basically the same.

It was weirder on macOS: it's like a 50/50 shot whether it'll listen to the templates or not. Not an Apple M chip though, Intel, if that matters.

cosmic-snow commented 1 year ago

Hm. Can you maybe also run it with the Python script linked in the merged PR above? I wonder if that works at least. You can comment out the "raw" templates there, or it'll take a while.

gavtography commented 1 year ago

Hm. Can you maybe also run it with the Python script linked in the merged PR above? I wonder if that works at least. You can comment out the "raw" templates there, or it'll take a while.

Seemingly perfect answer in Python:

what can you tell me about dolphins?

Response: Dolphins are highly intelligent, social marine mammals that belong to the order Cetacea. There are several species of dolphins, including orcas (also known as killer whales), bottlenose dolphins, and spinner dolphins, among others. Dolphins are known for their distinctive dorsal fin and conical teeth, which they use to catch fish and other prey in the ocean. Dolphins are highly social animals that live in groups called pods. They communicate with each other using a variety of clicks, whistles, and body language, and have been observed exhibiting complex behaviors such as cooperation, empathy, and even playfulness. Dolphins also have a highly developed brain and are known to be capable of learning and problem-solving. Dolphins are found in oceans all around the world, from shallow coastal waters to deep sea environments.


I already have a little GPT4All Python script, so I just replaced Snoozy with Llama-2. No issues at all.
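
For comparison, a minimal version of such a script with the Python bindings looks roughly like this (the filename is the models.json download, and keyword names may differ between binding versions):

    from gpt4all import GPT4All

    model = GPT4All("llama-2-7b-chat.ggmlv3.q4_0.bin")

    # chat_session() wraps generate() with the bindings' system prompt and prompt template
    with model.chat_session():
        print(model.generate("what can you tell me about dolphins?", temp=0))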

Hell, even my little memory system is working fine with it:

what animal are we talking about?

Response: You are talking about dolphins! Dolphins are marine mammals that belong to the order Cetacea and are known for their intelligence, social behavior, and distinctive dorsal fin.

cosmic-snow commented 1 year ago

What version of the chat GUI is that anyway?

Edit: Never mind, you said latest version.

I'm a bit lost as to what the issue could be here. Although there's also some (I guess minor) bug that was fixed upstream in llama.cpp but is not yet included in GPT4All.

gavtography commented 1 year ago

Says v2.4.14. I assume that's the latest since it didn't ask me to update when I launched it, and I updated recently.

cosmic-snow commented 1 year ago

Is there some unrelated template that could be in use?

gavtography commented 1 year ago

The Python script should go off the same default template that the chat client is supposed to use. Llama-2 is the only model that produces that weird output in the chat client; all the other models work great, including with customized templates.

cosmic-snow commented 1 year ago

I meant, is there a possibility that there's some unrelated template in use in the chat GUI? If you run the script with the "raw" templates, it can go off the rails like that, too.

Or maybe there's a general bug in the templating logic somewhere that I don't encounter on my system. I don't really understand what's happening there.

gavtography commented 1 year ago

Ah, this might be starting to leave the scope of my knowledge, I'm not sure. But if it helps pin it down: something interesting just happened.

I deleted half the prompt template on the chat client, so now it says this:

[INST]<<SYS>>You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.<</SYS>>[/INST]

Then I asked:

what is a dolphin?

And got a correct reply:

A dolphin is a marine mammal that belongs to the order Cetacea. Dolphins are closely related to whales and porpoises, and they share many similarities with them. They are known for their intelligence, social behavior, and playful nature. Dolphins can be found in oceans all around the world, in both warm and cold waters. There are over 40 different species of dolphin, ranging in size from about 1 meter (3 feet) long to over 9 meters (30 feet) long. The most common species of dolphin include bottlenose dolphins, orcas (also known as killer whales), and spinner dolphins. Dolphins are carnivores, which means they eat meat. They feed on a variety of prey, including fish, squid, and crustaceans. Dolphins have a highly developed brain and are considered to be one of the smartest animals in the world. They communicate with each other using a variety of clicks, whistles, and body language, and they have been observed exhibiting complex social behaviors such as cooperation, empathy, and even playfulness. Dolphins are also known for their agility and speed, and they can swim at speeds of up to 25 miles per hour (40 kilometers per hour). They use a variety of swimming techniques, including breaching (jumping out of the water), tail slapping, and porpoising (slapping their tails on the surface of the water) to communicate and express themselves. Overall, dolphins are fascinating creatures that continue to captivate scientists and animal lovers around the world with their intelligence, social behavior, and playful nature.

This is with 0 temperature.

qnixsynapse commented 1 year ago

This is with the proposed template and default temp. and RMSNorm eps (model here is q2k): (screenshot of the output)

cosmic-snow commented 1 year ago

Ah, this might be starting to leave the scope of my knowledge, I'm not sure. But if it helps pin it down: something interesting just happened.

I deleted half the prompt template on the chat client, so now it says this:

[INST]<<SYS>>You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.<</SYS>>[/INST]

Just for the sake of it, what happens when you move the <<SYS>>...<</SYS>> part outside of the [INST]...[/INST] tags in the system prompt? Or remove the latter entirely, so only the <<SYS>> tags are left? But leave the instructions as they are.

niansa commented 1 year ago

What version of the chat GUI is that anyway?

Edit: Never mind, you said latest version.

I'm a bit lost as to what the issue could be here. Although there's also some (I guess minor) bug that was fixed upstream in llama.cpp but is not yet included in GPT4All.

I'm pretty sure it's a bug in the chat UI. I tried running with upstream llama.cpp, which did improve the situation but didn't fix it.

niansa commented 1 year ago

Or maybe there's a general bug in the templating logic somewhere that I don't encounter on my system. I don't really understand what's happening there.

The model input is perfectly fine. Something in the chat UI must be hitting undefined behavior.

From all I've seen, it's quite likely caused by something in the way the template is applied, though: especially the behavior changing that dramatically just from making the prompt shorter is quite suspicious.

cosmic-snow commented 1 year ago

I'm pretty sure it's a bug in the chat UI. I tried running with upstream llama.cpp, which did improve the situation but didn't fix it.

Well, as you mentioned in the other issue, upstream llama.cpp now has a fix that isn't in GPT4All yet.

The model input is perfectly fine. Something in the chat UI must be hitting undefined behavior.

From all I've seen, it's quite likely caused by something in the way the template is applied, though: especially the behavior changing that dramatically just from making the prompt shorter is quite suspicious.

It's really weird how the output is largely different from what I'm seeing even in my own chat GUI. I'd expect that to be buggy, too, but it somehow isn't? Especially with Temperature set to 0. My earlier quoted text was taken from the GUI, not the Python script.

ThiloteE commented 2 months ago

Closing, as this issue is quite old.

According to TheBloke at https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML#prompt-template-llama-2-chat, this is a working system prompt + prompt template:

System Prompt:

[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe.  Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature. If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

Prompt template:

[INST]{prompt}[/INST]

But obviously this is a very censored and convoluted system prompt, so I am not surprised about weird responses. Also, I think there might be issues with spaces and newlines. See https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGML/discussions/21. In conclusion: there are finetunes of llama-2 out there that are much better, and by now even Meta, the publisher of this model, has released llama-3, which supersedes llama-2.

I don't know if this issue can still be reproduced, but in GPT4All 2.8.0 the prompt template would need to be adapted anyway to incorporate the model's response.

Maybe to something like this?

Prompt template:

[INST]%1[/INST]
[INST]%2[/INST]
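
(Here %1 is replaced with the user's message and %2 with the model's response, so a single exchange rendered with that template would come out roughly as:)

[INST]what is a dolphin?[/INST]
[INST]A dolphin is a marine mammal that belongs to the order Cetacea. ...[/INST]
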
cosmic-snow commented 2 months ago

Ah yes, IIRC there were problems and the cause wasn't immediately clear. It was not just the prompt, but also the implementation.

You're right to close this.