ggerganov / llama.cpp

LLM inference in C/C++
MIT License
66.14k stars 9.5k forks source link

How does "-ins"/Instruct/chat mode remembers contexts? How to properly feed it back as plain text? #1651

Closed aleksusklim closed 6 months ago

aleksusklim commented 1 year ago

I had a feeling that sometimes certain models are good at keeping context in chat/instruct conversation, but other models are bad.

To be sure that it is a quality of the model, but not llama.cpp's chat mode of main.exe, I tested the same Instruction template in koboldcpp.

To my surprise, it followed the conversation much better than my attempts in llama.cpp. I tried to find a clear example, where -ins breaks, but it was hard. It is not like the model have no memory at all, it more looks like it does not "respect" it much, having greater possibility to treat next instruction as a separate unrelated question.

Then I tried to put my dialogue to -f in.txt and continue it there with main.exe. It worked good! But it did not yield to the same results as in chat/instruct mode, even with zero temperature and the same seed.

Example command: (version llama-master-66874d4-bin-win-avx2-x64) main.exe -t 8 -c 2048 --temp 0 --top-p 0 --top-k 1 --seed 1 -m WizardLM-7B-uncensored.ggml.q5_1.bin --ins (looks like -i, --keep -1, -r "###" did not improve anything)

Chat that shows the issue:

> My favorite color is green. What's yours?
My favorite color is blue.
> "sky", "water", "sapphire" - what is similar in these things?
All of these things are shades of blue. They are all different types of blue, but they all share the color property of being blue.
> Give me examples of things of my favorite color.
Examples of things that match your favorite color would depend on what your favorite color is. Can you please specify your favorite color so I can provide more specific examples?
>

It seems to forgot the info from two prompt back (but often remembers info from directly preceding instruction).

When I tried to main.exe -t 8 -c 2048 --temp 0 --top-p 0 --top-k 1 --seed 1 -m WizardLM-7B-uncensored.ggml.q5_1.bin -f in.txt

If gave completely different (and kinda incorrect) response:

### Instruction:

My favorite color is green. What's yours?

### Response:

Green is a beautiful color, but it's not my favorite. What's your favorite color?

Above was with newlines at the beginning of the document, without them it would be:

### Instruction:

My favorite color is green. What's yours?

### Response:

Green is a great color! Is there anything specific you like to do with green?

Here is without linefeeds at all:

### Instruction:
My favorite color is green. What's yours?
### Response:
Green is a great color! Is there anything specific you want to know about it?

Anyway, if I craft my initial chat as document:

### Instruction:

My favorite color is green. What's yours?

### Response:

My favorite color is blue.

### Instruction:

"sky", "water", "sapphire" - what is similar in these things?

### Response:

All of these things are shades of blue. They are all different types of blue, but they all share the color property of being blue.

### Instruction:

Give me examples of things of my favorite color.

### Response:

It spits back a large list:

Sure! Here are some examples of things that are green:
- Leaves
- Grass
- Apples
- Grapes
- Pine trees
- Lettuce
- Mint
- Cucumbers
- Spinach
- Kale
- Broccoli
- Cauliflower
- Asparagus
- Bell peppers
- Zucchini
- Tomatoes
- Cabbage
- Brussels sprouts
- Carrots
- Beets
- Radishes
- Turnips
- Collard greens
- Mustard greens
- Kale
- Arugula
- Basil
- Thyme
- Sage
- Rosemary
- Oregano
- Parsley
- Cilantro
- Fennel
- Chives
- Garlic
- Onions
- Ginger
- Turmeric
- Cumin
- Coriander
- Cardamom
- Cloves
- Cinnamon
- Nutmeg
- Mace
- Vanilla extract
- Chocolate
- Green tea
- Matcha
- Jadeite jewelry
- Emeralds
- Sapphires
- Rubies
- Diamonds
- Zirconia
- Topaz
- Citrine
- Amethyst
- Turquoise
- Blue green algae
- Spirulina
- Chlorella
- Kaleidoscope
- Rainbow
- Unicorn
- Mermaid
- Dolphin
- Whale
- Ocean
- Beach
- Forest
- Grasslands
- Mountains
- Waterfalls
- Sunset
- Moonrise
- Sunset over the ocean
- Stars in the sky
- Birds singing
- Breeze blowing
- Flowers blooming
- Trees growing
- Food cooking
- Music playing
- Dancing
- Joyful laughter
- Happiness

It quickly became incorrect, but at least the model understood the question and used conversation history in its answer. Which is often not the case for -ins mode.

However, Instruct chats clearly have memory too:

> Give me two random words.
Sunshine and raindrops.
> Tell me more about second of them.
The second word is "raindrops". Raindrops are small droplets of water that fall from clouds during precipitation. They can be round or oval in shape and vary in size depending on the intensity of the rain. Raindrops can be seen as a symbol of sadness, as they often accompany bad weather and can bring about feelings of gloom and despair. However, they also have a beauty all their own, and can create stunning visual effects when illuminated by sunlight or artificial lighting. Overall, raindrops are a fascinating natural phenomenon that can evoke a range of emotions and thoughts in those who observe them.
> Tell me more about the first one.
The first word is "sunshine". Sunshine refers to the direct rays of the sun, which are warm and bright and provide energy for living organisms on Earth. It is an essential component of life as we know it, providing the necessary heat and light for photosynthesis and other vital processes. Sunshine also has a calming effect on people, bringing feelings of relaxation and contentment. Overall, sunshine is a crucial element that makes life on Earth possible.

If I omit "of them" it becomes:

> Give me two random words.
Sunshine and raindrops.
> Tell me more about second.
Second is the ordinal form of the number 2. It is used to describe a position or rank, often in reference to time or sequence. For example, "the second day of the week" refers to Sunday, which is the second day of the seven-day week. In sports, "second place" refers to the runner-up in a race or competition. In addition, "second base" is a position in baseball that refers to the second of four bases a player must reach before they can advance to third base. Overall, the word "second" is commonly used in many different contexts to describe a subsequent or following action or state.

But so does -f in.txt version too, it outputs:

### Instruction:

Give me two random words.

### Response:

Sunshine and raindrops.

### Instruction:

Tell me more about second.

### Response:

Second is the number two, which is a prime number and is divisible by 2. It is also the base of the decimal system and is used in many mathematical operations. Second can also refer to time, as in "second hand" on a watch or "seconds" in a game. In addition, it can be used as an adjective to describe something that comes after the first, such as "the second book in a series."

Still, the chat-mode has much stronger tendency to not respect conversation history! I cannot prove it, especially because it is different between models. But if anybody would explain, what exactly should I type in the RAW prompt to get the exact answers as I get with -ins – then I could experiment further and conclude why this is happens, and whether chat mode is buggy or not.

aleksusklim commented 1 year ago

Maybe the > character should be included too?

### Instruction:

> My favorite color is green. What's yours?

### Response:

My favorite color is blue.

Oh, at least it's blue!

### Instruction:

> My favorite color is green. What's yours?

### Response:

My favorite color is blue.

### Instruction:

> "sky", "water", "sapphire" - what is similar in these things?

### Response:

All of these things have a commonality of being related to the color blue. They are all shades or tones of blue.

And what if we…

…

### Instruction:

> Give me examples of things of my favorite color.

### Response:

Here are some examples of things that are typically associated with the color blue:

- Sky
- Water
- Sapphire
- Ocean
- Sea
- Blueberry
- Indigo
- Navy blue
- Cerulean
- Ultramarine
- Titanium
- Slate
- Lapis lazuli
- Cornflower blue
- Baby blue eyes
- Forest green

Well, I don't know. It is still unclear to me… No-newlines version:

### Instruction:
> My favorite color is green. What's yours?
### Response:
Green is a great color! Is there anything specific you want to know about it?
SlyEcho commented 1 year ago

It seems like it inserts the beginning-of-string token in front of ### Instruction: every time.

https://github.com/ggerganov/llama.cpp/blob/ffb06a345e3a9e30d39aaa5b46a23201a74be6de/examples/main/main.cpp#L211-L212

The true means add BOS.

It probably depends on the model but I think it affects the outcome a lot.

TonyWeimmer40 commented 1 year ago

I've noticed the same issue and in my experience -ins sets default n_keep to 2 even if set to --keep -1 or 0. -p does not have this problem.

github-actions[bot] commented 6 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.