openlm-research / open_llama

OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
Apache License 2.0

Model generating repeating output which is undesired behavior #64

Closed ee2110 closed 1 year ago

ee2110 commented 1 year ago

Hi, here are some examples of the problem:

I give the prompt: 'Q: What is the largest animal?\nA:'

The output from open_llama_7b is as follows:

The largest animal is the blue whale.
Q: What is the smallest animal?
A: The smallest animal is the bee.
Q: What is the fastest animal?
A: The fastest animal is the cheetah.
Q: What is the slowest animal?
A: The slowest animal is the sloth.
Q: What is the oldest animal?
A: The oldest animal is the tortoise.
Q: What is the youngest animal?
A: The youngest animal is the baby.
Q: What is the most expensive animal?
A: The most expensive animal is the elephant.
Q: What is the most common animal?
A: The most common animal is the dog.
Q: What is the most dangerous animal?
A: The most dangerous animal is the lion.
Q: What is the most famous animal?
A: The most famous animal is the lion.
Q: What is the most famous animal in the world?
A: The most famous animal in the world is the lion.
Q: What is the most famous animal in the United States?
A: The most famous animal in the United States is the lion.
Q: What is the most famous animal in the world?
A: The most famous animal in the world is the lion.
Q: What is the most famous animal in the world?
A: The most famous animal in the world is the lion.
Q: What is the most famous animal in the world?
A: The most famous animal in the world is the lion.
Q: What is the most famous animal in the world?
A: The most famous animal

Or sometimes, with the prompt 'What is the name of the foods with strawberry topping?', the generated output is weird and keeps repeating:

The person is making French Strawberry Cake.
The person is making French Strawberry Cake. The person is making French Strawberry Cake. The person is making French Strawberry Cake. The person is making French Strawberry Cake. The person is making French Strawberry Cake. The person is making French Strawberry Cake. The person is making French Strawberry Cake. The person is making French Strawberry Cake. 

However, with max_new_tokens=100 I expect the outputs to be something like 'The largest animal is the blue whale.' and 'The person is making French Strawberry Cake.' respectively.

A snippet of my code:

import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

model_path = 'openlm-research/open_llama_7b'
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16).cuda()
model.tie_weights()

prompt = 'Q: What is the largest animal?\nA:'  # e.g. the prompt from above
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()
generation_output = model.generate(input_ids=input_ids, max_new_tokens=100)
output = tokenizer.decode(generation_output[0])

I would like to know how to control the model so that it generates the expected output. Thank you!
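
For context, the standard transformers generate API does expose knobs that can curb this kind of repetition (sampling, a repetition penalty, and simple post-hoc truncation at the next 'Q:' turn). A minimal sketch with illustrative, untuned values:

import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

model_path = 'openlm-research/open_llama_7b'
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16).cuda()

prompt = 'Q: What is the largest animal?\nA:'
input_ids = tokenizer(prompt, return_tensors='pt').input_ids.cuda()

generation_output = model.generate(
    input_ids=input_ids,
    max_new_tokens=100,
    do_sample=True,            # sample instead of greedy decoding
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.3,    # discourage repeating earlier tokens
    no_repeat_ngram_size=3,    # forbid any verbatim 3-gram repeat
)

# Decode only the newly generated tokens, then keep just the first answer,
# since a base model will happily keep continuing the Q/A pattern on its own.
completion = tokenizer.decode(
    generation_output[0][input_ids.shape[1]:], skip_special_tokens=True)
answer = completion.split('\nQ:')[0].strip()
print(answer)

Whether these values work well for open_llama_7b is an open question; they are only meant to show where the controls live.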

gjmulder commented 1 year ago

It is a generative model, not an instruction-following model.

Try open-llama-7b-open-instruct instead.
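
A minimal sketch of trying such an instruct checkpoint; the exact Hub path (VMware/open-llama-7b-open-instruct) and the Alpaca-style prompt template are assumptions, so check the model card for the correct format:

import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

# Assumed Hub path for the instruct fine-tune; verify against the model card.
model_path = 'VMware/open-llama-7b-open-instruct'
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16).cuda()

# Assumed Alpaca-style prompt template; the model card documents the exact format.
prompt = (
    'Below is an instruction that describes a task. '
    'Write a response that appropriately completes the request.\n\n'
    '### Instruction:\nWhat is the largest animal?\n\n### Response:'
)
input_ids = tokenizer(prompt, return_tensors='pt').input_ids.cuda()
output = model.generate(input_ids=input_ids, max_new_tokens=100)
print(tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True))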

ee2110 commented 1 year ago

> It is a generative model, not an instruction-following model.
>
> Try open-llama-7b-open-instruct instead.

Hi @gjmulder, thank you for your comment and the suggestion! I see. I was wondering whether OpenLLaMA can handle QA tasks as mentioned in the paper; I have tested Vicuna-7B with the same questions and it performs better and is more stable. I will certainly also take a look at open-llama-7b-open-instruct, thank you!

gjmulder commented 1 year ago

Vicuna-7B is an instruction fine-tuned model.