meta-llama / llama

Inference code for Llama models

The response from meta-llama/Llama-2-7b-chat-hf ends with an incomplete sentence during inference. #1088

Open YanjingRen opened 7 months ago

YanjingRen commented 7 months ago

I loaded meta-llama/Llama-2-7b-chat-hf onto the GPU and tried to get a response to a question. Here is the key part of the code:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def load_model(model_name, bnb_config):
    n_gpus = torch.cuda.device_count()
    max_memory = f'{40960}MB'

    # Load the model, passing the token as an argument
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=bnb_config,
        use_auth_token=huggingface_token,
        device_map="auto",  # dispatch the model efficiently across the available resources
        max_memory={i: max_memory for i in range(n_gpus)},
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name, use_auth_token=True, add_eos_token=True, use_fast=False, trust_remote_code=True)

    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.pad_token_id = 18610
    tokenizer.padding_side = "right"  # Fix weird overflow issue with fp16 training

    return model, tokenizer

model_name = "meta-llama/Llama-2-7b-chat-hf" 
bnb_config = create_bnb_config()
model, tokenizer = load_model(model_name, bnb_config)
B_S = "<s>"
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"
SPECIAL_TAGS = [B_INST, E_INST, "<<SYS>>", "<</SYS>>"]
system_prompt = "You are an helpful AI assistant, please answer this question:"
user_message = "How to achieve high grade in math for a first year student in high school?"
prompt = f"{B_S}{B_INST}{B_SYS}{(system_prompt).strip()} {E_SYS} {(user_message).strip()}{E_INST}\n\n"
input_ids = tokenizer(prompt, return_tensors="pt",return_attention_mask=False).input_ids.to('cuda')
outputs = model.generate(input_ids, max_length=512) 
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print("Before fine-tuning the response is: ",response)

The output is below:

[INST]<<SYS>> You are an helpful AI assistant, please answer this question: <</SYS>>

How to achieve high grade in math for a first year student in high school?[/INST]

  1. Practice consistently: Regular and consistent practice is essential to improve in math. Set aside a specific time each day to practice solving math problems, even if it's just for 15-20 minutes. You can use worksheets, online resources, or practice tests to help you.

  2. Understand the basics: Make sure you have a solid understanding of basic math concepts such as fractions, decimals, percentages, algebra, and geometry. Review these basics regularly, and practice working with simple problems to build your confidence.

  3. Break down problems: When solving math problems, break them down into smaller, manageable steps. This will help you understand the problem better and make it easier to solve.

  4. Seek help when needed: Don't be afraid to ask for help when you're struggling with a math concept or problem. You can ask your teacher, tutor, or classmate for assistance.

  5. Watch video tutorials: Watching video tutorials can help you visualize math concepts and problems better. You can find plenty of math video tutorials on websites such as Khan Academy, Mathway, or MIT OpenCourseWare.

  6. Take your time: Don't rush through math problems. Take your time to read the problem carefully, understand it, and work through it step by step.

  7. Use visual aids: Visual aids such as graphs, charts, and diagrams can help you understand complex math concepts better. Use them to visualize the problem and find a solution.

  8. Practice with real-world examples: Try to relate math concepts to real-world examples. This will help you understand how math is used in everyday life and make it more interesting.

  9. Stay organized: Keep all your math materials organized, including worksheets, notes, and textbooks. This will help you find what you need quickly and avoid wasting time searching for materials.

  10. Review regularly: Review math concepts regularly, even after you think you understand them. This will help you retain the information and avoid

Why does the response end here, in the middle of a sentence? How can I solve this? Thank you!

YanjingRen commented 7 months ago

Can anybody help with this question? Thank you so much, I appreciate it!

Yunhao-Liu commented 4 months ago

outputs = model.generate(input_ids, max_length=512) sets the maximum generation length to 512. Generation stops automatically once that length is reached.
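To make the limit concrete: in Hugging Face transformers, max_length counts the prompt tokens plus the generated tokens, while max_new_tokens counts only the generated ones. The sketch below is a rough illustration, not a tested fix for this exact setup. The helper names, the 60-token prompt length, the sample token ids, and the default of 1024 new tokens are all assumptions made up for the example.

```python
def new_token_budget(prompt_len: int, max_length: int) -> int:
    """Tokens left for the reply when max_length caps prompt + reply."""
    return max(max_length - prompt_len, 0)


def hit_length_limit(output_ids, eos_token_id: int) -> bool:
    """True if the sequence does not end with EOS, so it was likely cut off."""
    return len(output_ids) == 0 or output_ids[-1] != eos_token_id


def generate_reply(model, input_ids, max_new_tokens: int = 1024):
    """Cap only the newly generated tokens, independent of prompt length.

    `model` is expected to be a transformers causal LM; 1024 is an
    arbitrary illustrative default, not a recommended value.
    """
    return model.generate(input_ids, max_new_tokens=max_new_tokens)


# Hypothetical numbers: a 60-token prompt under max_length=512
print(new_token_budget(60, 512))

# Hypothetical token ids, assuming an EOS id of 2
print(hit_length_limit([1, 306, 626], eos_token_id=2))
print(hit_length_limit([1, 306, 2], eos_token_id=2))
```

In the original script this would amount to replacing max_length=512 with a max_new_tokens value large enough for the full answer, then checking whether outputs[0] ends with tokenizer.eos_token_id to see if the model finished on its own.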