Open y-rok opened 4 months ago
I believe the conv_llava_llama_3 template in conversation.py should be defined as follows:
```python
conv_llava_llama_3 = Conversation(
    system="You are a helpful language and vision assistant. "
           "You are able to understand the visual content that the user provides, "
           "and assist the user with a variety of tasks using natural language.",
    roles=("user", "assistant"),
    version="llama_v3",
    messages=[],
    offset=0,
    sep_style=SeparatorStyle.LLAMA_3,
    tokenizer_id="meta-llama/Meta-Llama-3-8B-Instruct",
    tokenizer=llama3_tokenizer,
    stop_token_ids=[128009],
)
```
roles=("user", "assistant"), instead of roles=("<|start_header_id|>user", "<|start_header_id|>assistant")" )
https://github.com/LLaVA-VL/LLaVA-NeXT/blob/inference/docs/LLaVA-NeXT.md
In this example, your code generates a double "<|start_header_id|>" in front of "user" in the prompt_question variable. Could you check whether there is a mistake in the code?
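For reference, the prompt_question below was built with the standard inference flow from the linked doc. A minimal sketch of that flow (the template name "llava_llama_3" and the exact imports are taken from the doc's example and may differ in your checkout):

```python
import copy

from llava.constants import DEFAULT_IMAGE_TOKEN  # "<image>"
from llava.conversation import conv_templates

# Build the conversation as in the inference example, then render it
# to a single prompt string with get_prompt().
conv = copy.deepcopy(conv_templates["llava_llama_3"])
question = DEFAULT_IMAGE_TOKEN + "\nWhat is shown in this image?"
conv.append_message(conv.roles[0], question)  # roles[0] is currently "<|start_header_id|>user"
conv.append_message(conv.roles[1], None)      # None marks the assistant turn to be generated
prompt_question = conv.get_prompt()
```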
Below is the value of prompt_question.
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nYou are a helpful language and vision assistant. You are able to understand the visual content that the user provides, and assist the user with a variety of tasks using natural language.<|eot_id|><|start_header_id|><|start_header_id|>user<|end_header_id|>\n\n<image>\nWhat is shown in this image?<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n
```
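The duplication seems to come from get_prompt() already wrapping the role in the Llama 3 header tokens under SeparatorStyle.LLAMA_3. A hypothetical simplification to illustrate the effect (not the actual conversation.py code):

```python
def render_turn(role: str, message: str) -> str:
    # Under SeparatorStyle.LLAMA_3, the header tokens are added around the
    # role itself, so the role string must be the bare name ("user"/"assistant").
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n{message}<|eot_id|>"

# Current template: role already carries the prefix -> doubled token
print(render_turn("<|start_header_id|>user", "<image>\nWhat is shown in this image?"))
# -> <|start_header_id|><|start_header_id|>user<|end_header_id|>...

# Proposed fix: bare role name -> well-formed Llama 3 prompt
print(render_turn("user", "<image>\nWhat is shown in this image?"))
# -> <|start_header_id|>user<|end_header_id|>...
```

With roles=("user", "assistant"), the rendered prompt would match the Llama 3 chat format exactly.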