yuanzhoulvpi2017 / zero_nlp

Chinese NLP solutions (large models, data, models, training, inference)
MIT License

train_llava: building the model fails #191

Closed weiaicunzai closed 2 weeks ago

weiaicunzai commented 3 weeks ago

Thanks for open-sourcing this. Following your steps, I wrote my own version and tested it with an image, but the model replied that it doesn't understand what I'm asking. What could be causing this?

The prompt I pass in prints as follows:

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
<image>
What are these?<|im_end|>
<|im_start|>assistant

And LLaVA's response is: I'm sorry, but I don't understand what you're asking. Can you please provide more context

Here is my test code:

from PIL import Image
from transformers import (
    AutoTokenizer,
    LlavaForConditionalGeneration,
    LlavaProcessor,
)


def test():
    model_path = '/mnt/dolphinfs/ssd_pool/docker/user/hadoop-mlm/by/train_llava/pretrained_model/model001'

    # Load the processor, tokenizer and model from the trained checkpoint.
    llava_processor = LlavaProcessor.from_pretrained(model_path)
    llava_tokenizer = AutoTokenizer.from_pretrained(model_path)
    llava_model = LlavaForConditionalGeneration.from_pretrained(model_path)

    prompt_text = "<image>\nWhat are these?"

    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt_text},
    ]

    # Render the chat messages into a prompt string via the chat template.
    prompt = llava_processor.tokenizer.apply_chat_template(
        messages, tokenize=False, template="{role}\n{content}", add_generation_prompt=True
    )

    image_path = "/mnt/dolphinfs/ssd_pool/docker/user/hadoop-mlm/by/train_llava/000000039769.jpg"
    image = Image.open(image_path)

    inputs = llava_processor(text=prompt, images=image, return_tensors="pt")

    # print(inputs)
    # for tk in inputs.items():
        # print(tk)
    # Move every input tensor to the model's device.
    for tk in inputs.keys():
        inputs[tk] = inputs[tk].to(llava_model.device)

    generate_ids = llava_model.generate(**inputs, max_new_tokens=20)
    # print(generate_ids)

    # Decode the generated ids, keeping special tokens for inspection.
    gen_text = llava_processor.batch_decode(
        generate_ids, skip_special_tokens=False, clean_up_tokenization_spaces=False
    )[0]

    print(gen_text)
yuanzhoulvpi2017 commented 3 weeks ago

Hi, I don't quite understand why you're passing template="{role}\n{content}" here:

Please keep it consistent with my code~

prompt = llava_processor.tokenizer.apply_chat_template(
        messages, tokenize=False, template="{role}\n{content}", add_generation_prompt=True
    )

Also, you could try the model I trained earlier and run inference with this code of yours~ As for the exact differences, I'm afraid you'll have to compare them yourself.
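(For reference, a minimal sketch of the call without the extra kwarg; this is my own illustration, not necessarily the repo's exact code. For a ChatML-style model such as Qwen1.5-Chat, the template is read from the tokenizer's chat_template, so nothing else needs to be passed:)

prompt = llava_processor.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)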

weiaicunzai commented 3 weeks ago

Thanks for the reply. The difference from your code is that I used Qwen1.5 0.5B (Qwen/Qwen1.5-0.5B-Chat) and CLIP ViT-B/32 (openai/clip-vit-base-patch32).

After I removed template="{role}\n{content}" from my code, it still replies with "I'm sorry, but I don't understand what you're asking. Can you please provide more context".

With your code (but using the models I chose, not the ones in your code), both with and without template="{role}\n{content}", the reply is the same: I'm sorry, but I don't understand what you're asking. Can you please provide more context
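(For what it's worth, a quick way to confirm the kwarg makes no difference to the rendered prompt; this is a sketch of my own, reusing the messages list from the test code above:)

with_kwarg = llava_processor.tokenizer.apply_chat_template(
    messages, tokenize=False, template="{role}\n{content}", add_generation_prompt=True
)
without_kwarg = llava_processor.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
# If the two strings are identical, the kwarg is not what the model reacts to;
# the rendered prompt is controlled by the tokenizer's chat_template instead.
print(with_kwarg == without_kwarg)
print(without_kwarg)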

weiaicunzai commented 2 weeks ago

I have fixed this bug.

It turned out I had accidentally copied CLIP's tokenizer_config.json over earlier, and CLIP's tokenizer_config.json does not contain the chat_template field.
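(For anyone hitting the same thing, a quick sanity check; my own sketch, using the chat_template attribute from transformers:)

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained(model_path)
# If this prints None, the ChatML template is missing from tokenizer_config.json,
# and apply_chat_template will fall back to a default template or raise (depending
# on the transformers version), so the <|im_start|>/<|im_end|> markers the model
# was trained on never appear in the prompt.
print(tok.chat_template)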

After correcting it, the model answers normally, e.g.: I'm sorry, but I cannot describe an image as I am a text-based AI language model and