yuanzhoulvpi2017 / zero_nlp

Chinese NLP solutions (large models, data, models, training, inference)
MIT License

train_llava: building the model fails #191

Closed weiaicunzai closed 2 weeks ago

weiaicunzai commented 3 weeks ago

Thanks for open-sourcing this. Following your steps, I wrote my own version and tested it with an image, but the model replied that it doesn't understand what I'm asking. What could be causing this?

The prompt I pass in prints as follows:

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
<image>
What are these?<|im_end|>
<|im_start|>assistant

And LLaVA's response is: I'm sorry, but I don't understand what you're asking. Can you please provide more context

Here is my test code:

from PIL import Image
from transformers import (
    AutoTokenizer,
    LlavaForConditionalGeneration,
    LlavaProcessor,
)


def test():
    model_path = '/mnt/dolphinfs/ssd_pool/docker/user/hadoop-mlm/by/train_llava/pretrained_model/model001'

    # Load the processor, tokenizer and model from the trained checkpoint.
    llava_processor = LlavaProcessor.from_pretrained(model_path)
    llava_tokenizer = AutoTokenizer.from_pretrained(model_path)
    llava_model = LlavaForConditionalGeneration.from_pretrained(model_path)

    prompt_text = "<image>\nWhat are these?"

    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": prompt_text},
    ]

    # Render the chat messages into a prompt string via the chat template.
    prompt = llava_processor.tokenizer.apply_chat_template(
        messages, tokenize=False, template="{role}\n{content}", add_generation_prompt=True
    )

    image_path = "/mnt/dolphinfs/ssd_pool/docker/user/hadoop-mlm/by/train_llava/000000039769.jpg"
    image = Image.open(image_path)

    inputs = llava_processor(text=prompt, images=image, return_tensors="pt")

    # print(inputs)
    # for tk in inputs.items():
        # print(tk)
    # Move every input tensor to the model's device.
    for tk in inputs.keys():
        inputs[tk] = inputs[tk].to(llava_model.device)

    generate_ids = llava_model.generate(**inputs, max_new_tokens=20)
    # print(generate_ids)

    # Decode the generated ids, keeping special tokens for inspection.
    gen_text = llava_processor.batch_decode(
        generate_ids, skip_special_tokens=False, clean_up_tokenization_spaces=False
    )[0]

    print(gen_text)
yuanzhoulvpi2017 commented 3 weeks ago

Hi, I don't quite understand why you're passing template="{role}\n{content}" here:

Please keep it consistent with my code~

prompt = llava_processor.tokenizer.apply_chat_template(
        messages, tokenize=False, template="{role}\n{content}", add_generation_prompt=True
    )

Also, you could try the model I trained earlier and run inference with this code of yours~ As for the exact differences, I'm afraid you'll have to compare them yourself.
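(For reference, a minimal sketch of the call without the extra kwarg; this is my own illustration, not necessarily the repo's exact code. For a ChatML-style model such as Qwen1.5-Chat, the template is read from the tokenizer's chat_template, so nothing else needs to be passed:)

prompt = llava_processor.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)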

weiaicunzai commented 3 weeks ago

Thanks for the reply. The difference from your code is that I used Qwen1.5 0.5B (Qwen/Qwen1.5-0.5B-Chat) and CLIP ViT-B/32 (openai/clip-vit-base-patch32).

After I removed template="{role}\n{content}" from my code, it still replies with "I'm sorry, but I don't understand what you're asking. Can you please provide more context".

With your code (but using the models I chose, not the ones in your code), both with and without template="{role}\n{content}", the reply is the same: I'm sorry, but I don't understand what you're asking. Can you please provide more context
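(For what it's worth, a quick way to confirm the kwarg makes no difference to the rendered prompt; this is a sketch of my own, reusing the messages list from the test code above:)

with_kwarg = llava_processor.tokenizer.apply_chat_template(
    messages, tokenize=False, template="{role}\n{content}", add_generation_prompt=True
)
without_kwarg = llava_processor.tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
# If the two strings are identical, the kwarg is not what the model reacts to;
# the rendered prompt is controlled by the tokenizer's chat_template instead.
print(with_kwarg == without_kwarg)
print(without_kwarg)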

weiaicunzai commented 2 weeks ago

I have fixed this bug.

It turned out I had accidentally copied CLIP's tokenizer_config.json over earlier, and CLIP's tokenizer_config.json does not contain the chat_template field.
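(For anyone hitting the same thing, a quick sanity check; my own sketch, using the chat_template attribute from transformers:)

from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained(model_path)
# If this prints None, the ChatML template is missing from tokenizer_config.json,
# and apply_chat_template will fall back to a default template or raise (depending
# on the transformers version), so the <|im_start|>/<|im_end|> markers the model
# was trained on never appear in the prompt.
print(tok.chat_template)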

After correcting it, the model answers normally, e.g.: I'm sorry, but I cannot describe an image as I am a text-based AI language model and