microsoft / LLaVA-Med

Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.

Excuse me, I deployed LLaVA-Med locally; why is my answer only one word? #86

Closed ZG-yuan closed 1 month ago

ZG-yuan commented 2 months ago

Some weights of the model checkpoint at /mnt/disk/zgy/LLaVA-Med/LLaVA-Med-main/llava-med-v1.5-mistral-7b were not used when initializing LlavaMistralForCausalLM: ['model.vision_tower.vision_tower.vision_model.encoder.layers.19.mlp.fc1.weight', ..........]

USER: is heart CT?
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results. Setting pad_token_id to eos_token_id:2 for open-end generation.
ASSISTANT: Yes

USER: is leg CT?
(same attention mask / pad token warning repeated)
ASSISTANT: No

USER: What is it?
(same attention mask / pad token warning repeated)
ASSISTANT: This
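As an aside, the repeated transformers warning is separate from the one-word replies and can be silenced by passing the attention mask and pad token id explicitly. A minimal sketch, assuming a standard Hugging Face `generate()` call; `tokenizer`, `model`, and `prompt` here are illustrative names, not LLaVA-Med's actual inference code:

```python
# Minimal sketch (hypothetical variables): silencing both transformers
# warnings by passing the attention mask and pad token id explicitly.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output_ids = model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,  # addresses the attention-mask warning
    pad_token_id=tokenizer.eos_token_id,   # addresses the pad-token warning
    max_new_tokens=512,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Note this only cleans up the warnings; the one-word answers come from the conversation template, as explained below.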

Rickylht commented 1 month ago

You need to change the template in conversation.py: replace sep="" with sep="<s>". Otherwise the model will regard the blank space as the stop token and cut generation off after the first word. Then everything works. Strange bug solved 🕶️
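For reference, a sketch of what the edited template might look like, assuming a LLaVA-style conversation.py where Conversation and SeparatorStyle are defined in the same file; the exact template name and field values may differ in your checkout:

```python
# Sketch of the fix in conversation.py (template name is illustrative).
# With sep="" the stopping criterion matches the empty separator, so
# generation halts almost immediately; sep="<s>" avoids that.
conv_mistral_instruct = Conversation(
    system="",
    roles=("USER", "ASSISTANT"),
    version="llava_v1",
    messages=(),
    offset=0,
    sep_style=SeparatorStyle.LLAMA_2,
    sep="<s>",   # was sep="" -- the blank separator was treated as a stop token
    sep2="</s>",
)
```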

ZG-yuan commented 1 month ago

@Rickylht Thank you very much!