[Closed] double-fire-0 closed this issue 2 weeks ago
Looking at the Hugging Face MiniCPM-V code, I noticed that `attention_mask` is `None` when the Llama 3 `forward` function is called.
This works fine with batch size 1, but it does not seem correct when batch size > 1, since padded positions are then attended to as if they were real tokens.
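A minimal sketch (plain NumPy, not the actual MiniCPM-V or Llama 3 code) of why a `None` attention mask is harmless for a single unpadded sequence but wrong once a batch contains padding:

```python
import numpy as np

def attention(q, k, v, mask=None):
    # Scaled dot-product attention; mask is additive (-1e9 on padded keys).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    if mask is not None:
        scores = scores + mask
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
B, T, D = 2, 4, 8
x = rng.normal(size=(B, T, D))

# Sequence 0 is full length; sequence 1 has only 2 real tokens, 2 padding.
pad = np.array([[1, 1, 1, 1],
                [1, 1, 0, 0]], dtype=np.float64)
mask = (1.0 - pad)[:, None, :] * -1e9  # broadcast over query positions

out_masked = attention(x, x, x, mask)
out_unmasked = attention(x, x, x, None)

# With no padding (the batch-size-1 case), the mask changes nothing...
assert np.allclose(out_masked[0], out_unmasked[0])
# ...but for the padded sequence, even the real tokens' outputs change,
# because attention weight leaks onto the padding positions.
assert not np.allclose(out_masked[1, :2], out_unmasked[1, :2])
```

This is why the bug only surfaces at batch size > 1: batching forces padding, and without the mask the padded keys silently distort every real token's attention output.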