PKU-YuanGroup / Chat-UniVi

[CVPR 2024 Highlight🔥] Chat-UniVi: Unified Visual Representation Empowers Large Language Models with Image and Video Understanding
https://arxiv.org/abs/2311.08046
Apache License 2.0

Question about llama_flash_attn_monkey_patch #35

Closed: mmmwhy closed this issue 3 months ago

mmmwhy commented 4 months ago

https://github.com/PKU-YuanGroup/Chat-UniVi/blob/main/ChatUniVi/train/llama_flash_attn_monkey_patch.py differs from https://github.com/haotian-liu/LLaVA/blob/main/llava/train/llama_flash_attn_monkey_patch.py

for example:

https://github.com/PKU-YuanGroup/Chat-UniVi/blob/main/ChatUniVi/train/llama_flash_attn_monkey_patch.py (screenshot of the relevant code)

and

https://github.com/haotian-liu/LLaVA/blob/main/llava/train/llama_flash_attn_monkey_patch.py (screenshot of the relevant code)

It seems Chat-UniVi changed some of the code in llama_flash_attn_monkey_patch.py. Can you explain the reason for the modification? For context, both files follow the same monkey-patch pattern, sketched below. ♥️
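Here is a minimal sketch of that pattern, not either repo's exact code: the patch replaces `LlamaAttention.forward` at import time so training runs a flash-attention forward without forking transformers. The stub body below just delegates to the original forward so the example stays runnable.

```python
# Minimal sketch of the monkey-patch pattern both files use (assumes the
# transformers library is installed; the real patches replace the body
# with a flash-attention implementation rather than delegating).
import transformers
from transformers.models.llama.modeling_llama import LlamaAttention

_orig_forward = LlamaAttention.forward

def forward(self, *args, **kwargs):
    # A real patch computes q/k/v here and calls the flash-attention kernel;
    # this stub simply delegates so the sketch remains runnable as-is.
    return _orig_forward(self, *args, **kwargs)

# Swap the method on the class: every LlamaAttention instance created by
# the model now uses the patched forward.
transformers.models.llama.modeling_llama.LlamaAttention.forward = forward
```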

jpthu17 commented 4 months ago

We use standard multi-head attention. Since LLaMA 3 uses grouped-query attention, we guess that LLaVA made its changes to follow LLaMA 3. (The main purpose of grouped-query attention is to reduce the size of the KV cache.)
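For reference, here is a minimal sketch of the key difference, mirroring the `repeat_kv` helper in HuggingFace's LLaMA modeling code rather than either repo's training code: grouped-query attention stores fewer K/V heads in the cache and repeats them across query groups at compute time, while standard multi-head attention keeps one K/V head per query head and needs no repetition.

```python
# Sketch of grouped-query attention's KV expansion (assumed shapes follow
# HuggingFace's LLaMA convention: batch, num_kv_heads, seq_len, head_dim).
import torch


def repeat_kv(hidden: torch.Tensor, n_rep: int) -> torch.Tensor:
    """Expand (batch, num_kv_heads, seq_len, head_dim) to
    (batch, num_kv_heads * n_rep, seq_len, head_dim)."""
    batch, num_kv_heads, seq_len, head_dim = hidden.shape
    if n_rep == 1:
        # Standard multi-head attention: one KV head per query head,
        # so there is nothing to repeat (this is the Chat-UniVi case).
        return hidden
    hidden = hidden[:, :, None, :, :].expand(
        batch, num_kv_heads, n_rep, seq_len, head_dim
    )
    return hidden.reshape(batch, num_kv_heads * n_rep, seq_len, head_dim)


# Example: 32 query heads sharing 8 KV heads (4 query heads per group).
# The KV cache only holds 8 heads, cutting its memory by 4x.
num_heads, num_kv_heads, seq_len, head_dim = 32, 8, 16, 128
k = torch.randn(1, num_kv_heads, seq_len, head_dim)
k_expanded = repeat_kv(k, num_heads // num_kv_heads)
print(k.shape, "->", k_expanded.shape)  # (1, 8, 16, 128) -> (1, 32, 16, 128)
```

This `repeat_kv` step is the kind of change LLaVA's patch needs and a pure multi-head patch does not, which would explain why the two files diverge.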