Efficient-Large-Model / VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
Apache License 2.0
878 stars, 55 forks

Fix for backwards compatibility #44

Open michael-heinrich opened 1 month ago

michael-heinrich commented 1 month ago

Choose LlamaAttention instead of LlamaFlashAttention2 when the GPU architecture does not support flash attention. PR for #41.
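For illustration only, here is a minimal sketch of the kind of check described above. It is not the code in this PR. The helper names `select_attention_name` and `detect_capability` are hypothetical, and the threshold assumes that FlashAttention-2 kernels need compute capability 8.0 (Ampere) or newer. The only library call used is `torch.cuda.get_device_capability()`, and it is imported lazily so the selection logic can be exercised without a GPU.

```python
from typing import Optional, Tuple

# Assumption: FlashAttention-2 needs compute capability 8.0 (Ampere) or newer.
MIN_FLASH_CAPABILITY: Tuple[int, int] = (8, 0)


def select_attention_name(
    capability: Optional[Tuple[int, int]],
    flash_attn_installed: bool = True,
) -> str:
    """Return the name of the attention class to use (hypothetical helper).

    Falls back to the plain LlamaAttention whenever flash attention is
    unavailable: no CUDA device, package missing, or GPU too old.
    """
    if capability is None or not flash_attn_installed:
        return "LlamaAttention"
    if tuple(capability) >= MIN_FLASH_CAPABILITY:
        return "LlamaFlashAttention2"
    return "LlamaAttention"


def detect_capability() -> Optional[Tuple[int, int]]:
    """Return (major, minor) of the current CUDA device, or None if unknown."""
    try:
        import torch  # imported lazily so the sketch runs without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()


if __name__ == "__main__":
    print(select_attention_name(detect_capability()))
```

With this sketch, a Turing card reporting `(7, 5)` would get `LlamaAttention`, while an Ampere card reporting `(8, 0)` or `(8, 6)` would keep `LlamaFlashAttention2`.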