Open michael-heinrich opened 1 month ago
Fall back to LlamaAttention instead of LlamaFlashAttention2 when flash attention is not supported by the GPU architecture. PR for #41
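A minimal sketch of the selection logic this PR describes. The function name, the string return values, and the capability threshold (FlashAttention 2 kernels generally require Ampere-class GPUs, i.e. compute capability 8.0 or newer) are assumptions for illustration, not the PR's actual implementation:

```python
def pick_attention_class(compute_capability: tuple[int, int],
                         flash_available: bool) -> str:
    """Pick the attention implementation for the detected GPU.

    compute_capability: (major, minor) as reported by the CUDA runtime.
    flash_available: whether the flash-attn package is importable.
    """
    major, _minor = compute_capability
    # Assumption: FlashAttention 2 requires compute capability >= 8.0
    # (Ampere or newer); older architectures fall back to the plain
    # PyTorch attention implementation.
    if flash_available and major >= 8:
        return "LlamaFlashAttention2"
    return "LlamaAttention"
```

For example, a Turing GPU (compute capability 7.5) would get `LlamaAttention` even with flash-attn installed, while an A100 (8.0) would get `LlamaFlashAttention2`.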