Efficient-Large-Model / VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
Apache License 2.0

Possibility to support Llama-3? #31

Closed hzhang57 closed 2 months ago

hzhang57 commented 2 months ago

Thanks for sharing this fantastic work. VILA shows stronger few-shot in-context learning ability than the original LLaVA-1.5. Do you plan to support Llama-3? Its advantages in in-context learning could be expected to carry over to vision.

Efficient-Large-Language-Model commented 2 months ago

Yeah, the VILA1.5 release we published today supports Llama-3.