VILA-1.5 details - Githubissues

Efficient-Large-Model / VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Apache License 2.0

877 stars, 55 forks

Closed: Lopa07 closed this issue 1 month ago

Lopa07 commented 1 month ago

hkunzhe commented 1 month ago

You can look at the model configuration files on Hugging Face or the training code in the repository.
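For anyone unsure how to read those configuration files, here is a minimal sketch. It assumes you have already downloaded a model's `config.json` from its Hugging Face page to a local path. The helper names `flatten_config` and `load_config` are made up for illustration and are not part of the VILA codebase.

```python
import json
from pathlib import Path


def flatten_config(config, prefix=""):
    """Flatten a nested dict into {"outer.inner": value} pairs for easy reading."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested sections, extending the dotted prefix.
            flat.update(flatten_config(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def load_config(path):
    """Read a locally saved JSON config file and return it flattened."""
    text = Path(path).read_text(encoding="utf-8")
    return flatten_config(json.loads(text))
```

Calling `load_config("config.json")` and printing the sorted items gives a one-line-per-setting view that is easy to compare across model sizes. The sketch makes no assumption about which keys a VILA config contains.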

Lopa07 commented 1 month ago

Sorry, I cannot find these details. It would be very helpful if you could post this information here for better visibility.

yaolug commented 1 month ago

You can look at the training scripts under https://github.com/Efficient-Large-Model/VILA/tree/main/scripts/v1_5/release and refer to the original paper ( https://arxiv.org/pdf/2312.07533 ) for the technical details. We have now made Section 4.4 the default.

Lopa07 commented 1 month ago

Thank you both! This helped.