NVlabs / VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
Apache License 2.0
972 stars 68 forks

added functionality to process a bunch of videos at a time #75

Closed by poorfrombabylon 2 weeks ago