VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
Apache License 2.0
Video inference reports an error: ValueError: Unable to create tensor, you should probably activate padding with 'padding=True' to have batched tensors with the same length. #78
Same here when running:
python -W ignore llava/eval/run_vila.py --model-path Efficient-Large-Model/Llama-3-VILA1.5-8b --conv-mode llama_3 --query "<image>\n Please describe the traffic condition." --image-file "demo_images/av.png"
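For context on what the error message itself is describing: a batch of token-id sequences can only become a single tensor if every sequence has the same length, and padding is the usual way to get there. The sketch below shows that idea in plain Python. It is a generic illustration, not a confirmed diagnosis of this issue or a patch for run_vila.py; `pad_batch`, `is_rectangular`, the pad id 0, and the token ids are all hypothetical names and values chosen for the example.

```python
from typing import List, Tuple


def is_rectangular(seqs: List[List[int]]) -> bool:
    """True if every sequence has the same length (stackable into one tensor)."""
    return len({len(s) for s in seqs}) <= 1


def pad_batch(
    seqs: List[List[int]], pad_id: int = 0
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad each sequence to the longest length in the batch.

    Returns (input_ids, attention_mask); the mask is 1 for real tokens
    and 0 for padding. pad_id=0 is an assumption made for this example.
    """
    max_len = max((len(s) for s in seqs), default=0)
    input_ids = [s + [pad_id] * (max_len - len(s)) for s in seqs]
    attention_mask = [[1] * len(s) + [0] * (max_len - len(s)) for s in seqs]
    return input_ids, attention_mask


# Two made-up token-id sequences of different lengths.
ragged = [[101, 7592, 102], [101, 102]]
padded, mask = pad_batch(ragged)

print(is_rectangular(ragged))  # ragged batch cannot be stacked as-is
print(is_rectangular(padded))  # padded batch has uniform length
```

If the inputs reaching the tokenizer are already a single, well-formed sequence, padding would not be the real problem, so it is worth checking what the script actually passes in (for example how the shell handles the `\n` in the `--query` string) before changing any padding settings.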