NVlabs VILA issues - Githubissues

NVlabs / VILA

VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)

Apache License 2.0

899 stars, 60 forks

issues

Update README.md

#85 hongxuyin opened 9 hours ago
0
docs: update README.md

#84 eltociear opened 14 hours ago
0
How VILA can handle 8 frames from videos?

#83 KangsanKim07 opened 22 hours ago
1
Problem training on zero2.json

#82 Davidup1 opened 23 hours ago
2
Update perception test eval script and results in README

#81 Xiuyu-Li closed 1 day ago
0
Whether this is a bug?

#80 jihaonew opened 1 day ago
2
Multi image inference quality

#79 oroojlooy opened 1 week ago
1
The inference video reports an error: ValueError: Unable to create tensor, you should probably activate padding with 'padding=True' to have batched tensors with the same length.

#78 changqinyao opened 1 week ago
2
Question about the output

#77 DwanZhang-AI opened 1 week ago
2
What is the --conv-mode of VILA1.5-13b?

#76 DwanZhang-AI closed 2 days ago
2
added functionality to process a bunch of videos at a time

#75 poorfrombabylon closed 2 weeks ago
0
OpenVLM leaderboard

#74 oroojlooy opened 2 weeks ago
1
VILA Context-length

#73 oroojlooy closed 2 weeks ago
2
Why setting LLaMa3's padding direction to "right"?

#72 ROIM1998 opened 4 weeks ago
1
Bug in conversation.py

#71 zhang-jr opened 1 month ago
1
Finetuning

#70 RohanR04 closed 1 month ago
0
About VILADistributedSampler and gradient_accumulation_steps

#69 dreamerlin opened 1 month ago
1
Access to pretrained model weights

#68 zzxslp opened 1 month ago
3
VILA-1.5 details

#67 Lopa07 closed 1 month ago
4
How does the VILA preprocessed video?

#66 MonolithFoundation opened 1 month ago
1
Does S2 able to unfreeze vit to train?

#65 MonolithFoundation closed 2 weeks ago
1
Fix vision engine build

#64 meenchen closed 1 month ago
0
What is the LLM used for VILA 1.5 40B?

#63 javier-m closed 1 month ago
1
math dataset incomplete description

#62 hubenjm opened 1 month ago
2
YouCook2 code to generate video clips from raw videos?

#61 hubenjm opened 1 month ago
4
RuntimeError: GET was unable to find an engine to execute this computation

#60 pribadihcr opened 1 month ago
1
No module named 'llava.tf_utils'

#59 pribadihcr closed 2 weeks ago
5
Would you consider releasing code that supports lora training 40b model?

#58 Key-lei opened 1 month ago
1
When will new annotations files be available?

#57 hubenjm closed 1 month ago
1
"No module named llava"

#56 vedantroy closed 1 month ago
1
How's the DownSampleBlock performance compare with CAbstractor?

#55 lucasjinreal opened 1 month ago
3
Potential bug in mm_utils.py process_image function

#54 hubenjm opened 1 month ago
1
working with VLLM

#53 kousun12 opened 1 month ago
2
How to evaluate 4shot?

#52 leexinhao opened 1 month ago
0
Running the AWQ models

#51 signine opened 1 month ago
3
Provide ShareGPT4V filtered annotations file

#50 hubenjm opened 1 month ago
0
About perception testset

#49 mary-0830 opened 1 month ago
3
Inference not working - Keyword tensor should have 2 or 3 dimensions, got 1

#48 signine opened 1 month ago
5
demo_trt_llm/convert_checkpoint.py - AttributeError: 'LlavaLlamaConfig' object has no attribute 'num_attention_heads'

#47 dimakan closed 1 month ago
3
Hi, Have you compare with s2 [384, 768] scales versus interpolate to 768x768?

#46 OpenJarvisAI opened 1 month ago
6
Add support for GPUs with compute capability lower than 8.0 for awq/kernels installation

#45 rahulthakur319 closed 2 days ago
1
Fix for backwards compatibility

#44 michael-heinrich closed 2 days ago
0
fix: PR #40 other bug.

#43 SeanCraven314 closed 1 month ago
4
Request for middle checkpoint

#42 jihaonew opened 2 months ago
3
Easy backwards compatibility fix

#41 michael-heinrich opened 2 months ago
3
fix: Fix tensor shape error, during llava inference.

#40 SeanCraven314 closed 1 month ago
1
Llama-3-VILA1.5-8B Inference error

#39 joebradly opened 2 months ago
11
Updated paper on the latest model (video understanding, etc.)

#38 thecooltechguy opened 2 months ago
4
Chamfer distance's data source

#37 threegold116 closed 1 month ago
2
Instruction for VILA 1.5 with tinychat (llm-awq) doesn't work well due to fixed torch version (==2.0.1)

#36 gigony opened 2 months ago
5