mit-han-lab / llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
MIT License
2.56k stars, 208 forks

Inquiry about GPU memory usage of the VILA 1.5-3b AWQ model for a 12-frame video #240

Open gj-raza opened 1 week ago

gj-raza commented 1 week ago

Hi, if anyone has successfully run VILA 1.5-3b AWQ on video input, could you please share the VRAM consumption for a single 12-frame video clip? Thanks.