mit-han-lab / llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
MIT License
2.55k stars 207 forks

No video inference code #227

Open Closertodeath opened 1 month ago

Closertodeath commented 1 month ago

The only instance of video inference seems to be in the local Gradio demo. Is there video inference code that I'm missing? I want to run VILA video inference from the CLI.
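In the meantime, one way to adapt the demo for CLI use is to handle the video-specific step yourself: sample a fixed number of frames from the clip and pass them to the model as images. Below is a minimal, hypothetical sketch of uniform frame-index sampling, the way VILA-style video pipelines commonly select frames. The function name and the choice of sampling strategy are assumptions for illustration, not the repo's actual API.

```python
def sample_frame_indices(num_total_frames: int, num_samples: int) -> list[int]:
    """Pick `num_samples` frame indices spread uniformly across a video.

    If the video has fewer frames than requested, return every frame index.
    Each sampled index is taken from the center of its segment, so the
    selection covers the whole clip without clustering at the start.
    """
    if num_total_frames <= num_samples:
        return list(range(num_total_frames))
    step = num_total_frames / num_samples
    # Center-of-segment sampling: segment i covers [step*i, step*(i+1)).
    return [int(step * i + step / 2) for i in range(num_samples)]
```

The resulting indices could then be used to seek and decode frames (e.g. with OpenCV's `cv2.VideoCapture` and `CAP_PROP_POS_FRAMES`) before handing the images to the model, mirroring what the Gradio demo does interactively.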