mit-han-lab / llm-awq

[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
MIT License

Can you provide examples code to run inference on video QA? #190

Open rebuttalpapers opened 1 month ago

rebuttalpapers commented 1 month ago

For the example on this page: https://github.com/mit-han-lab/llm-awq/tree/main/tinychat#usage, you can easily run inference on images:

```bash
python vlm_demo_new.py \
    --model-path VILA1.5-13b-AWQ \
    --quant-path VILA1.5-13b-AWQ/llm \
    --precision W4A16 \
    --image-file /PATH/TO/INPUT/IMAGE \
    --vis-image  # optional
```

However, how do you run video QA inference? Can you provide an example?
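In the meantime, one possible workaround is to sample frames from the video and feed each frame through the documented image pathway. The sketch below is only an assumption, not a documented llm-awq video API: the `sample_frames` helper is hypothetical, and `vlm_demo_new.py` is invoked per frame exactly as in the image example above, so any per-frame answers would have to be aggregated manually.

```python
# Hypothetical workaround: uniformly sample N frames from a video with
# OpenCV and run each through the documented image demo. This is NOT a
# documented llm-awq video API -- just a frame-sampling sketch.
import subprocess
import tempfile
from pathlib import Path

import cv2  # pip install opencv-python


def sample_frames(video_path: str, num_frames: int = 8) -> list[str]:
    """Uniformly sample frames, write them as JPEGs, return their paths."""
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    out_dir = Path(tempfile.mkdtemp(prefix="video_frames_"))
    paths = []
    for i in range(num_frames):
        # Seek to an evenly spaced frame index before reading.
        cap.set(cv2.CAP_PROP_POS_FRAMES, i * total // num_frames)
        ok, frame = cap.read()
        if not ok:
            break
        p = out_dir / f"frame_{i:03d}.jpg"
        cv2.imwrite(str(p), frame)
        paths.append(str(p))
    cap.release()
    return paths


for frame_path in sample_frames("/PATH/TO/INPUT/VIDEO.mp4"):
    # Reuses the image-QA entry point per frame (assumption: answers
    # across frames are then combined manually).
    subprocess.run([
        "python", "vlm_demo_new.py",
        "--model-path", "VILA1.5-13b-AWQ",
        "--quant-path", "VILA1.5-13b-AWQ/llm",
        "--precision", "W4A16",
        "--image-file", frame_path,
    ], check=True)
```

Note that `vlm_demo_new.py` is an interactive demo, so a scripted per-frame loop like this may need adapting; an official video QA example from the maintainers would still be preferable.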

wublubdubdaxml commented 6 days ago

It seems that AWQ cannot run on video. [Translated from Chinese]