signine opened this issue 2 months ago
Hi @signine , thanks for your interest in VILA and AWQ. The newest quantized VILA-1.5 models are supported with TinyChat. Please refer to this page for instructions. You may find the usage section and VLM section helpful. Thank you!
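In case it is useful, here is a minimal sketch of the first step, assuming `huggingface_hub` is installed: it fetches one of the quantized checkpoints (the 3B repo linked later in this thread) to a local folder that the TinyChat demo scripts can point at. The launch flags themselves are documented in the TinyChat README.

```python
# Minimal sketch: download an AWQ-quantized VILA-1.5 checkpoint locally.
# Assumes `pip install huggingface_hub`; other sizes (e.g. 13b/40b) follow
# the same repo naming pattern on the Efficient-Large-Model org.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="Efficient-Large-Model/VILA1.5-3b-AWQ")
print(local_dir)  # pass this path to the TinyChat demo (see its README for flags)
```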
@ys-2020, `vlm_demo_new.py` in tinychat does not support video files.
@ys-2020, by referring to https://github.com/Efficient-Large-Model/VILA/blob/main/llava/eval/run_vila.py and https://github.com/mit-han-lab/llm-awq/blob/main/tinychat/serve/gradio_web_server.py, I successfully ran inference with video input in https://github.com/mit-han-lab/llm-awq/blob/main/tinychat/vlm_demo_new.py, but the output quality is noticeably worse than the online demo's with the same parameters (VILA1.5-40b-AWQ; temperature 1.0; top-p 1.0; num_frames 8). Do you have any ideas?
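For reference, a minimal sketch of the uniform frame sampling I assumed when wiring up the video input (using OpenCV and Pillow; whether the online demo samples frames the same way is an assumption, and a different sampling scheme alone could change the outputs even with identical generation parameters):

```python
# Minimal sketch: uniformly sample `num_frames` frames from a video and
# return them as RGB PIL images, matching what an image pipeline expects.
# The actual sampling logic in run_vila.py / vlm_demo_new.py may differ.
import cv2
from PIL import Image

def sample_frames(video_path: str, num_frames: int = 8) -> list:
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    if total <= 0:
        raise ValueError(f"could not read frame count from {video_path}")
    # Evenly spaced frame indices across the whole clip.
    indices = [int(i * total / num_frames) for i in range(num_frames)]
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if not ok:
            continue
        # OpenCV decodes to BGR; convert to RGB before wrapping in PIL.
        frames.append(Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)))
    cap.release()
    return frames
```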
Is it possible to run the AWQ models using the `run_vila.py` script? I ran the following command:

and got this error:
How can I run inference with the checkpoint here: https://huggingface.co/Efficient-Large-Model/VILA1.5-3b-AWQ/tree/main/llm?
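A quick way to see what that repo actually ships, as a minimal sketch assuming `huggingface_hub` is installed. The AWQ repos appear to contain TinyChat-format 4-bit weights rather than standard fp16 shards, which would explain why `run_vila.py` (which loads regular checkpoints) fails on them; the TinyChat route described above is presumably the supported path.

```python
# Minimal sketch: list the files in the AWQ checkpoint repo to see what
# weight formats it ships before attempting to load it.
from huggingface_hub import list_repo_files

for name in list_repo_files("Efficient-Large-Model/VILA1.5-3b-AWQ"):
    print(name)
```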