Open saikatscalers opened 5 days ago
@saikatscalers Feature support for mllama is still limited; have you tried the offline inference example first? You can see that it uses some special arguments here: https://github.com/vllm-project/vllm/blob/fd47e57f4b0d5f7920903490bce13bc9e49d8dba/examples/offline_inference_vision_language.py#L291-L296
python examples/offline_inference_vision_language.py -m mllama
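For reference, a minimal sketch of what that offline path looks like, loosely based on the mllama example linked above. The specific values for max_model_len, max_num_seqs, and enforce_eager are assumptions taken from that example and may need tuning for your hardware; the image path and prompt are placeholders.

```python
from vllm import LLM, SamplingParams
from PIL import Image

# Assumed settings mirroring the linked mllama example: the model's default
# max_model_len / max_num_seqs can OOM, so the example constrains them and
# enables eager mode.
llm = LLM(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    max_model_len=4096,
    max_num_seqs=16,
    enforce_eager=True,
)

image = Image.open("example.jpg")  # hypothetical local image
prompt = "<|image|><|begin_of_text|>What is shown in this image?"

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(temperature=0.2, max_tokens=128),
)
print(outputs[0].outputs[0].text)
```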
Thanks @mgoin,
We were able to run offline inference for the Llama 3.2 11B Vision Instruct model (mllama), but we want it to work with vLLM serving. Is there a way to do that as of now?
Thanks, @saikatscalers
@saikatscalers yeah, see https://github.com/vllm-project/vllm/issues/8826 as well for more info.
You need to set those arguments as CLI parameters to `vllm serve`.
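If it helps, here is a rough sketch of how that might look, assuming a local server on port 8000. The flag values mirror the assumed settings from the offline example above, and the OpenAI client call with a hypothetical image URL is just one way to exercise the endpoint.

```python
# Launch the server with the same constraints as the offline example
# (values are assumptions; tune for your GPU):
#
#   vllm serve meta-llama/Llama-3.2-11B-Vision-Instruct \
#       --max-model-len 4096 --max-num-seqs 16 --enforce-eager
#
# Then query the OpenAI-compatible endpoint:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What is shown in this image?"},
            # Hypothetical image URL for illustration only.
            {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}},
        ],
    }],
    max_tokens=128,
)
print(response.choices[0].message.content)
```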
Your current environment
System Specifications:
Model Input Dumps
N/A
🐛 Describe the bug
When Llama-3.2-11B-Vision-Instruct is run on the machine specified above using the latest vLLM OpenAI Docker container with this command:
We face this error: