
[Feature]: Serving VLM VILA #10889

Open anhnhust opened 11 hours ago

anhnhust commented 11 hours ago

🚀 The feature, motivation and pitch

Hello,

I want to deploy the VILA model (https://github.com/NVlabs/VILA) for serving. Could you please guide me on how to get started? Are there any specific instructions I should follow, or tools I should use, to set up the serving environment?

Alternatives

No response

Additional context

No response

DarkLight1337 commented 10 hours ago

Does this model have the same architecture as LLaVA? If its architecture isn't one that's already implemented in vLLM, you'll have to follow this guide to add support yourself.
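
If VILA does share LLaVA's architecture, the existing LLaVA support may work directly. Here is a minimal sketch of how one might check what architecture a checkpoint declares; the Hugging Face repo id Efficient-Large-Model/VILA1.5-3b is an assumption (it is not confirmed in this thread), so substitute whichever VILA checkpoint you actually use:

```python
from transformers import AutoConfig

# Read the architecture class names declared in the checkpoint's config.json.
# The repo id below is an assumption; replace it with your VILA checkpoint.
config = AutoConfig.from_pretrained(
    "Efficient-Large-Model/VILA1.5-3b",
    trust_remote_code=True,  # VILA repos may ship custom modeling code
)
print(config.architectures)
```

If the printed architecture is one vLLM already implements (e.g. LlavaForConditionalGeneration), the checkpoint can be served with the standard OpenAI-compatible entry point, `vllm serve <repo-id>`; otherwise a new model implementation is needed, as described in the guide.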