rhymes-ai / Aria

Codebase for Aria - an Open Multimodal Native MoE
Apache License 2.0

Plan to support latest vLLM? #76

Open thanhnguyentung95 opened 3 hours ago

thanhnguyentung95 commented 3 hours ago

The following check in vLLM 0.6.2 prevents us from serving LoRA adapters on a per-request basis, because vLLM does not support LoRA and multimodal models simultaneously:

        # vLLM 0.6.2 rejects the LoRA + multimodal combination at model load time
        if self.lora_config:
            assert supports_lora(self.model), "Model does not support LoRA"
            assert not supports_multimodal(
                self.model
            ), "To be tested: Multi-modal model with LoRA settings."

Do you have a plan to upgrade the vLLM version for Aria?
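
For reference, this is roughly the per-request LoRA flow I would like to get working, as a minimal sketch using vLLM's offline API. The adapter name and path are hypothetical placeholders; on vLLM 0.6.2 this fails at model load time on the assertion quoted above, before any request is served:

    # Sketch of per-request LoRA serving with vLLM's offline API.
    # On vLLM 0.6.2, enable_lora=True combined with a multimodal model
    # trips the assertion quoted above during model loading.
    from vllm import LLM, SamplingParams
    from vllm.lora.request import LoRARequest

    llm = LLM(
        model="rhymes-ai/Aria",
        trust_remote_code=True,
        enable_lora=True,  # required for per-request adapters
    )

    outputs = llm.generate(
        ["Hello, who are you?"],
        SamplingParams(max_tokens=64),
        # The (name, int id, local path) triple identifies which adapter
        # to apply for this request; the path is a hypothetical placeholder.
        lora_request=LoRARequest("my_adapter", 1, "/path/to/lora_adapter"),
    )
    print(outputs[0].outputs[0].text)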

xffxff commented 3 hours ago

@thanhnguyentung95 Yes, we can upgrade vLLM. I'll take a look at this.

xffxff commented 2 hours ago

Hi @thanhnguyentung95, you can try upgrading vLLM in your local environment and continue development first. There are several indirect dependencies (such as PyTorch and transformers), so I'll need some extra time to test the upgrade thoroughly and make sure it doesn't break any existing Aria functionality. A quick way to record which versions you're testing against is sketched below.
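
If you do upgrade locally, a minimal snippet to capture the versions of vLLM and the indirect dependencies mentioned above (no specific version pins are implied here):

    # Print the versions of vLLM and its heavier indirect dependencies so a
    # local upgrade can be reported and reproduced. This snippet implies no
    # particular version constraints.
    import torch
    import transformers
    import vllm

    print(f"vllm={vllm.__version__}")
    print(f"torch={torch.__version__}")
    print(f"transformers={transformers.__version__}")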