For this to work, the vllm docker needs --lora-modules name1=/path/to/adapter1 name2=hfuser/adapter2.
So the adapter path can be either a local path or a Hugging Face model ID.
Since there's no way to mount files, a simple approach would be to allow adding adapters from Hugging Face like name2=hfuser/adapter2, which the vllm docker then downloads from Hugging Face automatically.
When the inference request contains model=name3, the vllm OpenAI docker downloads the corresponding LoRA adapter from Hugging Face and loads it.
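For illustration, here is a minimal request against a vllm OpenAI-compatible server that has an adapter registered as name2 via --lora-modules; the base URL and api_key are assumptions for a local deployment, and any OpenAI-compatible client works the same way:

```python
from openai import OpenAI

# Point the client at the vllm OpenAI-compatible server (URL/key are assumptions).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# "name2" is the adapter name registered via --lora-modules name2=hfuser/adapter2;
# vllm serves the request through that LoRA adapter instead of the base model.
response = client.chat.completions.create(
    model="name2",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```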
Yeah, for the "standard" vllm docker image you can load it with --lora-modules, but how do you do that with the RunPod serverless vllm worker image? There you can't add anything to the CMD of the docker; the only thing I can do is add environment variables, and there is no option to set the lora-modules. That's as far as I understand it.
When trying to configure a LoRA adapter, the env vars for enabling LoRA and other settings are exposed (also in the RunPod UI), but there is no option for adding the actual lora-modules (paths to the LoRA adapters or Hugging Face links).
In src/engine.py, the class OpenAIvLLMEngine has the option for adding these lists (lines 137 and 145).
As far as I saw on the vLLM GitHub page, the list should look like this:
lora_modules: Optional[List[LoRAModulePath]]
Without these lora-modules, all the other LoRA settings seem to be useless.
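To make the request concrete, here is a rough sketch of what parsing such a list from an environment variable could look like before handing it to OpenAIvLLMEngine. The env var name LORA_MODULES is made up for illustration, and the import location and field names of LoRAModulePath have shifted between vLLM versions, so treat this as an assumption rather than the actual worker implementation:

```python
import os

# Assumption: the import location varies across vLLM versions (it has lived in
# vllm.entrypoints.openai.serving_engine, among other modules) - adjust to your version.
from vllm.entrypoints.openai.serving_engine import LoRAModulePath


def lora_modules_from_env() -> list[LoRAModulePath] | None:
    """Parse e.g. LORA_MODULES="name1=/path/to/adapter1 name2=hfuser/adapter2"
    into the list that lora_modules: Optional[List[LoRAModulePath]] expects."""
    raw = os.environ.get("LORA_MODULES", "").strip()
    if not raw:
        return None
    modules = []
    for entry in raw.split():
        name, path = entry.split("=", 1)
        # Field name is "path" in recent vLLM versions ("local_path" in older ones).
        modules.append(LoRAModulePath(name=name, path=path))
    return modules
```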