parikshitsaikia1619 opened this issue 5 months ago
Mark
Bump
It's not super well documented, but you just need to pass "--max-lora-rank 64" (or whatever rank you need) when serving, since the default is 16.

python -m vllm.entrypoints.openai.api_server --max-lora-rank 64 \
    --model model_name \
    --enable-lora \
    --lora-modules lora-name=lora_path
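For reference, once the server is up, a request goes through the adapter by naming the LoRA module instead of the base model. A minimal sketch, assuming the default host/port and that "lora-name" matches what was passed to --lora-modules (the prompt and token count are just placeholders):

import requests

# Hit the OpenAI-compatible completions endpoint; using the LoRA module
# name as the "model" routes the request through that adapter.
resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={"model": "lora-name", "prompt": "Hello", "max_tokens": 32},
)
print(resp.json()["choices"][0]["text"])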
Thanks for the answer, it helped me as well. For those using the Python API directly, the equivalent is:

import torch
from vllm import LLM

llm = LLM(
    model=args.model, tensor_parallel_size=torch.cuda.device_count(),
    dtype=args.dtype, trust_remote_code=True, enable_lora=True,
    max_lora_rank=64,  # default is 16
)
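To actually generate through the adapter offline, you pass a LoRARequest to generate(). A minimal sketch, assuming the adapter weights live at a local path ("my-adapter", the ID, and the path are placeholders):

from vllm import SamplingParams
from vllm.lora.request import LoRARequest

# The LoRARequest takes an adapter name, a unique integer ID, and the
# local path to the adapter weights.
outputs = llm.generate(
    ["Hello, world"],
    SamplingParams(max_tokens=32),
    lora_request=LoRARequest("my-adapter", 1, "/path/to/lora_adapter"),
)
print(outputs[0].outputs[0].text)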
Both answers work for me, up to rank 64. Rank > 64 is not supported yet.
See #3934
Can we get LoRA rank > 64 supported and merged?
Edit: I'm also curious whether capping the rank at 64 was by design; if so, please let me know.
Bump. I need adapters that are much, much larger to be supported. Thanks
Is there something special about LoRA rank > 64? I'm wondering why only ranks <= 64 are supported.
ValueError: LoRA rank 64 is greater than max_lora_rank 16.