vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

Please add LoRA support for higher ranks and alpha values #2847

Open parikshitsaikia1619 opened 5 months ago

parikshitsaikia1619 commented 5 months ago

ValueError: LoRA rank 64 is greater than max_lora_rank 16.

SuperBruceJia commented 4 months ago

Mark

Peter-Devine commented 4 months ago

Bump

dspoka commented 4 months ago

It's not well documented, but you need to pass "--max-lora-rank 64" (or whatever rank you need) when serving, since the default is 16.

    python -m vllm.entrypoints.openai.api_server \
        --model model_name \
        --enable-lora \
        --lora-modules lora-name=lora_path \
        --max-lora-rank 64
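For reference, a minimal sketch of querying the served adapter afterwards, assuming the server above is running on the default port 8000 and the adapter was registered under the name lora-name (the prompt and names are placeholders, not anything from this issue):

    from openai import OpenAI

    # Point the client at the locally running vLLM OpenAI-compatible server.
    client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

    # Select the LoRA adapter by the name given to --lora-modules.
    completion = client.completions.create(
        model="lora-name",
        prompt="Hello, world",
        max_tokens=32,
    )
    print(completion.choices[0].text)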

spreadingmind commented 3 months ago

Thanks for the answer, it helped me as well. For those using the Python API directly, it would look like this:

    import torch
    from vllm import LLM

    # max_lora_rank defaults to 16; raise it to at least your adapter's rank.
    llm = LLM(
        model=args.model, tensor_parallel_size=torch.cuda.device_count(),
        dtype=args.dtype, trust_remote_code=True, enable_lora=True, max_lora_rank=64,
    )
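As a follow-up, a minimal sketch of actually applying the adapter at generation time with LoRARequest, reusing the llm object above; "my-adapter", the integer id, and lora_path are placeholders for your own adapter name, id, and local adapter directory:

    from vllm import SamplingParams
    from vllm.lora.request import LoRARequest

    # The adapter is chosen per request; its rank must not exceed max_lora_rank.
    outputs = llm.generate(
        ["Hello, world"],
        SamplingParams(max_tokens=32),
        lora_request=LoRARequest("my-adapter", 1, lora_path),  # lora_path: local adapter dir
    )
    print(outputs[0].outputs[0].text)
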
Napuh commented 3 months ago

Both answers work for me, up to rank 64. Rank > 64 is not supported yet.

See #3934
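Until higher ranks are supported, one way to fail fast is to check an adapter's rank before handing it to vLLM. A minimal sketch, assuming a PEFT-style adapter directory whose adapter_config.json stores the rank under the "r" key (the 64 limit below just mirrors the behaviour reported in this thread, it is not read from vLLM itself):

    import json
    from pathlib import Path

    def check_lora_rank(adapter_dir: str, max_supported: int = 64) -> int:
        """Read the adapter's rank from adapter_config.json and verify it is usable."""
        config = json.loads((Path(adapter_dir) / "adapter_config.json").read_text())
        rank = config["r"]  # PEFT stores the LoRA rank under "r"
        if rank > max_supported:
            raise ValueError(f"LoRA rank {rank} exceeds the supported maximum of {max_supported}")
        return rank

    # Example: rank = check_lora_rank(lora_path); pass max_lora_rank=rank when constructing LLM.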

patrickrho commented 1 month ago

Can we get LoRA rank > 64 supported and merged?

edit: I'm also curious whether supporting only ranks up to 64 was by design; if so, please let me know.

kevinjesse commented 1 month ago

Bump. I need adapters that are much, much larger to be supported. Thanks

jiangjin1999 commented 2 weeks ago

Is there something special about LoRA rank > 64? I wonder why only ranks <= 64 are supported.