vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Bug]: Multi-LoRA bug with different LoRAs #7169

Open devlup opened 3 months ago

devlup commented 3 months ago

Your current environment


In this example, https://github.com/vllm-project/vllm/blob/main/examples/multilora_inference.py, if you give two different LoRAs instead of the same LoRA under different names, the output comes from the first LoRA that was initialized.

### 🐛 Describe the bug

Give any two different LoRAs and run inference: the model is picking only the first LoRA in order, so the output for the second request also comes from the first LoRA:

```python
        (
            "[user] Write a SQL query to answer the question based on the table schema.\n\n context: CREATE TABLE table_name_74 (icao VARCHAR, airport VARCHAR)\n\n question: Name the ICAO for lilongwe international airport [/user] [assistant]",
            SamplingParams(temperature=0.0,
                           logprobs=1,
                           prompt_logprobs=1,
                           max_tokens=128,
                           stop_token_ids=[32003]),
            LoRARequest("sql-lora2", 2, lora_path)),
        (
            "my nam is",
            SamplingParams(n=3,
                           best_of=3,
                           use_beam_search=True,
                           temperature=0,
                           max_tokens=128,
                           stop_token_ids=[32003]),
            LoRARequest("sql-lora", 1, 'timdettmers/qlora-flan-7b')),
```
devlup commented 3 months ago

@Yard1 @jvmncs @rkooo567

github-actions[bot] commented 4 weeks ago

This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!