mlc-ai / mlc-llm

Universal LLM Deployment Engine with ML Compilation
https://llm.mlc.ai/
Apache License 2.0

[Question] Multiple lora support. #2625

Open lumiere-ml opened 1 month ago

lumiere-ml commented 1 month ago

❓ General Questions

Hi, I did a quick review of this repo but didn't find anything about LoRA. Does this repo support LoRA serving, and, going further, multi-LoRA serving such as Punica or S-LoRA? If not, is there any plan to implement it?

Thanks

lhwlhw90 commented 1 month ago

I implemented simple multi-LoRA inference in mlc-llm, but it does not support batched inference, because I couldn't write the BGMV or SGMV function in Relax and TensorIR. SGMV is a dynamic graph.
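For reference, the BGMV (batched gather matrix-vector) operation from Punica that the comment refers to can be sketched in plain NumPy. This is an illustrative sketch only, not mlc-llm/Relax/TensorIR code; the function name and shapes are assumptions. The key point is that each request in the batch selects its own LoRA adapter by index, which is what makes the kernel awkward to express in a static graph:

```python
import numpy as np

def bgmv(x, lora_A, lora_B, indices, scaling=1.0):
    """Hypothetical BGMV sketch: per-request low-rank LoRA update.

    x:       (batch, d_in)            one token per request
    lora_A:  (n_adapters, d_in, r)    stacked LoRA down-projections
    lora_B:  (n_adapters, r, d_out)   stacked LoRA up-projections
    indices: (batch,)                 adapter id chosen by each request
    """
    batch = x.shape[0]
    d_out = lora_B.shape[2]
    y = np.zeros((batch, d_out), dtype=x.dtype)
    for i in range(batch):
        a = lora_A[indices[i]]            # gather this request's A: (d_in, r)
        b = lora_B[indices[i]]            # gather this request's B: (r, d_out)
        y[i] = scaling * (x[i] @ a) @ b   # low-rank delta for one request
    return y
```

A real kernel fuses the gather and the two small matmuls on the GPU; SGMV generalizes this to variable-length token segments per adapter, which is why the graph becomes dynamic.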

lhwlhw90 commented 1 month ago

I modified the prefill and decode functions to pass the LoRA weights into the layer forward, and then added new entries to the mlc-llm function table, so some model-specific hard-coding is needed.
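The modification described above amounts to threading the LoRA matrices through each linear layer's forward as extra inputs. A minimal NumPy sketch of that idea (assumed function and parameter names, not the actual mlc-llm layer code):

```python
import numpy as np

def linear_with_lora(x, W, lora_A=None, lora_B=None, scaling=1.0):
    """Hypothetical linear forward with an optional LoRA delta passed in.

    x:       (seq, d_in)   activations
    W:       (d_in, d_out) frozen base weight
    lora_A:  (d_in, r)     LoRA down-projection (optional)
    lora_B:  (r, d_out)    LoRA up-projection (optional)
    """
    y = x @ W                                        # base projection
    if lora_A is not None and lora_B is not None:
        y = y + scaling * (x @ lora_A) @ lora_B      # add LoRA delta: x A B
    return y
```

Because the delta is `scaling * A @ B`, this is mathematically equivalent to running the layer with a merged weight `W + scaling * A @ B`; keeping A and B separate is what lets the server swap adapters per request instead of re-merging weights.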

lumiere-ml commented 1 month ago

> I modified the prefill and decode functions to pass the LoRA weights into the layer forward, and then added new entries to the mlc-llm function table, so some model-specific hard-coding is needed.

You have done very good work. Do you have a pull request open?