wangzhaode / mnn-llm

LLM deployment project based on MNN.
Apache License 2.0

Support for Lora adapter layers for LLM inference #181

Closed: TejasRavichandran1995 closed this issue 5 months ago

TejasRavichandran1995 commented 6 months ago

The llm-export utility (https://github.com/wangzhaode/llm-export) appears to support directly exporting a lora.mnn file during conversion via llm_export.py.

However, it seems to me that the framework does not yet support inference with the exported lora.mnn file. Any pointers regarding this would be useful :). @wangzhaode
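
For context on what such inference support entails: a LoRA adapter only adds a scaled low-rank update to each adapted layer's output, so the forward pass is the base matmul plus a small correction. The following is a minimal NumPy sketch of that math, purely illustrative; it is not the mnn-llm API, and the dimensions and names are made up for the example.

```python
import numpy as np

# Illustrative LoRA forward pass: y = x @ W^T + (alpha / r) * (x @ A^T) @ B^T
# W is the frozen base weight; A (r x d_in) and B (d_out x r) form the adapter.
rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 8, 16.0

W = rng.standard_normal((d_out, d_in)).astype(np.float32)
A = rng.standard_normal((r, d_in)).astype(np.float32) * 0.01
B = np.zeros((d_out, r), dtype=np.float32)  # B is initialized to zero in LoRA training

def lora_forward(x: np.ndarray) -> np.ndarray:
    base = x @ W.T                            # frozen base projection
    delta = (alpha / r) * (x @ A.T) @ B.T     # low-rank adapter correction
    return base + delta

x = rng.standard_normal((1, d_in)).astype(np.float32)
print(lora_forward(x).shape)  # (1, 64)
```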

wangzhaode commented 6 months ago

MNN 2.9.0 will support applying LoRA on device. For now, however, there are some accuracy problems caused by quantization.
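
To illustrate why quantization can cause such accuracy problems (a sketch under stated assumptions, not MNN internals): if the base weights are already quantized on device, the small LoRA delta must be merged into weights that have lost precision, and the merged result may be re-quantized, compounding the error. The int8 symmetric per-tensor scheme below is only an example.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.standard_normal((64, 64)).astype(np.float32)
delta = 0.01 * rng.standard_normal((64, 64)).astype(np.float32)  # stands in for (alpha/r) * B @ A

def quantize_int8(w):
    # Symmetric per-tensor int8 quantization (illustrative scheme).
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

# Path 1: merge the LoRA delta in float, then quantize once.
q1, s1 = quantize_int8(W + delta)
merge_then_quant = dequantize(q1, s1)

# Path 2: quantize the base first, then merge the delta and re-quantize,
# which is closer to applying LoRA to an already-quantized on-device model.
qW, sW = quantize_int8(W)
q2, s2 = quantize_int8(dequantize(qW, sW) + delta)
quant_then_merge = dequantize(q2, s2)

ref = W + delta
print("merge-then-quantize error:", np.abs(merge_then_quant - ref).mean())
print("quantize-then-merge error:", np.abs(quant_then_merge - ref).mean())
```

The second path accumulates two rounding steps, which is one plausible source of the accuracy gap mentioned above.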

TejasRavichandran1995 commented 6 months ago

Sure, thanks @wangzhaode. Is there a rough timeline planned for the 2.9.0 release?

github-actions[bot] commented 5 months ago

Marking as stale. No activity in 30 days.