ggerganov / llama.cpp

LLM inference in C/C++
MIT License

Tracking: LoRA #964

Closed: jon-chuang closed this issue 7 months ago

jon-chuang commented 1 year ago

Here are some outstanding issues for LoRA:

captainzero93 commented 1 year ago

Really desperate to start using LoRA; however, I use a GPTQ-4bit-32g GGML model. Will this be a problem?

jon-chuang commented 1 year ago

So far, we've seen quality issues when applying LoRA on a 4-bit quantized base model. That said, it has produced reasonable output for me some of the time. It is still under investigation.
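For context, the usual LoRA merge (the standard formulation, not llama.cpp's exact code path) is:

$$W' = W + \frac{\alpha}{r}\,BA$$

where $W \in \mathbb{R}^{d \times k}$ is the frozen base weight, $B \in \mathbb{R}^{d \times r}$, $A \in \mathbb{R}^{r \times k}$, and $r \ll \min(d, k)$. If $W$ is only available as a 4-bit quantization $Q(W)$, the merged tensor has to be re-quantized, i.e. $Q\!\left(Q(W) + \frac{\alpha}{r}BA\right)$, so the small delta is rounded together with an already-lossy base. That compounding of rounding error is one plausible source of the quality issues above, and it is why applying the adapter against an f16 base (the `--lora-base` option) tends to preserve quality better.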

bmanturner commented 1 year ago

Would this be a good place to request support for multiple LoRA adapters sharing the same base model? See here for inspiration: https://github.com/lm-sys/FastChat/pull/1905
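A minimal sketch of what shared-base serving could look like, assuming the adapter is applied at runtime rather than merged into the weights (the matrix layout and the `lora_forward` helper here are hypothetical illustrations, not llama.cpp API):

```cpp
// Hypothetical sketch: apply the LoRA delta at inference time instead of
// merging it into W. Because W is never modified, any number of adapters
// can share one base model.
#include <cstdio>
#include <vector>

using Mat = std::vector<std::vector<float>>; // row-major dense matrix

static std::vector<float> matvec(const Mat &M, const std::vector<float> &x) {
    std::vector<float> y(M.size(), 0.0f);
    for (size_t i = 0; i < M.size(); ++i)
        for (size_t j = 0; j < x.size(); ++j)
            y[i] += M[i][j] * x[j];
    return y;
}

// y = W x + (alpha/r) * B (A x); A is r x k, B is d x r, r << min(d, k)
static std::vector<float> lora_forward(const Mat &W, const Mat &A, const Mat &B,
                                       float alpha, const std::vector<float> &x) {
    const float scale = alpha / (float) A.size(); // A.size() == r
    std::vector<float> y  = matvec(W, x);  // frozen base projection
    std::vector<float> ax = matvec(A, x);  // down-project to rank r
    std::vector<float> bx = matvec(B, ax); // up-project back to d
    for (size_t i = 0; i < y.size(); ++i)
        y[i] += scale * bx[i];
    return y;
}

int main() {
    // toy shapes: d = k = 2, r = 1
    Mat W = {{1, 0}, {0, 1}};
    Mat A = {{1, 1}};         // 1 x 2
    Mat B = {{0.5f}, {0.5f}}; // 2 x 1
    std::vector<float> x = {1, 2};
    auto y = lora_forward(W, A, B, /*alpha=*/1.0f, x);
    printf("%f %f\n", y[0], y[1]); // expected: 2.5 3.5
}
```

The per-adapter state is just the small `A`/`B` pairs, so the per-request cost of swapping adapters stays low while the large base weights are loaded once.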

Green-Sky commented 1 year ago

> Improve LoRA loading time with mmap on the base model

This was done in https://github.com/ggerganov/llama.cpp/pull/2095.

Also, I'm not sure this issue is the right one for tracking that.
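For reference, a minimal POSIX sketch of the copy-on-write idea behind that PR, as I understand it: map the base model `MAP_PRIVATE`, so only the pages actually rewritten by the LoRA delta get copied, while untouched tensors stay shared with the page cache. (Illustrative only; llama.cpp's real mmap wrapper is its own code.)

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstdio>

int main(int argc, char **argv) {
    if (argc < 2) { fprintf(stderr, "usage: %s model.bin\n", argv[0]); return 1; }

    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) != 0) { perror("fstat"); return 1; }

    // MAP_PRIVATE: writes trigger per-page copies instead of touching the
    // file (legal even on an O_RDONLY fd), so applying a LoRA delta in
    // place only duplicates the modified tensors.
    void *base = mmap(nullptr, st.st_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE, fd, 0);
    if (base == MAP_FAILED) { perror("mmap"); return 1; }
    close(fd); // the mapping stays valid after close

    // ... locate a tensor inside `base` and add the (alpha/r) * B A delta ...

    munmap(base, st.st_size);
    return 0;
}
```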

github-actions[bot] commented 7 months ago

This issue was closed because it has been inactive for 14 days since being marked as stale.