predibase / lorax

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
https://loraexchange.ai
Apache License 2.0
1.86k stars, 125 forks

(WIP) Support targeting the embedding layer for LoRA #501

Open ajtejankar opened 3 weeks ago

ajtejankar commented 3 weeks ago

What does this PR do?

  1. Reorganize the code in BatchLoraWeights.load. The function was hard to follow because it used several list comprehensions with almost the same looping logic, so I merged them into two loops for clarity. @tgaddair Can you confirm this looks good? I can revert to the original code if this change could cause problems.
  2. (WIP) Support the embedding layer as a target module. This is mostly done, except for multi-GPU inference.
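
For readers unfamiliar with item 2, here is a minimal sketch of how a LoRA update is commonly applied to an embedding layer. It follows the general LoRA formulation (base output plus a scaled low-rank product), not necessarily the code in this PR. The names `embed_with_lora`, `lora_a`, `lora_b`, and `scaling` are illustrative and are not LoRAX identifiers. Because the input to an embedding layer is a token index rather than a dense vector, the first low-rank factor is looked up by row instead of being multiplied.

```python
# Sketch only: all names here are hypothetical, chosen for illustration.

def embed_with_lora(token_ids, base_embedding, lora_a, lora_b, scaling):
    """Look up base embeddings and add the low-rank LoRA update per token.

    base_embedding: vocab_size x hidden_dim (list of rows)
    lora_a:         vocab_size x rank (one low-rank row per token id)
    lora_b:         rank x hidden_dim
    """
    outputs = []
    for token_id in token_ids:
        base_row = base_embedding[token_id]
        a_row = lora_a[token_id]  # row lookup, since the input is an index
        # delta[j] = sum over k of a_row[k] * lora_b[k][j]
        delta = [
            sum(a_k * b_row[j] for a_k, b_row in zip(a_row, lora_b))
            for j in range(len(base_row))
        ]
        outputs.append([b + scaling * d for b, d in zip(base_row, delta)])
    return outputs


# Toy sizes: vocab_size=3, hidden_dim=2, rank=1.
base = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
lora_a = [[1.0], [2.0], [0.0]]
lora_b = [[0.5, -0.5]]
result = embed_with_lora([1, 2], base, lora_a, lora_b, scaling=2.0)
print(result)  # [[2.0, -1.0], [1.0, 1.0]]
```

Token 2 has an all-zero row in `lora_a`, so its output equals the base embedding. This is the behaviour expected for tokens the adapter leaves unchanged.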

ajtejankar commented 3 weeks ago

@tgaddair I am pushing a partially complete commit that supports embedding-layer LoRAs.