guidance-ai / guidance

A guidance language for controlling large language models.

support for transformer LoRA adapters #931

Open jonberliner opened 3 months ago

jonberliner commented 3 months ago

Hi! Thanks so much for this awesome package. I would like to use a transformers model with a LoRA adapter attached via the peft package. Currently, it seems that guidance can only use a regular transformers model.

I would like to pass the path to a LoRA adapter and load the base model with the attached adapter.

Currently, I can merge the adapter and save the model, but this is not ideal, as I have many adapters I would like to attach to the same base model.
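For reference, the workaround I'm using today looks roughly like this (a sketch built on peft's PeftModel.from_pretrained and merge_and_unload; the model id, adapter path, and output path are placeholders):

from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
peft_model = PeftModel.from_pretrained(base, "path/to/lora-adapter")

# Fold the LoRA weights into the base model, then save a standalone copy.
merged = peft_model.merge_and_unload()
merged.save_pretrained("path/to/merged-model")

The downside is that this produces a full merged copy of the model per adapter, which is exactly what I'd like to avoid.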

Again, thanks so much for this awesome project!

Harsha-Nori commented 3 months ago

Hey @jonberliner,

Thanks for the great suggestion! We'll definitely add this to the backlog. I haven't played around much with peft before -- do people typically use it just to load LoRA-style adapters for inference? I'm wondering whether we can simply extend the guidance.models.Transformers loader to take a LoRA adapter path, or whether it would be better to have, say, a guidance.models.Peft class to handle more complexity and customization?

jonberliner commented 2 months ago

@Harsha-Nori

Thanks for the response! Yes, people typically use peft to load LoRA-style adapters for inference, as well as to set up adapters for training, which is beyond the scope of guidance.

Having guidance.models.Transformers accept an adapter path would be great. Even better would be the ability to load a base model with guidance.models.Transformers and then load/unload adapters on top of it for rapid switching. Perhaps that could be a guidance.models.Peft class initialized with a base model (or a path to one) along with a path to the adapter.
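To sketch the flow I have in mind using the plain transformers/peft pieces (the adapter paths are placeholders, and I haven't checked how any of this interacts with guidance's internals):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-2"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# With peft installed, transformers can load saved LoRA adapters directly
# and switch between them without reloading the base model.
model.load_adapter(peft_model_id="path/to/adapter_a", adapter_name="adapter_a")
model.load_adapter(peft_model_id="path/to/adapter_b", adapter_name="adapter_b")

model.set_adapter("adapter_a")   # activate one adapter
model.set_adapter("adapter_b")   # switch to another
model.disable_adapters()         # or drop back to the bare base model

A guidance.models.Peft class (or extra arguments on guidance.models.Transformers) could wrap exactly this kind of load/switch flow.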

hudson-ai commented 2 months ago

@jonberliner thank you for the input here! From a user experience perspective, do you feel that an exposed interface for adding adapters to guidance.models.Transformers objects directly would be substantially more streamlined than, say, importing the relevant Hugging Face bits, constructing the model and adding adapters there, and then just passing the final model object to a guidance wrapper?

e.g.

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig
import guidance

model_id = "microsoft/phi-2"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# A freshly initialized LoRA adapter, just for demonstration
# (init_lora_weights=False gives it a non-trivial random initialization).
lora_config = LoraConfig(
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    init_lora_weights=False,
)
model.add_adapter(lora_config, adapter_name="adapter_1")

# Wrap the adapted model in guidance as usual.
guidance_model = guidance.models.Transformers(model=model, tokenizer=tokenizer)

@Harsha-Nori I'm a little concerned about the slippery slope of having to reimplement (and maintain) the entire API of an external model provider, and my knee-jerk reaction is to suggest we only mirror the core parts and recommend "wrapping" anything else, as in my example above. Thoughts?

riedgar-ms commented 2 months ago

If transformer_model.add_adapter() leaves the model still usable by Guidance, I think that's a better solution. We're having enough trouble just getting a variety of 'bare' models supported; embedding an extra API on top of that would be a lot of complex, error-prone work.

hudson-ai commented 2 months ago

@riedgar-ms I can confirm it works for me, and totally agreed.

@jonberliner is there anything that the approach I outlined above misses?