microsoft / LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
https://arxiv.org/abs/2106.09685
MIT License

LoRA on T5 model #169

Open vivektreddy opened 5 months ago

vivektreddy commented 5 months ago

I am trying to use LoRA on a loaded checkpoint of a CodeT5 model. However, when I do, the runtime is about the same as full fine-tuning, and my result is not as good as when I fine-tune the whole model. Am I initializing the model properly?

```python
from transformers import T5ForConditionalGeneration
from peft import LoraConfig, LoraModel

rank = 16
lora_alpha = 4
lora_dropout = 0.05

model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-small")

# freeze parameters
for name, param in model.named_parameters():
    param.requires_grad = False

lora_config = LoraConfig(
    inference_mode=False,
    r=rank,
    target_modules=['q', 'v'],
    lora_alpha=lora_alpha,
    lora_dropout=lora_dropout,
)
lora_model = LoraModel(model, lora_config, "default")
```
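One way to sanity-check the setup is to count trainable versus total parameters; if LoRA is wired up correctly, only a small fraction should require gradients. A minimal sketch, assuming the `lora_model` from above:

```python
# Sanity check: with LoRA applied, only the low-rank adapter weights
# (typically well under 1% of the model) should require gradients.
trainable = sum(p.numel() for p in lora_model.parameters() if p.requires_grad)
total = sum(p.numel() for p in lora_model.parameters())
print(f"trainable params: {trainable:,} / {total:,} "
      f"({100 * trainable / total:.2f}%)")
```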

Thank you