cccntu / minLoRA

minLoRA: a minimal PyTorch library that allows you to apply LoRA to any PyTorch model.

Understanding the forward operation #8

Open evertonaleixo opened 1 year ago

evertonaleixo commented 1 year ago

I noticed that you apply the mul operation to LoraA and LoraB, and then sum the result with the input.

[screenshot: the forward method of the LoRA parametrization]
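For readers without the screenshot, the code in question looks roughly like this (a simplified sketch, not the exact minLoRA source):

```python
import torch

class LoRAParametrization(torch.nn.Module):
    def __init__(self, fan_in, fan_out, rank=4, lora_alpha=1):
        super().__init__()
        # low-rank factors: lora_B @ lora_A has shape (fan_out, fan_in)
        self.lora_A = torch.nn.Parameter(torch.zeros(rank, fan_in))
        self.lora_B = torch.nn.Parameter(torch.zeros(fan_out, rank))
        torch.nn.init.kaiming_uniform_(self.lora_A, a=5 ** 0.5)
        self.scaling = lora_alpha / rank

    def forward(self, X):
        # X is whatever tensor this parametrization is registered on
        return X + (self.lora_B @ self.lora_A) * self.scaling
```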

I think the result of multiplying LoraA and LoraB has to be added to the original weights, or am I wrong?

Could you also explain the scaling factor?

Thanks.

cccntu commented 1 year ago

> I noticed that you apply the mul operation to LoraA and LoraB, and then sum the result with the input. I think the result of multiplying LoraA and LoraB has to be added to the original weights, or am I wrong?

This is the mechanism of torch parametrizations (https://pytorch.org/tutorials/intermediate/parametrizations.html): when a parametrization is registered on `weight`, its `forward` receives the original weight tensor as input and returns the new value. So the `X` being summed with the low-rank product is the original weight matrix, not the layer's activation input.
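A minimal demonstration of that mechanism, using a toy `AddOne` parametrization of my own (not minLoRA code):

```python
import torch
from torch.nn.utils import parametrize

linear = torch.nn.Linear(4, 3)
original_weight = linear.weight.detach().clone()

class AddOne(torch.nn.Module):
    def forward(self, X):
        # X is the stored weight, not the layer's activation input
        return X + 1.0

parametrize.register_parametrization(linear, "weight", AddOne())

# Reading linear.weight now runs AddOne.forward on the original weight,
# which is exactly how the LoRA update ends up added to the weights.
assert torch.allclose(linear.weight, original_weight + 1.0)
```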

> Could you also explain the scaling factor?

The scaling follows the original implementation (https://github.com/microsoft/LoRA) and is mentioned in the paper: the low-rank update is scaled by `lora_alpha / rank`. From my understanding it's not that important; it's only there to control for the change in the update's magnitude when you vary the rank, so other hyperparameters don't need retuning.
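A small illustration of that `alpha / r` scaling, assuming the convention from the paper (the variable names here are mine):

```python
import torch

fan_in, fan_out, lora_alpha = 16, 16, 8

for rank in (4, 8, 16):
    lora_A = torch.randn(rank, fan_in)
    lora_B = torch.randn(fan_out, rank)
    scaling = lora_alpha / rank  # alpha / r, as in the paper
    delta_W = (lora_B @ lora_A) * scaling
    # doubling the rank halves the scaling, keeping the update's
    # magnitude roughly comparable across ranks
    print(f"rank={rank:2d} scaling={scaling:.2f} |delta_W|={delta_W.norm():.2f}")
```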