microsoft / LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
https://arxiv.org/abs/2106.09685
MIT License

LoRA weights A and B are initialized the wrong way around. #155

Open arun477 opened 8 months ago

arun477 commented 8 months ago

    def reset_parameters(self):
        nn.Embedding.reset_parameters(self)
        if hasattr(self, 'lora_A'):
            # initialize A the same way as the default for nn.Linear and B to zero
            # lora_A should be normal and lora_B should be zeros
            nn.init.zeros_(self.lora_A)
            nn.init.normal_(self.lora_B)
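
For reference, this snippet is the `reset_parameters` method of the `Embedding` class in `loralib/layers.py`: the in-code comment says A should follow the default `nn.Linear` initialization and B should be zeros, but the calls below it do the opposite. A minimal sketch to reproduce the observation, assuming loralib is installed and that `lora.Embedding` accepts an `r` argument as shown in the repo's README (the shapes here are arbitrary and only for illustration):

    import torch
    import loralib as lora

    # Small LoRA-enabled embedding; shapes chosen only for illustration.
    emb = lora.Embedding(num_embeddings=10, embedding_dim=8, r=4)

    # Inspect the freshly constructed adapter weights.
    print(torch.count_nonzero(emb.lora_A))  # prints 0   -> lora_A was zero-initialized
    print(torch.count_nonzero(emb.lora_B))  # prints > 0 -> lora_B was drawn from a normal distribution

Note that with either convention the product `lora_B @ lora_A` is zero right after initialization, so the adapted weight starts equal to the pretrained weight in both cases; the question raised here is whether the code or its comment reflects the intended convention from the paper.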