microsoft / LoRA

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
https://arxiv.org/abs/2106.09685
MIT License

[Embedding-bugfix]: reset_parameters #158

Open WrRan opened 9 months ago

WrRan commented 9 months ago

initialize A the same way as the default for nn.Linear and B to zero

But why not use nn.init.kaiming_uniform_(self.lora_A, a=math.sqrt(5)), as class Linear does?
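
To make the question concrete, here is a minimal sketch comparing the Linear-style initialization named above with a swapped scheme (A zero, B normal). The swapped scheme is included only as an assumed contrast; the thread does not state what the Embedding code currently does. The tensor shapes and the sizes `r`, `num_embeddings`, and `embedding_dim` are hypothetical. In both cases the low-rank update starts at zero, so the two schemes differ only in which factor carries the random values.

```python
import math

import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for illustration.
r, num_embeddings, embedding_dim = 4, 100, 16

torch.manual_seed(0)

# Scheme 1: the Linear-style initialization named in the question
# (A ~ Kaiming uniform with a=sqrt(5), B = 0).
lora_A = torch.empty(r, num_embeddings)
lora_B = torch.empty(embedding_dim, r)
nn.init.kaiming_uniform_(lora_A, a=math.sqrt(5))
nn.init.zeros_(lora_B)
delta_linear_style = (lora_B @ lora_A).T  # shape: (num_embeddings, embedding_dim)

# Scheme 2 (assumed contrast, not confirmed by the thread): A = 0, B ~ N(0, 1).
lora_A2 = torch.zeros(r, num_embeddings)
lora_B2 = torch.empty(embedding_dim, r)
nn.init.normal_(lora_B2)
delta_swapped = (lora_B2 @ lora_A2).T

# With a=sqrt(5), kaiming_uniform_ samples from U(-bound, bound),
# where bound = 1 / sqrt(fan_in) and fan_in = lora_A.size(1).
bound = 1.0 / math.sqrt(num_embeddings)

print("update is zero (Linear-style):", bool((delta_linear_style == 0).all()))
print("update is zero (swapped):     ", bool((delta_swapped == 0).all()))
print("max |A| <= bound:             ", bool(lora_A.abs().max() <= bound))
```

Because one factor is zero in either scheme, the adapted layer matches the pretrained layer at the start of training. The choice between them affects only which factor receives a nonzero gradient on the first step.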