Open kimborgen opened 1 year ago
As discussed in #12, does applying LoRA to the fused kqv linear layer affect performance compared to splitting that layer into separate k, v, and q Linear layers and applying a LoRA adapter to each one individually?
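To make the question concrete, here is a rough sketch of how the two setups differ structurally. It says nothing about which one trains better. It uses plain NumPy rather than this repo's code, and `d_model`, `rank`, and `lora_param_count` are made-up names for illustration. It assumes the usual LoRA update `delta_W = B @ A` with weights stored as `(out_features, in_features)`, and uses random `B` matrices only so the ranks are visible (real LoRA initializes `B` to zero).

```python
import numpy as np

def lora_param_count(d_in, d_out, rank):
    """Trainable parameters of one LoRA adapter: A is (rank, d_in), B is (d_out, rank)."""
    return rank * d_in + d_out * rank

d_model, rank = 64, 4  # hypothetical sizes, chosen only for illustration

# One adapter on the fused layer (d_model -> 3 * d_model) vs. three separate adapters.
fused_params = lora_param_count(d_model, 3 * d_model, rank)
split_params = 3 * lora_param_count(d_model, d_model, rank)

rng = np.random.default_rng(0)

# Split case: q, k, v each get their own A_i and B_i.
A_split = [rng.standard_normal((rank, d_model)) for _ in range(3)]
B_split = [rng.standard_normal((d_model, rank)) for _ in range(3)]
delta_split = np.concatenate([B @ A for A, B in zip(A_split, B_split)], axis=0)

# The same update written as a single fused adapter of rank 3 * rank:
# stack the A_i vertically and place the B_i on a block diagonal.
A_stacked = np.concatenate(A_split, axis=0)
B_block = np.zeros((3 * d_model, 3 * rank))
for i, B in enumerate(B_split):
    B_block[i * d_model:(i + 1) * d_model, i * rank:(i + 1) * rank] = B
delta_equiv = B_block @ A_stacked

# Fused case at the same nominal rank: a single A is shared by q, k, and v.
A_fused = rng.standard_normal((rank, d_model))
B_fused = rng.standard_normal((3 * d_model, rank))
delta_fused = B_fused @ A_fused

fused_rank = np.linalg.matrix_rank(delta_fused)
split_rank = np.linalg.matrix_rank(delta_split)

print("fused params:", fused_params, "split params:", split_params)
print("fused update rank:", fused_rank, "split update rank:", split_rank)
```

So at the same nominal rank the fused adapter shares one down-projection `A` across q, k, and v and has fewer trainable parameters, while the split adapters can reach a higher combined rank. Whether that difference matters in practice is the open question here.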