Lanbai-eleven opened this issue 7 months ago
During my practical usage, I have observed that the loss of AdaLoRA decreases more slowly than with the original LoRA. I would like to know whether this is normal or whether I simply haven't configured AdaLoRA optimally.

Hello, thanks for your question. Yes, AdaLoRA should converge slightly more slowly than LoRA, because AdaLoRA goes through the three stages of the budget scheduler: initial warmup, the budget-decreasing phase, and final fine-tuning. If the budget-decreasing phase has too few steps, the rank is changed too quickly for the model to adapt smoothly. In our observation, as long as a proper budget schedule is chosen, the convergence of AdaLoRA should be similar to LoRA. For example, set the length of the budget-decreasing phase to about half of the total training steps.
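For illustration, here is a minimal sketch of a schedule following that suggestion, assuming the HuggingFace PEFT `AdaLoraConfig`; all step counts, ranks, and target module names below are hypothetical and should be adjusted to your own run:

```python
from peft import AdaLoraConfig

# Hypothetical run of 1,000 optimizer steps; the split below is illustrative.
total_steps = 1000
warmup_steps = 100                                       # initial warmup (tinit)
decreasing_steps = total_steps // 2                      # budget-decreasing phase ~ half of training
final_steps = total_steps - warmup_steps - decreasing_steps  # final fine-tuning (tfinal)

config = AdaLoraConfig(
    init_r=12,               # starting rank before any pruning
    target_r=4,              # average rank after the budget has decayed
    tinit=warmup_steps,
    tfinal=final_steps,
    total_step=total_steps,
    deltaT=10,               # reallocate the budget every 10 steps
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # adjust to your model's module names
)
```

Here the budget-decreasing phase spans `total_step - tinit - tfinal` steps, which with these numbers is 500, i.e. half of training. Note also that, if I recall the PEFT API correctly, the rank allocator has to be advanced manually during training (e.g. via `model.base_model.update_and_allocate(global_step)` after each optimizer step); otherwise the schedule above has no effect.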