bjzhb666 / GS-LoRA

Continual Forgetting for Pre-trained Vision Models (CVPR 2024)
https://arxiv.org/abs/2403.11530
MIT License

Notice about reproducing the results #9

Closed bjzhb666 closed 2 months ago

bjzhb666 commented 2 months ago

If you cannot reproduce the results, a likely cause is that the layers are slightly different from the ones we used, a small change introduced when we added support for more features. We ran all the experiments with Linear, but the code now uses MergedLinear. Although the two are mathematically equivalent, their implementations in Torch differ, and our hyperparameters were chosen in the Linear setting, so they may not work well with MergedLinear. A simple fix is to change MergedLinear back to Linear. Alternatively, you can search for suitable hyperparameters under MergedLinear yourself. We plan to do this as well and update the code.
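
To illustrate how two mathematically equivalent formulations can still behave differently in practice, here is a minimal pure-Python sketch. It is not the repository's Linear or MergedLinear code, and it does not claim to show the actual difference between those two layers. It only shows the general point: a LoRA-style output computed as `W x + scale * B (A x)` and as `(W + scale * B A) x` agree algebraically, but the floating-point operations run in a different order, so the results need not be bit-identical. All names (`lora_unmerged`, `lora_merged`, `scale`, the shapes) are assumptions made for this example.

```python
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def lora_unmerged(W, A, B, x, scale):
    # y = W x + scale * B (A x): the low-rank path is evaluated separately.
    base = matvec(W, x)
    low_rank = matvec(B, matvec(A, x))
    return [b + scale * l for b, l in zip(base, low_rank)]


def lora_merged(W, A, B, x, scale):
    # y = (W + scale * B A) x: the update is folded into the weight first.
    BA = matmul(B, A)
    W_eff = [[w + scale * d for w, d in zip(w_row, d_row)]
             for w_row, d_row in zip(W, BA)]
    return matvec(W_eff, x)


def max_abs_diff(u, v):
    return max(abs(a - b) for a, b in zip(u, v))


rng = random.Random(0)
d_in, d_out, r = 16, 8, 4  # hypothetical sizes, chosen only for the demo
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(r)]
B = [[rng.gauss(0, 1) for _ in range(r)] for _ in range(d_out)]
x = [rng.gauss(0, 1) for _ in range(d_in)]
scale = 2.0  # hypothetical stand-in for lora_alpha / r

y1 = lora_unmerged(W, A, B, x, scale)
y2 = lora_merged(W, A, B, x, scale)

# The two outputs match up to floating-point rounding.
print(max_abs_diff(y1, y2))
```

Any such discrepancy is tiny for a single forward pass. Whether small implementation differences of this kind, or differences in initialization and parameter layout, explain the reproduction gap here is not something this sketch establishes, which is why switching back to Linear or re-tuning the hyperparameters is the suggested route.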