OPTML-Group / Unlearn-Saliency

[ICLR24 (Spotlight)] "SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation" by Chongyu Fan*, Jiancheng Liu*, Yihua Zhang, Eric Wong, Dennis Wei, Sijia Liu
https://www.optml-group.com/posts/salun_iclr24
MIT License

Reproducing Baselines #11

Closed Dongjae0324 closed 3 months ago

Dongjae0324 commented 3 months ago

Hi, first of all, I really like this work, and I am happy to see these developments in the machine unlearning field.

I have some questions about the baseline code. First, Fisher Forgetting (FF): I know the authors did not include FF as a baseline in this work, and I wonder if that is because of poor results from FF. I think the code comes from the 'Sparse' authors (which is also great work), but I have failed to reproduce the results with FF; alpha = 1e-9 did not work for me. Do you have any advice for implementing FF? Thank you!

Second, IU (a.k.a. WoodFisher): the paper says the hyperparameter for IU is searched over [1, 20]. In my implementation, however, only alpha below 1 worked (and even then not as well as the paper's results), while alpha = 5, 10, and 15 did not work for me. Can you provide more details about this baseline? Please let me know if I am missing some critical content in the paper. Thanks a lot!

Dongjae0324 commented 3 months ago

Hi, for FF, I reproduced the results using the code from https://github.com/if-loops/selective-synaptic-dampening/blob/main/src/forget_full_class_strategies.py! However, I am still struggling with WoodFisher. Could you provide some more details? :)
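
In case it helps anyone else, here is roughly what that FF strategy does as I understand it: estimate the diagonal Fisher information on the retain set and add Gaussian noise to each weight, with variance scaled by alpha and inversely proportional to the weight's Fisher value. A minimal sketch, not the exact code from that file; the epsilon and clamp values below are my own assumptions:

```python
import torch

def fisher_forgetting(model, retain_loader, criterion, alpha, device="cuda"):
    """Rough sketch of Fisher Forgetting: perturb each weight with Gaussian
    noise whose variance is alpha / (diagonal Fisher), so weights that matter
    less for the retain set receive more noise."""
    model.eval()
    fisher = [torch.zeros_like(p) for p in model.parameters()]

    # Accumulate the diagonal empirical Fisher on the retain set
    n_batches = 0
    for x, y in retain_loader:
        x, y = x.to(device), y.to(device)
        model.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        for f, p in zip(fisher, model.parameters()):
            if p.grad is not None:
                f += p.grad.detach() ** 2
        n_batches += 1
    fisher = [f / n_batches for f in fisher]

    # Perturb weights: noise std ~ sqrt(alpha / Fisher), clamped for stability
    # (the 1e-8 epsilon and the 1e-2 clamp are assumptions, not values from the repo)
    with torch.no_grad():
        for p, f in zip(model.parameters(), fisher):
            std = (alpha / (f + 1e-8)).sqrt().clamp(max=1e-2)
            p.add_(std * torch.randn_like(p))
    return model
```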

a-F1 commented 3 months ago

Sorry for the delayed response. First and foremost, I sincerely appreciate your interest in and support of our work. Regarding WoodFisher, my suggestion is to experiment with a broader range of alpha values within [1, 20]. For instance, you can start by testing the following set of values: 1, 2, 5, 8, 10, 12, 15, 20, then narrow the range and determine the optimal alpha from there. It's worth noting that the first-order WoodFisher approximation used to estimate the inverse-Hessian gradient product can introduce some error, which is why we provide the range [1, 20] for hyperparameter tuning.
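
As a concrete illustration, the sweep could be organized like this (just a sketch; `unlearn_fn` and `evaluate_fn` are placeholders for however you run the IU baseline and compute your forget/retain/test accuracy and MIA metrics, not functions from this repo):

```python
import copy

def sweep_alpha(unlearn_fn, evaluate_fn, model,
                alphas=(1, 2, 5, 8, 10, 12, 15, 20)):
    """Try each alpha and record the unlearning metrics so the best
    trade-off can be picked afterwards.

    unlearn_fn(model, alpha) -> unlearned model   (placeholder for the IU baseline)
    evaluate_fn(model)       -> dict of metrics   (e.g. forget/retain/test acc, MIA)
    """
    results = {}
    for alpha in alphas:
        # Apply each alpha to a fresh copy of the original model
        # so the runs do not contaminate each other.
        candidate = unlearn_fn(copy.deepcopy(model), alpha)
        results[alpha] = evaluate_fn(candidate)
        print(f"alpha={alpha}: {results[alpha]}")
    return results
```

Once the coarse sweep identifies the promising region, you can repeat it with a finer grid of alpha values around the best result.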

Dongjae0324 commented 3 months ago

Thanks for your response. Sorry, we forgot to leave a comment earlier. 👍

a-F1 commented 3 months ago

You're welcome! If you have any more questions, please feel free to reach out to us anytime.