OPTML-Group / Unlearn-Saliency

[ICLR24 (Spotlight)] "SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation" by Chongyu Fan*, Jiancheng Liu*, Yihua Zhang, Eric Wong, Dennis Wei, Sijia Liu
https://www.optml-group.com/posts/salun_iclr24
MIT License

Some confusion regarding the retain dataloader in rl.py #14

Closed: shaaaaron closed this issue 4 months ago

shaaaaron commented 5 months ago

Hello, thank you very much for your efforts on this interesting project. I ran into some questions while running the code. I noticed that in rl.py, which implements the random labeling method, both the forget_loader and the retain dataloader are used for training. The forget_loader holds the data points that need to be forgotten, while the retain dataloader holds all other data points in the dataset. However, as far as I know, forgetting methods typically should not use any data other than the data to be forgotten. Additionally, other code such as boundary_sh does not use the retain dataloader for training. Could you explain why the retain_dataloader is needed in the training process in rl.py?

Consequently, we decided to remove the retain_dataloader and its associated training code and retrained the model. Surprisingly, the forget accuracy and retain accuracy dropped from 99.8% and 99.2% to 18.7% and 18.1%, respectively. Is this result expected?
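For reference, here is a minimal sketch of the two-loader pattern I am describing; the names (model, forget_loader, retain_loader, num_classes) are placeholders, not the repository's exact code:

```python
import torch
import torch.nn.functional as F

def rl_unlearn_epoch(model, forget_loader, retain_loader, optimizer, num_classes, device):
    """One epoch of random-labeling unlearning (illustrative sketch only)."""
    model.train()
    # Forget pass: push forget samples toward freshly drawn random labels.
    # (Implementations often ensure the random label differs from the true
    # label; omitted here for brevity.)
    for x, y in forget_loader:
        x = x.to(device)
        rand_y = torch.randint(0, num_classes, y.shape, device=device)
        loss = F.cross_entropy(model(x), rand_y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # Retain pass: standard supervised loss on the remaining data, which
    # preserves accuracy outside the forget set. Removing this loop is the
    # ablation described below.
    for x, y in retain_loader:
        x, y = x.to(device), y.to(device)
        loss = F.cross_entropy(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```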

I would greatly appreciate your help!

ljcc0930 commented 5 months ago

Hi @shaaaaron, thanks for your great question. Your concern is an important one. As your experiments demonstrate, the retain dataset is a necessary part of the unlearning method: it is used to preserve the model's generalization ability, which is widely acknowledged across SOTA unlearning methods such as [1, 2]. Excluding the retain dataset, as boundary unlearning does, is indeed a limitation that trades generalization for running efficiency.
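To make this concrete, the random-labeling objective can be sketched as a forget term plus a retain term (an illustrative formulation; the exact loss and weighting in the code may differ):

$$
\min_{\theta}\ \mathbb{E}_{(x,y)\in\mathcal{D}_f}\big[\ell\big(f_\theta(x),\, y'\big)\big] \;+\; \alpha\, \mathbb{E}_{(x,y)\in\mathcal{D}_r}\big[\ell\big(f_\theta(x),\, y\big)\big]
$$

where $y' \neq y$ is a randomly drawn label, $\mathcal{D}_f$ is the forget set, $\mathcal{D}_r$ is the retain set, and $\ell$ is cross-entropy. Dropping the second term removes the only signal that keeps $f_\theta$ accurate on $\mathcal{D}_r$, which is consistent with the accuracy collapse you observed.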

Moreover, [3] focuses on class-wise forgetting settings in image classification. In our experience with hyperparameter tuning, when boundary unlearning is adapted to random-data forgetting, it tends to unlearn over-aggressively. As a result, its hyperparameters are very fragile and highly sensitive to the choice of the forgetting set. Methods that integrate a loss on the retain dataset show more robust performance.

[1] Jia, Jinghan, et al. "Model Sparsity Can Simplify Machine Unlearning." Advances in Neural Information Processing Systems 36 (2024).
[2] Kurmanji, Meghdad, et al. "Towards Unbounded Machine Unlearning." Advances in Neural Information Processing Systems 36 (2024).
[3] Chen, Min, et al. "Boundary Unlearning: Rapid Forgetting of Deep Networks via Shifting the Decision Boundary." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2023.

ljcc0930 commented 4 months ago

Issue closed. Please feel free to post new responses or open another issue if you have further questions. Thanks!

shaaaaron commented 4 months ago

Thanks for your response!