TRAIS-Lab / dattri

`dattri` is a PyTorch library for developing, benchmarking, and deploying efficient data attribution algorithms.
https://trais-lab.github.io/dattri/

[dattri.algorithm] Influence Function LiSSA option #80

Closed TheaperDeng closed 3 months ago

TheaperDeng commented 3 months ago

Description

1. Motivation and Context

This PR adds LiSSA as one of the ihvp (inverse-Hessian-vector product) solvers in IFAttributor.

2. Summary of the change

  1. Add "lissa" as one of the ihvp solver choices in IFAttributor.
  2. Add a collate_fn parameter to ihvp_lissa and ihvp_lissa_at_x to support nested inputs (in case the input is not a tensor).
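
For readers unfamiliar with LiSSA, the following is a minimal, framework-free sketch of the recursion and of where a collate function would plug in. It is not dattri's implementation: `lissa_ihvp`, `make_sampled_hvp`, `batch_hvp`, and all parameter names and defaults are hypothetical, and plain Python lists stand in for tensors. The recursion is h_0 = v, h_{j+1} = v + h_j - (H h_j + damping * h_j) / scaling, and the estimate of (H + damping * I)^{-1} v is h_J / scaling.

```python
import random
from typing import Callable, List, Sequence

Vector = List[float]


def lissa_ihvp(
    hvp_fn: Callable[[Vector], Vector],
    v: Sequence[float],
    recursion_depth: int = 200,
    damping: float = 0.0,
    scaling: float = 10.0,
) -> Vector:
    """Approximate (H + damping * I)^{-1} v with the LiSSA recursion.

    hvp_fn(x) must return H @ x, possibly estimated on a sampled mini-batch.
    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    v = list(v)
    estimate = list(v)  # h_0 = v
    for _ in range(recursion_depth):
        hv = hvp_fn(estimate)
        # h_{j+1} = v + h_j - (H h_j + damping * h_j) / scaling
        estimate = [
            v_i + h_i - (hv_i + damping * h_i) / scaling
            for v_i, h_i, hv_i in zip(v, estimate, hv)
        ]
    # The fixed point is scaling * (H + damping * I)^{-1} v, so undo the scaling.
    return [h_i / scaling for h_i in estimate]


def make_sampled_hvp(dataset, batch_size, collate_fn, batch_hvp, rng):
    """Build an hvp_fn that draws a fresh mini-batch on every call.

    collate_fn merges a list of (possibly nested) samples into one batch;
    this is the step where non-tensor inputs such as dicts get handled.
    """
    def hvp_fn(x: Vector) -> Vector:
        batch = collate_fn(rng.sample(dataset, batch_size))
        return batch_hvp(batch, x)
    return hvp_fn


# Toy problem: loss 0.5 * (w . x - y)^2, so each sample's Hessian is x x^T.
dataset = [{"x": [2.0, 0.0]}, {"x": [0.0, 2.0]}]  # nested (dict) samples


def collate(samples):
    return {"x": [s["x"] for s in samples]}


def batch_hvp(batch, vec):
    # Mean over the batch of x * (x . vec), i.e. (mean of x x^T) @ vec.
    out = [0.0] * len(vec)
    for x in batch["x"]:
        dot = sum(x_i * v_i for x_i, v_i in zip(x, vec))
        for i, x_i in enumerate(x):
            out[i] += x_i * dot / len(batch["x"])
    return out


hvp_fn = make_sampled_hvp(dataset, 2, collate, batch_hvp, random.Random(0))
approx = lissa_ihvp(hvp_fn, [1.0, 1.0], recursion_depth=200, scaling=10.0)
print(approx)  # close to [0.5, 0.5], since the mean Hessian is diag(2, 2)
```

The recursion only converges when the eigenvalues of H + damping * I lie in (0, 2 * scaling), which is why a scaling factor is needed at all. With a batch size smaller than the dataset the estimate becomes noisy, and averaging several independent runs is a common way to reduce that noise.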

3. What tests have been added/updated for the change?

TheaperDeng commented 3 months ago

@tingwl0122 This PR is kind of "ugly", please also have a look.

TheaperDeng commented 3 months ago

Works well for MNIST+MLP noisy label detection

[(0, 0), (100, 58), (200, 93), (300, 103), (400, 105), (500, 109), (600, 109), (700, 109), (800, 110), (900, 110)]                                     
Checked Data Sample      Found flipped Sample     
--------------------------------------------------
0                        0                        
100                      58                       
200                      93                       
300                      103                      
400                      105                      
500                      109                      
600                      109                      
700                      109                      
800                      110                      
900                      110
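
A table of this shape can be produced by ranking training samples by their attribution score and counting how many known-flipped labels appear among the first k samples checked. The sketch below is an illustration only, not the benchmark script used here: `flipped_found_curve` is a hypothetical helper, and it assumes that a higher score marks a more suspicious sample (the opposite convention would just reverse the sort).

```python
from typing import Iterable, List, Sequence, Tuple


def flipped_found_curve(
    scores: Sequence[float],
    flipped_indices: Iterable[int],
    step: int,
) -> List[Tuple[int, int]]:
    """Return (checked, found) pairs for checked = 0, step, 2 * step, ...

    Assumption: samples are inspected in order of decreasing score.
    """
    flipped = set(flipped_indices)
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    curve = []
    found = 0
    for checked, idx in enumerate(order):
        if checked % step == 0:
            # 'found' counts flipped samples among the first 'checked' inspected.
            curve.append((checked, found))
        if idx in flipped:
            found += 1
    return curve


# Toy data: six training samples, three of which have flipped labels.
scores = [0.1, 0.9, 0.3, 0.8, 0.2, 0.7]
curve = flipped_found_curve(scores, flipped_indices=[1, 3, 4], step=2)

print(f"{'Checked Data Sample':<25}{'Found flipped Sample':<25}")
print("-" * 50)
for checked, found in curve:
    print(f"{checked:<25}{found:<25}")
```

A curve that rises steeply and then flattens, as in the numbers above, means most flipped samples are concentrated at the top of the ranking.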

tingwl0122 commented 3 months ago

Do you have a chance to do it in a larger setting?