fra31 / auto-attack

Code for "Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks"
https://arxiv.org/abs/2003.01690
MIT License

eps 8./255. works fine, 4./255. does not #76

Closed: jS5t3r closed this issue 2 years ago

jS5t3r commented 2 years ago

I want to run the standard attack with different epsilons for the perturbation. It works on all my other datasets, but not on this one.

My normalization:

mean:  [0.36015135049819946, 0.21252931654453278, 0.1168241947889328]
std :  [0.24773411452770233, 0.20017878711223602, 0.17963241040706]
using standard version including apgd-ce, apgd-t, fab-t, square
initial accuracy: 91.60%
apgd-ce - 1/1 - 431 out of 458 successfully perturbed
robust accuracy after APGD-CE: 5.40% (total time 110.8 s)
Traceback (most recent call last):
  File "/home/user/adversialml/src/src/attacks.py", line 104, in <module>
    adv_complete, max_nr = adversary.run_standard_evaluation(x_test, y_test, bs=args.batch_size)
  File "/home/user/adversialml/src/src/submodules/autoattack/autoattack/autoattack.py", line 172, in run_standard_evaluation
    adv_curr = self.apgd_targeted.perturb(x, y) #cheap=True
  File "/home/user/adversialml/src/src/submodules/autoattack/autoattack/autopgd_base.py", line 682, in perturb
    res_curr = self.attack_single_run(x_to_fool, y_to_fool)
  File "/home/user/adversialml/src/src/submodules/autoattack/autoattack/autopgd_base.py", line 279, in attack_single_run
    loss_indiv = criterion_indiv(logits, y)
  File "/home/user/adversialml/src/src/submodules/autoattack/autoattack/autopgd_base.py", line 611, in dlr_loss_targeted
    x_sorted[:, -3] + x_sorted[:, -4]) + 1e-12)
IndexError: index -3 is out of bounds for dimension 1 with size 2
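
(Aside: since AutoAttack crafts perturbations in the raw [0, 1] pixel range, a normalization like the one above is usually folded into the model itself. A minimal sketch of such a wrapper, with hypothetical names, assuming 4D image batches:)

import torch
import torch.nn as nn

class NormalizedModel(nn.Module):
    # Hypothetical wrapper: AutoAttack perturbs inputs in [0, 1] pixel
    # space, while the wrapped classifier sees normalized inputs.
    def __init__(self, model, mean, std):
        super().__init__()
        self.model = model
        self.register_buffer('mean', torch.tensor(mean).view(1, -1, 1, 1))
        self.register_buffer('std', torch.tensor(std).view(1, -1, 1, 1))

    def forward(self, x):
        return self.model((x - self.mean) / self.std)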

Any suggestions on what to do?

fra31 commented 2 years ago

Hi,

How many classes does your dataset have? It seems that only 2 logits are available.

jS5t3r commented 2 years ago

It is binary: smiling=1 and not smiling=0.

fra31 commented 2 years ago

The targeted DLR loss requires at least 4 classes (see also https://github.com/fra31/auto-attack/issues/70), so it can't be used directly for binary classification problems. We've recently added flags for this with https://github.com/fra31/auto-attack/commit/45f497ad2f0bf835ff57fdda7df0babf27ada839, and I'll try to integrate a replacement for such cases soon.
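
(For reference, the targeted DLR loss from the paper is
DLR_targeted(x, y) = -(z_y - z_t) / (z_pi1 - (z_pi3 + z_pi4) / 2),
where z_pi1 >= z_pi2 >= ... are the sorted logits, so it needs the third- and fourth-largest logits. A rough paraphrase of dlr_loss_targeted in autopgd_base.py shows why 2 logits trigger the IndexError above:)

import torch

def dlr_loss_targeted(x, y, y_target):
    # x: logits of shape (batch, n_classes). Indexing x_sorted[:, -4]
    # requires n_classes >= 4, hence the IndexError with only 2 logits.
    x_sorted, _ = x.sort(dim=1)
    u = torch.arange(x.shape[0])
    return -(x[u, y] - x[u, y_target]) / (
        x_sorted[:, -1] - 0.5 * (x_sorted[:, -3] + x_sorted[:, -4]) + 1e-12)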

jS5t3r commented 2 years ago

Thanks for the fast response.

jS5t3r commented 2 years ago

I am now using a model with 4 classes, and I updated AutoAttack. Now I am getting this warning:

Warning: it seems that more target classes (9) than possible (3) are used in APGD-T! Also, it seems that too many target classes (9) are used in FAB-T (3 possible)!

Then I get this error:

using standard version including apgd-ce, apgd-t, fab-t, square
initial accuracy: 90.00%
apgd-ce - 1/1 - 445 out of 450 successfully perturbed
robust accuracy after APGD-CE: 1.00% (total time 226.9 s)
Traceback (most recent call last):
  File "/home/user/adversialml/src/src/attacks.py", line 104, in <module>
    adv_complete, max_nr = adversary.run_standard_evaluation(x_test, y_test, bs=args.batch_size)
  File "/home/user/adversialml/src/src/submodules/autoattack/autoattack/autoattack.py", line 172, in run_standard_evaluation
    adv_curr = self.apgd_targeted.perturb(x, y) #cheap=True
  File "/home/user/adversialml/src/src/submodules/autoattack/autoattack/autopgd_base.py", line 679, in perturb
    self.y_target = output.sort(dim=1)[1][:, -target_class]
IndexError: index -5 is out of bounds for dimension 1 with size 4
Load modules...

So I guess at least 10 classes are needed (and that is the only properly tested case)?

fra31 commented 2 years ago

The number of target classes for the targeted versions of the attacks should be at most equal to the number of classes - 1. For example, for APGD you can specify this with https://github.com/fra31/auto-attack/blob/6482e4d6fbeeb51ae9585c41b16d50d14576aadc/autoattack/autopgd_base.py#L589
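
(One way to apply this without editing the library source is to overwrite the attributes after construction, assuming the constructor has already applied the 'standard' defaults at that point; model, x_test and y_test are placeholders:)

from autoattack import AutoAttack

# sketch for a 4-class model: n_target_classes must be <= n_classes - 1 = 3
adversary = AutoAttack(model, norm='Linf', eps=4./255., version='standard')
adversary.apgd_targeted.n_target_classes = 3  # 'standard' default is 9
adversary.fab.n_target_classes = 3            # 'standard' default is 9
adv = adversary.run_standard_evaluation(x_test, y_test, bs=250)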

jS5t3r commented 2 years ago

OK. Even when I set n_target_classes=3 I still get the same problem, although I trained the model with 4 classes.

fra31 commented 2 years ago

Is it possible that it gets overwritten by this?

jS5t3r commented 2 years ago

Yes. Thanks for your help. I made my changes here: https://github.com/fra31/auto-attack/blob/6482e4d6fbeeb51ae9585c41b16d50d14576aadc/autoattack/autoattack.py#L265

My changes to n_target_classes: I set all of them to 3 for my 4-class model. I am not sure whether this is correct, but I don't get the warning anymore.

        if version == 'standard':
            self.attacks_to_run = ['apgd-ce', 'apgd-t', 'fab-t', 'square']
            if self.norm in ['Linf', 'L2']:
                self.apgd.n_restarts = 1
                self.apgd_targeted.n_target_classes = 3  # was 9
            elif self.norm in ['L1']:
                self.apgd.use_largereps = True
                self.apgd_targeted.use_largereps = True
                self.apgd.n_restarts = 5
                self.apgd_targeted.n_target_classes = 3  # was 5
            self.fab.n_restarts = 1
            self.apgd_targeted.n_restarts = 1
            self.fab.n_target_classes = 3  # was 9
            self.square.n_queries = 5000

fra31 commented 2 years ago

Yeah, it should be correct.

jS5t3r commented 2 years ago

The corresponding output:

using standard version including apgd-ce, apgd-t, fab-t, square
initial accuracy: 91.60%
apgd-ce - 1/1 - 453 out of 458 successfully perturbed
robust accuracy after APGD-CE: 1.00% (total time 110.8 s)
apgd-t - 1/1 - 2 out of 5 successfully perturbed
robust accuracy after APGD-T: 0.60% (total time 121.9 s)
fab-t - 1/1 - 0 out of 3 successfully perturbed
robust accuracy after FAB-T: 0.60% (total time 140.2 s)
square - 1/1 - 0 out of 3 successfully perturbed
robust accuracy after SQUARE: 0.60% (total time 212.3 s)
max Linf perturbation: 0.01569, nan in tensor: 0, max: 1.00000, min: 0.00000
robust accuracy: 0.60%
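
(For reference, 4/255 ≈ 0.01569, so the reported max Linf perturbation matches the requested eps budget.)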

jS5t3r commented 2 years ago

Closed.