fra31 / auto-attack

Code for "Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks"
https://arxiv.org/abs/2003.01690
MIT License

Share the evaluation code for "Overfitting in adversarially robust deep learning" as in your paper #13

Closed d12306 closed 4 years ago

d12306 commented 4 years ago

Hi @fra31, thanks for releasing the code for evaluating various defense methods. However, I am curious about one defense in your released table (Rice et al. 2020, "Overfitting in adversarially robust deep learning"): they train the model adversarially with data normalization, so directly using the current AutoAttack code cannot reproduce the result in your table, since it assumes there is no such normalization. This causes problems when AA generates adversarial examples.

Could you please share the code for evaluating this defense? Or did you retrain their model without normalization?

Thanks,

fra31 commented 4 years ago

Hi,

many models use different kinds of normalization. For these cases, it is sufficient to include the normalization step in the forward pass, as the first operation applied to the input, so that the new model takes data in [0, 1]^d. This can be done in many ways, e.g. with a model wrapper like

import torch
from wideresnet import WideResNet  # WideResNet definition from Rice et al.'s code (import path may differ)

class Rice2020OverfittingNet(WideResNet):
    def __init__(self, depth, num_classes, widen_factor):
        super(Rice2020OverfittingNet, self).__init__(depth=depth, num_classes=num_classes, widen_factor=widen_factor)
        # CIFAR-10 per-channel mean and std used to train the model
        self.mu = torch.Tensor([0.4914, 0.4822, 0.4465]).float().view(3, 1, 1).cuda()
        self.sigma = torch.Tensor([0.2471, 0.2435, 0.2616]).float().view(3, 1, 1).cuda()

    def forward(self, x):
        # normalize inside the model, so that the attack sees inputs in [0, 1]
        x = (x - self.mu) / self.sigma
        return super(Rice2020OverfittingNet, self).forward(x)

The values for the normalization are from here.
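
For reference, a minimal usage sketch; the depth and widen_factor below assume the WideResNet-34-20 checkpoint of Rice et al., and the checkpoint path is only a placeholder, so check both against the released files:

model = Rice2020OverfittingNet(depth=34, num_classes=10, widen_factor=20)
# placeholder path, adjust to the checkpoint released by Rice et al.
model.load_state_dict(torch.load('cifar10_wrn34_20_linf.pth'))
model.cuda().eval()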

Let me know if this works for you!

d12306 commented 4 years ago

@fra31 , many thanks for this wrapper, clear and neat! I will close this issue! Great job!

Hrushikesh-github commented 3 years ago

@fra31 doesn't this class multiply the gradient of the loss w.r.t. the image by a factor of 1 / self.sigma, which would increase the magnitude of the gradients? When I used the above class and attacked a model with FGSM, I got lower accuracy, most likely due to the increased gradient values.

fra31 commented 3 years ago

Hi,

the scale of the gradient is in general not a problem, since it's usually normalized at some point (e.g. one takes the sign of the gradient in FGSM) and sigma is strictly positive. There are cases where the scale of the logits is problematic when applying the cross-entropy loss in PGD-based attacks (see Sec. 4 here).
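
As a concrete check, here is a sketch of a single FGSM step on the wrapped model (model, x, y stand for a wrapped classifier and a batch of inputs/labels in [0, 1]); the positive 1/sigma factor disappears as soon as the sign is taken:

import torch
import torch.nn.functional as F

x.requires_grad_(True)
loss = F.cross_entropy(model(x), y)
grad = torch.autograd.grad(loss, x)[0]
# the 1/sigma factor only rescales grad per channel; since sigma > 0, grad.sign() is unaffected
x_adv = (x + 8 / 255 * grad.sign()).clamp(0.0, 1.0)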

Also, the normalization just rescales the input to the range which the model expects. How did you apply the attack without it?

Hrushikesh-github commented 3 years ago

Hi @fra31, thank you very much for the reply, it was helpful. I realized that I made a mistake with my attacks: I normalized my CIFAR dataset using the above mean and std and fed the normalized images to the model. This results in an input range of roughly [-1.99, 2.13] (a span of about 4.1), which is larger than the standard [0, 1] input space, and I forgot to rescale the corresponding epsilon of 8/255 (which is defined for inputs in [0, 1]). Anyhow, thanks again.
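
The numbers can be checked directly; a short sketch using the CIFAR-10 statistics quoted above, showing the range of normalized inputs and the per-channel epsilon one would need in that space:

import torch

mu = torch.tensor([0.4914, 0.4822, 0.4465])
sigma = torch.tensor([0.2471, 0.2435, 0.2616])

lo = (0.0 - mu) / sigma  # ~[-1.99, -1.98, -1.71]
hi = (1.0 - mu) / sigma  # ~[ 2.06,  2.13,  2.12]

# attacking in the normalized space would need a per-channel epsilon;
# normalizing inside the model keeps inputs in [0, 1] and eps = 8/255
eps_normalized = (8 / 255) / sigma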

ScarlettChan commented 2 years ago

Hello, your email has been received!

rohit-gupta commented 2 years ago

I built on the solution suggested by @fra31 and simplified it to work with most torch nn.Module models, which is important for my use case since I am evaluating a range of different models.

import torch

class ModelNormWrapper(torch.nn.Module):
    def __init__(self, model, means, stds):
        super(ModelNormWrapper, self).__init__()
        self.model = model
        # per-channel statistics the wrapped model was trained with
        self.means = torch.Tensor(means).float().view(3, 1, 1).cuda()
        self.stds = torch.Tensor(stds).float().view(3, 1, 1).cuda()

    def forward(self, x):
        # normalize inside the wrapper, so the attack can keep inputs in [0, 1]
        x = (x - self.means) / self.stds
        return self.model.forward(x)
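
A possible way to use this wrapper with the attacks in this repository; base_model, x_test and y_test are placeholders, and the AutoAttack call follows the interface shown in the README:

import torch
from autoattack import AutoAttack

model = ModelNormWrapper(base_model,
                         means=[0.4914, 0.4822, 0.4465],
                         stds=[0.2471, 0.2435, 0.2616]).cuda().eval()

adversary = AutoAttack(model, norm='Linf', eps=8 / 255, version='standard')
x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=250)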