Hi,
many models use different kinds of normalization. For these cases, it is sufficient to include the normalization step in the forward pass, as the first operation applied to the input, so that the new model takes data in [0, 1]^d. This can be done in many ways, e.g. with a model wrapper like
class Rice2020OverfittingNet(WideResNet):
    def __init__(self, depth, num_classes, widen_factor):
        super(Rice2020OverfittingNet, self).__init__(
            depth=depth, num_classes=num_classes, widen_factor=widen_factor)
        # per-channel CIFAR-10 statistics used for training
        self.mu = torch.Tensor([0.4914, 0.4822, 0.4465]).float().view(3, 1, 1).cuda()
        self.sigma = torch.Tensor([0.2471, 0.2435, 0.2616]).float().view(3, 1, 1).cuda()

    def forward(self, x):
        # normalize inside the model, so callers pass inputs in [0, 1]
        x = (x - self.mu) / self.sigma
        return super(Rice2020OverfittingNet, self).forward(x)
The values for the normalization are from here.
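A minimal usage sketch of the wrapped model with AutoAttack (the checkpoint path and the WideResNet hyperparameters below are only placeholders, not taken from this thread; x_test and y_test are assumed to be CIFAR-10 test tensors in [0, 1]):

from autoattack import AutoAttack

model = Rice2020OverfittingNet(depth=34, num_classes=10, widen_factor=20)
model.load_state_dict(torch.load('rice2020_overfitting.pt'))  # hypothetical checkpoint path
model = model.cuda().eval()

# run the standard AutoAttack evaluation directly on unnormalized [0, 1] inputs
adversary = AutoAttack(model, norm='Linf', eps=8 / 255)
x_adv = adversary.run_standard_evaluation(x_test, y_test, bs=128)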
Let me know if this works for you!
@fra31 , many thanks for this wrapper, clear and neat! I will close this issue! Great job!
@fra31 doesn't this class multiply the gradient of the loss w.r.t. the image by a factor of 1 / self.sigma, which would increase the magnitude of the gradients? When I used the above class and attacked a model with FGSM, I got lower accuracy, most probably due to the larger gradient values.
Hi,
the scale of the gradient is in general not a problem, since it's usually normalized at some point (e.g. one takes the sign of the gradient in FGSM) and sigma
is strictly positive. There are cases where the scale of the logits is problematic when applying the cross-entropy loss in PGD-based attacks (see Sec. 4 here).
Also, the normalization just rescales the input to the range which the model expects. How did you apply the attack without it?
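To make the FGSM point concrete, here is a minimal sketch (not part of the original discussion; model is any classifier taking inputs in [0, 1] and the loss is cross-entropy). FGSM only uses the sign of the gradient, so a strictly positive per-channel factor like 1 / sigma cannot change the resulting perturbation:

import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=8 / 255):
    # gradient of the loss w.r.t. the input in [0, 1] space
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    # sign(grad) == sign(grad / sigma) elementwise for any sigma > 0,
    # so the 1 / sigma factor introduced by the wrapper has no effect here
    return (x + eps * grad.sign()).clamp(0.0, 1.0).detach()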
Hi @fra31, thank you very much for the reply, it was helpful. I realized that I made a mistake with my attacks: I normalized my CIFAR dataset using the above mean and std and fed the normalized images to the model. This gives an input space of roughly [-1.99, 2.13], which is larger than the standard [0, 1] range, and I forgot to rescale the corresponding epsilon of 8/255 (which is defined for inputs in [0, 1]) accordingly. Anyhow, thanks again.
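As a quick arithmetic sketch of that point (using the channel statistics from the wrapper above), the bounds of the normalized input space and the per-channel epsilon that 8/255 in [0, 1] space corresponds to are:

import torch

mu = torch.tensor([0.4914, 0.4822, 0.4465])
sigma = torch.tensor([0.2471, 0.2435, 0.2616])

print((0.0 - mu) / sigma)  # per-channel lower bounds, about [-1.99, -1.98, -1.71]
print((1.0 - mu) / sigma)  # per-channel upper bounds, about [2.06, 2.13, 2.12]
print((8 / 255) / sigma)   # eps in normalized space, about [0.13, 0.13, 0.12]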
I built on the solution suggested by @fra31 and simplified it to work with most torch.nn.Module models, which is important for my use case since I am evaluating a range of different models:
class ModelNormWrapper(torch.nn.Module):
    def __init__(self, model, means, stds):
        super(ModelNormWrapper, self).__init__()
        self.model = model
        # per-channel statistics of the data the wrapped model was trained on
        self.means = torch.Tensor(means).float().view(3, 1, 1).cuda()
        self.stds = torch.Tensor(stds).float().view(3, 1, 1).cuda()

    def forward(self, x):
        # normalize inside the wrapper, so callers pass inputs in [0, 1]
        x = (x - self.means) / self.stds
        return self.model.forward(x)
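A brief usage sketch (some_model is a placeholder for any torch.nn.Module trained on normalized CIFAR-10 inputs; the statistics are the ones quoted earlier in the thread):

wrapped = ModelNormWrapper(some_model,
                           means=[0.4914, 0.4822, 0.4465],
                           stds=[0.2471, 0.2435, 0.2616]).eval()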
Hi @fra31, thanks for releasing the code for evaluating various defense methods. However, I am curious about one defense in your released table (Rice et al., 2020, "Overfitting in Adversarially Robust Deep Learning"): they actually train the model adversarially with data normalization, so directly using the current AutoAttack code cannot reproduce the result in your table, since it assumes there is no such normalization. This causes problems when AA generates adversarial examples.
Could you please share the code for evaluating this defense? Or did you retrain their model without normalization?
Thanks,