fra31 / auto-attack

Code relative to "Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks"
https://arxiv.org/abs/2003.01690
MIT License

max Linf perturbation is larger than the epsilon #86

Closed XiaoYaoYouUSTC closed 1 year ago

XiaoYaoYouUSTC commented 2 years ago

Hi @fra31, really nice job! Thanks for releasing the code for this adversarial attack. I've recently been using it to generate adversarial examples; my code is below.

import numpy as np
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

from autoattack import AutoAttack
import resnet4cifar100  # local ResNet implementation for CIFAR-100

# CIFAR-100 transforms: note that Normalize maps the images out of [0, 1]
transform_train = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])
transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
])

training_data = datasets.CIFAR100(root="data", train=True, download=True, transform=transform_train)
batch_size = 1
device = "cuda" if torch.cuda.is_available() else "cpu"

# load the pretrained classifier
model = resnet4cifar100.resnet(num_classes=100, depth=50, block_name="BasicBlock").to(device)
model.load_state_dict(torch.load("XXXX"))
shuffle_train_dataloader = DataLoader(training_data, batch_size=batch_size, shuffle=True)

# run AutoAttack (standard version) on a single batch
adversary = AutoAttack(model, norm='Linf', eps=1, version='standard', device='cuda', verbose=True)
for X, y in shuffle_train_dataloader:
    x_adv, y_adv = adversary.run_standard_evaluation(X.to(device), y.to(device), bs=batch_size, return_labels=True)
    # measure the largest Linf perturbation actually produced
    print(np.max(np.abs((x_adv.to('cpu') - X).numpy())))
    break

As you can see, when using AutoAttack I set epsilon to 1 and the norm to Linf. However, I get the output below, and the max Linf perturbation is 2.42907, which is larger than the epsilon I set. Could you please give me some suggestions? Is this because I normalize the dataset, so the input space is no longer [0, 1]?

setting parameters for standard version
using standard version including apgd-ce, apgd-t, fab-t, square
initial accuracy: 100.00%
apgd-ce - 1/1 - 1 out of 1 successfully perturbed
robust accuracy after APGD-CE: 0.00% (total time 2.5 s)
max Linf perturbation: 2.42907, nan in tensor: 0, max: 1.00000, min: 0.00000
robust accuracy: 0.00%
2.4290657
ScarlettChan commented 2 years ago

Hello, your email has been received!

fra31 commented 2 years ago

Hi,

I think the issue comes from the fact that the images are normalized when loaded. The attacks expect input in [0, 1], and any normalization should be included in the forward pass (see also https://github.com/fra31/auto-attack/issues/13).
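For illustration, a minimal sketch of that approach is below. The NormalizedModel wrapper name is just for this example, the mean/std values are taken from the transforms in the question, and model/device are reused from the snippet above; the eps value is only a commonly used Linf budget for images in [0, 1].

import torch
import torch.nn as nn

class NormalizedModel(nn.Module):
    # wraps a classifier so that it accepts inputs in [0, 1]
    # and applies the dataset normalization inside forward()
    def __init__(self, model, mean, std):
        super().__init__()
        self.model = model
        # register as buffers so the statistics move together with .to(device)
        self.register_buffer('mean', torch.tensor(mean).view(1, 3, 1, 1))
        self.register_buffer('std', torch.tensor(std).view(1, 3, 1, 1))

    def forward(self, x):
        return self.model((x - self.mean) / self.std)

# load the data with only transforms.ToTensor(), so images stay in [0, 1],
# then attack the wrapped model instead of the bare one
wrapped = NormalizedModel(model,
                          mean=(0.4914, 0.4822, 0.4465),
                          std=(0.2023, 0.1994, 0.2010)).to(device).eval()
adversary = AutoAttack(wrapped, norm='Linf', eps=8/255, version='standard', device='cuda')

With this setup the reported max Linf perturbation should stay within the chosen eps.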

Hope this helps!

ScarlettChan commented 1 year ago

Hello, your email has been received!