Adversarial examples are equal to the originals

TheFieryLynx commented 3 years ago

Hi. I ran AutoAttack using your example autoattack/examples/eval.py:

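import os

import numpy as np
import torch
import torch.utils.data as data
import torchvision.datasets as datasets
import torchvision.models as models
import torchvision.transforms as transforms

data_dir = './data_CIFAR10'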
save_dir = './results_data_CIFAR10'
norm = 'Linf'
epsilon = 0.5
log_path = './log_file.txt'
version = 'standard'
individual = True  # was 'store_true', an argparse action name that is only truthy as a string
n_ex = 100
batch_size = 500

model = models.resnet18(pretrained=True)
model.cuda()
model.eval()

# load data
transform_list = [transforms.ToTensor()]
transform_chain = transforms.Compose(transform_list)
item = datasets.CIFAR10(root=data_dir, train=False, transform=transform_chain, download=True)
test_loader = data.DataLoader(item, batch_size=1000, shuffle=False, num_workers=0)

# create save dir
if not os.path.exists(save_dir):
    os.makedirs(save_dir)

# load attack    
from autoattack import AutoAttack
adversary = AutoAttack(model, norm=norm, eps=epsilon, log_path=log_path, version=version)

l = [x for (x, y) in test_loader]
x_test = torch.cat(l, 0)
l = [y for (x, y) in test_loader]
y_test = torch.cat(l, 0)

# example of custom version
if version == 'custom':
    adversary.attacks_to_run = ['apgd-ce', 'fab']
    adversary.apgd.n_restarts = 2
    adversary.fab.n_restarts = 2

# run attack and save images
with torch.no_grad():
    if not individual:
        adv_complete = adversary.run_standard_evaluation(x_test[:n_ex], y_test[:n_ex], bs=batch_size)
        torch.save({'adv_complete': adv_complete}, '{}.pth'.format(save_dir))

    else:
        # individual version, each attack is run on all test points
        adv_complete = adversary.run_standard_evaluation_individual(x_test[:n_ex],
            y_test[:n_ex], bs=batch_size)
        torch.save(adv_complete, '{}.pth'.format(save_dir))

But the resulting adversarial examples are equal to the original inputs:

for key in adv_complete.keys():
    print(f"{key}: {np.all(adv_complete[key][0].numpy() == x_test[0].numpy())}")

>> apgd-ce: True
>> apgd-t: True
>> fab-t: True
>> square: True

I tried both epsilon = 8./255. and epsilon = 0.5, and the result did not change. Could you please explain where I went wrong?
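An exact-equality test only says whether anything changed at all. A slightly more informative check is the largest absolute difference between the two images, which for an Linf attack should be greater than zero and at most epsilon. Below is a minimal sketch in plain Python, assuming both images have been flattened to lists of floats (for example with `tensor.flatten().tolist()`); the names `linf_distance` and `describe_perturbation` are hypothetical helpers, not part of AutoAttack.

```python
def linf_distance(x, x_adv):
    """Largest absolute per-element difference between two flat sequences."""
    if len(x) != len(x_adv):
        raise ValueError("inputs must have the same number of elements")
    return max(abs(a - b) for a, b in zip(x, x_adv))


def describe_perturbation(x, x_adv, eps, tol=1e-9):
    """Classify x_adv as unchanged, within the eps ball, or outside it."""
    dist = linf_distance(x, x_adv)
    if dist == 0:
        return "unchanged"
    # Small tolerance so floating-point rounding does not flag a valid perturbation.
    if dist <= eps + tol:
        return "perturbed within eps"
    return "perturbed beyond eps"
```

On the tensors from the script above, the same quantity could be read off directly with `(adv_complete[key][0] - x_test[0]).abs().max().item()`.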

fra31 commented 3 years ago

Hi,

I tried to run the example with the same parameters as yours, and I get the perturbed images as output. Is the log fine (all attacks should give robust accuracy of 0% for epsilon = 8/255)?

TheFieryLynx commented 3 years ago

Yes, I cloned the latest version of the code and set epsilon = 8./255. Here are my logs:

Files already downloaded and verified
setting parameters for standard version
using standard version including apgd-ce, apgd-t, fab-t, square
robust accuracy by APGD-CE   0.00%   (time attack: 0.0 s)
robust accuracy by APGD-T    0.00%   (time attack: 0.0 s)
robust accuracy by FAB-T     0.00%   (time attack: 0.0 s)
robust accuracy by SQUARE    0.00%   (time attack: 0.0 s)

But I still get tensors that are equal to the input. Did you use the latest version of the code in your test?

fra31 commented 3 years ago

Yeah. I think there's some problem in the loading of your model, since the runtime is 0.0 s, which suggests that all images are already misclassified. Could you please check the clean accuracy of the classifier?
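For context, as far as I can tell AutoAttack only perturbs points that the model classifies correctly to begin with, so when clean accuracy is 0% every attack finishes immediately and the inputs come back unchanged. The clean-accuracy check could be sketched as follows. This is a framework-agnostic outline rather than code from the repository: `clean_accuracy` and `toy_predict` are hypothetical names, and `predict_labels` is assumed to be any callable that maps a batch of inputs to a list of predicted class indices.

```python
def clean_accuracy(predict_labels, batches):
    """Fraction of examples whose predicted label matches the true label.

    predict_labels: callable mapping a batch of inputs to predicted labels.
    batches: iterable of (inputs, labels) pairs.
    """
    correct = 0
    total = 0
    for inputs, labels in batches:
        predictions = predict_labels(inputs)
        correct += sum(1 for p, t in zip(predictions, labels) if p == t)
        total += len(labels)
    if total == 0:
        raise ValueError("no examples to evaluate")
    return correct / total


# Hypothetical stand-in classifier: predicts class 1 for inputs above 0.5, else 0.
def toy_predict(inputs):
    return [1 if value > 0.5 else 0 for value in inputs]


batches = [([0.1, 0.9], [0, 1]), ([0.7, 0.2], [0, 0])]
accuracy = clean_accuracy(toy_predict, batches)  # 3 of 4 predictions match
```

With the PyTorch model from the script, `predict_labels` could wrap something like `model(x.cuda()).argmax(dim=1).tolist()`, with the labels converted via `y.tolist()`. A value near zero on clean data would point to a problem with the model or the labels rather than with the attack.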

TheFieryLynx commented 3 years ago

Oh, sorry. I found my error. I used a model pretrained on ImageNet, but called AutoAttack with y_test from CIFAR-10. Once I fixed that, everything started working fine.
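A mismatch like this can be caught before running any attack by comparing the number of model outputs with the number of dataset classes. The sketch below assumes both numbers are already known; `check_label_space` is a hypothetical helper, not something AutoAttack provides.

```python
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def check_label_space(num_model_outputs, class_names):
    """Raise ValueError if the model's output size differs from the class count."""
    num_classes = len(class_names)
    if num_model_outputs != num_classes:
        raise ValueError(
            f"model has {num_model_outputs} outputs but the dataset has "
            f"{num_classes} classes; labels and predictions are not comparable"
        )


# A 10-way classifier matches CIFAR-10; a 1000-way ImageNet classifier would raise.
check_label_space(10, CIFAR10_CLASSES)
```

For the script in this thread, the two numbers could be taken from `model.fc.out_features` (1000 for torchvision's ImageNet-pretrained resnet18) and `len(item.classes)` (10 for CIFAR-10), assuming those attributes are available in the installed torchvision version.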

fra31 / auto-attack

Adversarial examples are equal to the originals #64