bethgelab / foolbox

A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX
https://foolbox.jonasrauber.de
MIT License

CWL2 attack performance on MNIST #533

jeffreyzpan closed this issue 4 years ago

jeffreyzpan commented 4 years ago

Hi, I'm running Foolbox 3.0.0b1 with PyTorch 1.2.0 on MNIST using the code snippet below. For some reason, CWL2 seems to perform very poorly compared to FGSM: my model still classifies CWL2-perturbed images with ~97% accuracy, whereas FGSM reduces its accuracy to ~34%. Is there an issue with how I'm calling the attack?

My code is below; I can post a more comprehensive snippet if necessary. Thanks!

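import torch
import torch.nn as nn
import torch.backends.cudnn as cudnn
import torch.utils.data as data
from torchvision import datasets, transforms
import foolbox as fb

# Note: `path`, `mnistCNN`, and `validate` are defined elsewhere (not shown).
train_loader = data.DataLoader(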
    datasets.MNIST(path, True, transforms.Compose([
            transforms.RandomResizedCrop(28),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
    ])),
    batch_size=128, shuffle=True,
    num_workers=16, pin_memory=True)

val_loader = data.DataLoader(
    datasets.MNIST(path, False, transforms.Compose([
        transforms.Resize(int(28 / 0.875)),
        transforms.CenterCrop(28),
        transforms.ToTensor(),
    ])),
    batch_size=128, shuffle=False,
    num_workers=16, pin_memory=True)

n_class = 10
criterion = nn.CrossEntropyLoss()

model = mnistCNN()
checkpoint = torch.load('model_best.pth.tar')
checkpoint['state_dict'] = {n.replace('module.', ''): v for n, v in checkpoint['state_dict'].items()}
model.load_state_dict(checkpoint['state_dict'])
model = torch.nn.DataParallel(model, device_ids=[0,1,2,3])

model.cuda()
criterion.cuda()

cudnn.benchmark = True

classifier = fb.PyTorchModel(model.eval(), (0,1))
epsilons = [16/255]

attack_list = {}
attack_list['cwl2'] = fb.attacks.L2CarliniWagnerAttack()
attack_list['fgsm'] = fb.attacks.FGSM()

for attack_name, attack in attack_list.items():
    print('running {}'.format(attack_name))
    adv_list = []
    target_list = []
    for i, (inputs, targets) in enumerate(val_loader):
        inputs = inputs.cuda(non_blocking=True)
        target = targets.cuda(non_blocking=True)
        _, adv, success = attack(classifier, inputs, target, epsilons=epsilons)
        adv_list.append(adv[0].cpu())
        target_list.append(target.cpu())
        break # just try one batch for testing sake

    # torch.cat documents its concatenation axis as `dim`
    adv_examples = torch.cat(adv_list, dim=0)
    val_labels = torch.cat(target_list, dim=0)

    # Convert the adversarial images to a PyTorch DataLoader for validation

    adv_set = torch.utils.data.TensorDataset(adv_examples, val_labels)
    adv_loader = torch.utils.data.DataLoader(adv_set, batch_size=128, num_workers=16)
    acc, _ = validate(adv_loader, model, criterion)
    print(acc)

jeffreyzpan commented 4 years ago

I figured out that the issue was with my input epsilons; there is no problem with the implementation.
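
The thread does not say exactly what was wrong with the epsilons. One plausible reading, assuming Foolbox 3 measures `epsilons` in each attack's own norm (L2 for `L2CarliniWagnerAttack`, L-infinity for `FGSM`), is that 16/255 is a far smaller budget in L2 than in L-infinity on a 28x28 image. The sketch below only does that arithmetic; the helper name `linf_to_max_l2` is hypothetical and not part of Foolbox.

```python
import math


def linf_to_max_l2(eps_linf: float, num_pixels: int) -> float:
    """Largest L2 norm a perturbation can reach if every pixel moves by at most eps_linf.

    Hypothetical helper for illustration; not a Foolbox function.
    """
    return eps_linf * math.sqrt(num_pixels)


eps = 16 / 255          # the budget used in the snippet above
num_pixels = 28 * 28    # one single-channel MNIST image

max_l2 = linf_to_max_l2(eps, num_pixels)
print(f"L-inf budget per pixel:        {eps:.4f}")
print(f"largest L2 norm it can reach:  {max_l2:.4f}")
print(f"ratio (sqrt of pixel count):   {max_l2 / eps:.0f}")
```

Under that assumption, an FGSM perturbation at 16/255 can have an L2 norm up to 28 times larger than the L2 budget given to CW, so passing the same number to both attacks does not compare them on equal terms.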

HaoerSlayer commented 4 years ago

Could you share your hyperparameters? I'm running into the same problem: CW does not seem to work.
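
The thread does not record the hyperparameters that were eventually used. As a rough sketch of how a minimization attack such as CW can be evaluated across several budgets, the snippet below computes robust accuracy from per-image perturbation norms. The function `robust_accuracy` and all numbers are made up for illustration and are not taken from the issue or from Foolbox.

```python
from typing import List, Sequence


def robust_accuracy(
    norms: Sequence[float],
    is_adv: Sequence[bool],
    epsilons: Sequence[float],
) -> List[float]:
    """Fraction of inputs NOT broken within each epsilon budget.

    An input counts as broken at a budget only if the attack found an
    adversarial example AND its perturbation norm fits inside that budget.
    """
    n = len(norms)
    result = []
    for eps in epsilons:
        broken = sum(1 for d, adv in zip(norms, is_adv) if adv and d <= eps)
        result.append(1.0 - broken / n)
    return result


# Hypothetical per-image L2 perturbation norms from a minimization attack.
norms = [0.8, 1.2, 1.5, 2.1, 3.0]
is_adv = [True, True, True, True, False]
epsilons = [16 / 255, 1.0, 2.0, 3.0]

for eps, acc in zip(epsilons, robust_accuracy(norms, is_adv, epsilons)):
    print(f"eps={eps:.3f}  robust accuracy={acc:.2f}")
```

With these invented numbers, no perturbation fits inside the smallest budget, so the attack looks ineffective there even though it succeeds on most inputs at larger budgets.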