bethgelab / foolbox

A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX
https://foolbox.jonasrauber.de
MIT License
2.79k stars, 426 forks

About the pgd attacks #704

Closed. caoyingnwpu closed this issue 1 year ago.

caoyingnwpu commented 1 year ago

Hello,

First of all, thank you very much for publishing this package; it is really helpful. I am getting inconsistent results between Foolbox and my own code. When I test the robustness of my model against pgd_linf attacks bounded by epsilon = 0.3 on the MNIST dataset, I get 89% accuracy with the following PyTorch code:

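import torch
import torch.nn as nn

def pgd_linf(model, x, y, epsilon, alpha=0.01, number_iter=40, random_restart=True):
    """Return an L-inf bounded perturbation delta for inputs x with labels y."""
    model.eval()
    if random_restart:
        # Start from a uniformly random point inside the epsilon ball.
        delta = torch.zeros_like(x).uniform_(-epsilon, epsilon)
        delta.requires_grad = True
    else:
        delta = torch.zeros_like(x, requires_grad=True)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(number_iter):
        # The loss is evaluated on the input clamped to the valid pixel range [0, 1].
        loss = loss_fn(model((x + delta).clamp(0, 1)), y)
        loss.backward()
        # Signed gradient ascent step (alpha is scaled relative to epsilon = 0.3),
        # followed by projection back onto the epsilon ball.
        delta.data = (delta.data + (epsilon / 0.3) * alpha * delta.grad.detach().sign()).clamp(-epsilon, epsilon)
        delta.grad.zero_()
    return delta.detach()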
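
A note on the step size in the update above: the factor (epsilon/0.3)*alpha makes the step proportional to the budget, so it equals alpha only when epsilon = 0.3. The sketch below just spells out that arithmetic. The helper names are hypothetical, and the relative step size of 0.01/0.3 is an assumption about Foolbox's LinfPGD default that should be checked against the installed version.

```python
def scaled_step(epsilon, alpha=0.01, reference_epsilon=0.3):
    """Step size used by the custom loop: alpha rescaled relative to epsilon = 0.3."""
    return (epsilon / reference_epsilon) * alpha


def relative_step(epsilon, rel_stepsize=0.01 / 0.3):
    """Step size expressed as a fraction of epsilon (rel_stepsize is an assumed default)."""
    return rel_stepsize * epsilon


if __name__ == "__main__":
    for eps in (0.1, 0.2, 0.3):
        # Both conventions should give the same absolute step for every epsilon.
        print(eps, scaled_step(eps), relative_step(eps))
```

If that assumption holds, the two step sizes agree for every epsilon, so the step size alone would not explain a gap between the two results.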

However, I get only 81% accuracy when I test the robustness of the same model against pgd_linf attacks using Foolbox, with the following code:

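import foolbox as fb
import torch

# `attack`, `model`, `test_loader`, and `device` are defined elsewhere; `attack` is
# presumably a Foolbox L-inf PGD attack such as fb.attacks.LinfPGD() (assumption).
model.eval()
fmodel = fb.PyTorchModel(model, bounds=(0, 1))
total_err = 0
for X, y in test_loader:
    X, y = X.to(device), y.to(device)
    raw, clipped, is_adv = attack(fmodel, X, y, epsilons=0.3)
    total_err += torch.sum(is_adv.float())
# Fraction of test inputs for which the attack succeeded (an error rate, not an accuracy).
print((total_err / len(test_loader.dataset)).cpu())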

I have tested the robustness of the model against both Foolbox PGD and my own code multiple times, and every time I get around 10% lower accuracy with Foolbox PGD, so I do not think this is due to randomness. Can you please help me figure out why the difference occurs? Is there anything wrong in my code or in the way I use Foolbox? Thanks.
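
Before comparing the two numbers, it may help to put them on the same metric: the Foolbox loop prints the fraction of inputs flagged by is_adv (an error rate), whereas the first figure is an accuracy. A minimal sketch of the conversion follows, assuming that inputs which are already misclassified also count as adversarial; the helper names are hypothetical.

```python
def robust_accuracy(is_adv_flags):
    """Robust accuracy = 1 - attack success rate over all evaluated inputs."""
    flags = [bool(f) for f in is_adv_flags]
    if not flags:
        raise ValueError("need at least one sample")
    return 1.0 - sum(flags) / len(flags)


def accuracy_gap(acc_a, acc_b):
    """Difference between two accuracies, as a fraction (0.01 = one percentage point)."""
    return acc_a - acc_b


if __name__ == "__main__":
    # Toy flags: the attack succeeded on 1 of 4 inputs.
    print(robust_accuracy([True, False, False, False]))
    # The two accuracies reported in this issue.
    print(accuracy_gap(0.89, 0.81))
```

With the figures quoted above, the gap is about 8 percentage points.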

zimmerrol commented 1 year ago

This seems more like an issue in your implementation of PGD than a problem with Foolbox. Sadly, I don't have the resources to help you with that (as you can most likely tell from my very late response, sorry for that!).
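
One candidate difference that may be worth checking (a guess, not a confirmed diagnosis): the custom loop differentiates through clamp(0, 1), and autograd passes zero gradient for entries outside the clamp range. A pixel whose random start lands outside [0, 1] then has sign(grad) = 0 and never moves. A common alternative clips the adversarial input itself back into the valid range after every step, so no clamp sits in the gradient path. The pure-Python toy below mimics both schemes for a single pixel with a hand-coded gradient. The function names and the linear loss are illustrative assumptions, and nothing here calls PyTorch or Foolbox.

```python
def clamp(v, lo, hi):
    return max(lo, min(hi, v))


def sign(v):
    return (v > 0) - (v < 0)


def pgd_grad_through_clamp(x, delta0, epsilon=0.3, alpha=0.01, steps=40, w=1.0):
    """Toy loop that differentiates through clamp(x + delta, 0, 1).

    Loss = w * clamp(x + delta, 0, 1); its derivative with respect to delta
    is w while x + delta lies in [0, 1] and 0 otherwise.
    """
    delta = delta0
    for _ in range(steps):
        inside = 0.0 <= x + delta <= 1.0
        grad = w if inside else 0.0
        delta = clamp(delta + alpha * sign(grad), -epsilon, epsilon)
    return clamp(x + delta, 0.0, 1.0)


def pgd_clip_input(x, delta0, epsilon=0.3, alpha=0.01, steps=40, w=1.0):
    """Toy loop that keeps the adversarial input itself inside [0, 1]."""
    x_adv = clamp(x + delta0, 0.0, 1.0)
    for _ in range(steps):
        grad = w  # derivative of w * x_adv; no clamp in the gradient path
        x_adv = x_adv + alpha * sign(grad)
        x_adv = x + clamp(x_adv - x, -epsilon, epsilon)  # project onto the epsilon ball
        x_adv = clamp(x_adv, 0.0, 1.0)  # clip to the valid pixel range
    return x_adv


if __name__ == "__main__":
    # Background pixel (x = 0) whose random start is negative.
    print(pgd_grad_through_clamp(0.0, -0.2))
    print(pgd_clip_input(0.0, -0.2))
```

In this toy, the first scheme leaves the pixel at 0.0 while the second pushes it to the full 0.3 budget. On MNIST, where many pixels are exactly 0 or 1, a uniform random start would put a large share of pixels in that stuck state, which could make the custom attack weaker. Whether that accounts for the gap reported here would need to be verified on the actual model.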