bethgelab / foolbox

A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX
https://foolbox.jonasrauber.de
MIT License

PyTorchModel.gradients() returns different values depending on the batch size #439

Closed samuelemarro closed 4 years ago

samuelemarro commented 4 years ago

OS: Windows 10
Python Version: 3.7.1
Foolbox Version: 2.3.0
Torch Version: 1.4.0 (CUDA 10.1)

When I call .gradient() with a batch of size b, the returned gradient is always the true gradient divided by b.

Code to reproduce:

import torchvision
import numpy as np
import foolbox

batch_size = 1

# Use a pretrained model
torch_model = torchvision.models.resnet50(pretrained=True)
torch_model.eval()

# Prepare the image and the label
image, label = foolbox.utils.imagenet_example()
image = np.moveaxis(image, 2, 0)
image = image / 255
label = np.array(label)

# Create a fake batch
images = np.repeat(image[np.newaxis], batch_size, axis=0)
labels = np.repeat(label[np.newaxis], batch_size, axis=0)

# Create a PyTorchModel
mean = np.array([0.485, 0.456, 0.406]).reshape((3, 1, 1))
stdevs = np.array([0.229, 0.224, 0.225]).reshape((3, 1, 1))
model = foolbox.models.PyTorchModel(torch_model, bounds=(0, 1), num_classes=1000,
                                    preprocessing=(mean, stdevs))

# Compute the gradients
grads = model.gradient(images, labels)

print(grads[0].mean())

Setting batch_size = 10 makes the returned gradients exactly 1/10 of the original ones. The problem also appears with .forward_and_gradient().

A possible cause is the reduction used by nn.CrossEntropyLoss(): by default, CrossEntropyLoss returns the mean of the loss across the whole batch, so each sample's gradient is scaled by 1/batch_size. Using nn.CrossEntropyLoss(reduction='sum') fixes the problem, but I don't know whether that is a legitimate solution or just a workaround.
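
To see the effect in isolation, here is a small standalone PyTorch sketch (a toy linear model, not Foolbox's internal code) that compares the per-sample input gradient under reduction='mean' and reduction='sum':

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 3)
x = torch.randn(1, 4)

for b in (1, 10):
    xb = x.repeat(b, 1).requires_grad_(True)  # same sample repeated b times
    yb = torch.zeros(b, dtype=torch.long)     # dummy target class

    # reduction='mean' (the default) averages the loss over the batch,
    # so the gradient w.r.t. each individual sample is scaled by 1/b
    loss_mean = nn.CrossEntropyLoss(reduction='mean')(model(xb), yb)
    grad_mean, = torch.autograd.grad(loss_mean, xb)

    # reduction='sum' keeps each sample's gradient independent of b
    loss_sum = nn.CrossEntropyLoss(reduction='sum')(model(xb), yb)
    grad_sum, = torch.autograd.grad(loss_sum, xb)

    print(b, grad_mean[0].abs().sum().item(), grad_sum[0].abs().sum().item())

With reduction='mean' the printed value shrinks by 1/b; with reduction='sum' it stays constant.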

jonasrauber commented 4 years ago

Does this lead to an actual problem with an attack? gradient() is not really a user-facing interface; it is intended to be used by the attacks.

samuelemarro commented 4 years ago

In DeepFool, the magnitude of the perturbation is inversely proportional to the norm of the gradient difference. This means that a smaller gradient makes DeepFool seriously overshoot. For example, DeepFool with a batch size of 50-100 returns unrecognizable images on CIFAR-10.
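
For intuition: the linearized DeepFool step is r = (|Δf| / ||w||²) · w, where w is the difference of the two gradients, so if w is scaled down by 1/b, the norm of r grows by a factor of b. A small numeric sketch (illustrative only, not Foolbox's DeepFool implementation):

import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=1000)  # stand-in for the true gradient difference
delta_f = 0.5              # stand-in for the logit difference

def deepfool_step(w, delta_f):
    # linearized DeepFool step towards the decision boundary
    return abs(delta_f) / (np.linalg.norm(w) ** 2) * w

for b in (1, 10, 50):
    r = deepfool_step(w / b, delta_f)  # gradient scaled down by the bug
    print(b, np.linalg.norm(r))        # perturbation norm grows linearly in b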

jonasrauber commented 4 years ago

Thanks for reporting this. I have not yet looked at it in detail, but it might be a problem that was introduced with the batch support in 2.0.

jonasrauber commented 4 years ago

Thanks again, this is indeed a bug and will be fixed in the next release.

jonasrauber commented 4 years ago

Your proposed fix is correct: nn.CrossEntropyLoss(reduction='sum')
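
For reference, a minimal sketch of what the fixed gradient computation amounts to (names and structure are illustrative, not Foolbox's actual internals):

import torch
import torch.nn as nn

def batch_input_gradient(model, images, labels):
    # images: float tensor of shape (b, C, H, W); labels: long tensor of shape (b,)
    images = images.clone().requires_grad_(True)
    logits = model(images)
    # reduction='sum' keeps each sample's gradient independent of the batch size
    loss = nn.CrossEntropyLoss(reduction='sum')(logits, labels)
    loss.backward()
    return images.grad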

jonasrauber commented 4 years ago

released 2.4.0 with the fix