Only one pixel is perturbed in each iteration?

machanic commented 3 years ago

I read your paper and your code. It seems that only one pixel is perturbed in each iteration; am I correct? https://github.com/cg563/simple-blackbox-attack/blob/master/simba_single.py#L19 After reading your paper, I thought the perturbation was random over the entire image. But the code implements a per-pixel perturbation algorithm, which I think is an inefficient way to attack.

cg563 commented 3 years ago

Yes you're right, only a single pixel (and a single color channel within the pixel) is perturbed per iteration.

This is part of the surprise with this method: you don't need to search over the whole image to find where to perturb, because perturbing almost any pixel can reduce the model's confidence a little bit at a time. There is also a DCT version, where we perturb along random basis vectors in a low-dimensional subspace. Hope this helps.
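For readers following along, here is a minimal sketch of the one-coordinate-per-iteration loop described above. It is not the repository's `simba_single.py`: `toy_prob` is a hypothetical stand-in for the model's confidence in the true label, the "image" is a flat list of floats, and clipping to the valid pixel range is left out.

```python
import math
import random


def toy_prob(x, w):
    """Hypothetical stand-in for the model's confidence in the true label."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))


def simba_single_coordinate(x, w, eps=0.2, max_iters=20, seed=0):
    """Perturb one coordinate per iteration, keeping a step only if confidence drops."""
    rng = random.Random(seed)
    x = list(x)
    order = list(range(len(x)))
    rng.shuffle(order)  # each coordinate is visited at most once
    best = toy_prob(x, w)
    changed = []
    for i in order[:max_iters]:
        for sign in (1.0, -1.0):  # try +eps first, then -eps
            cand = list(x)
            cand[i] += sign * eps
            p = toy_prob(cand, w)
            if p < best:  # accept only if the confidence decreases
                x, best = cand, p
                changed.append(i)
                break
    return x, best, changed
```

Each accepted step touches exactly one coordinate, so the perturbation is built up one small change at a time rather than searched for over the whole image.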

machanic commented 3 years ago

Does the DCT version also perturb one pixel in each iteration?

cg563 commented 3 years ago

DCT spreads the perturbation across the image. However, I would like to emphasize that there's no fundamental difference between changing a single pixel by epsilon versus perturbing along some random direction with norm epsilon. From our experimental observation, both perform just as well.
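To illustrate the point about norms, here is a small sketch comparing the two kinds of step. The helper names are hypothetical and the random direction is just a normalized Gaussian vector, which is an assumption for illustration and not how the repository samples directions.

```python
import math
import random


def l2(v):
    """Euclidean (L2) norm of a flat vector."""
    return math.sqrt(sum(t * t for t in v))


def single_pixel_step(dim, index, eps):
    """Step that changes exactly one coordinate by eps."""
    step = [0.0] * dim
    step[index] = eps
    return step


def random_direction_step(dim, eps, seed=0):
    """Step along a random direction, rescaled to have L2 norm eps."""
    rng = random.Random(seed)
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    n = l2(g)
    return [eps * t / n for t in g]
```

Both steps have the same L2 norm (eps); they differ only in how that budget is distributed over the coordinates.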

machanic commented 3 years ago

@cg563 1. I checked the code at https://github.com/cg563/simple-blackbox-attack/blob/3f9f83224262c6dee95ffd7ef4787e36d604e580/run_simba.py#L110. It seems that the DCT version also changes a single pixel in each iteration, rather than spreading the perturbation across the image? Where is the code in which DCT perturbs all the pixels of an image?

2. Since each pixel is perturbed by 0.2, how do you ensure that the L2 norm of the whole image's perturbation stays within the budget epsilon (for example, epsilon = 1.0 in an L2-norm attack)?

machanic commented 3 years ago

What does DCT mean? Can you explain this a bit more? Thanks!

cg563 commented 3 years ago

DCT stands for discrete cosine transform, which is one of the ways to decompose an image into a set of wave functions. https://en.wikipedia.org/wiki/Discrete_cosine_transform has a good explanation for the transformation. The idea in the DCT attack is that you change the magnitude of a wave function by +/-epsilon rather than changing each pixel by +/-epsilon. You can also refer to https://arxiv.org/pdf/1809.08758.pdf for how it applies to adversarial examples.
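As a rough illustration of "changing the magnitude of a wave function", the sketch below sets a single DCT coefficient to epsilon and maps it back to pixel space. It assumes an orthonormal 2D DCT-II basis on a small n x n grid and uses hypothetical helper names; it is not the repository's `trans` function.

```python
import math


def dct_basis_1d(n, u, x):
    """Value at position x of the u-th orthonormal DCT-II basis function of length n."""
    scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
    return scale * math.cos((2 * x + 1) * u * math.pi / (2 * n))


def single_coefficient_perturbation(n, u, v, eps):
    """Pixel-space perturbation obtained by setting only DCT coefficient (u, v) to eps."""
    return [
        [eps * dct_basis_1d(n, u, x) * dct_basis_1d(n, v, y) for y in range(n)]
        for x in range(n)
    ]
```

For example, `single_coefficient_perturbation(8, 1, 2, 0.2)` changes every pixel of the 8 x 8 grid by a small amount, while the L2 norm of the whole perturbation is still 0.2 because the basis is orthonormal.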

In https://github.com/cg563/simple-blackbox-attack/blob/master/run_simba.py#L84, the function trans applies the inverse DCT transformation; it is called in https://github.com/cg563/simple-blackbox-attack/blob/master/run_simba.py#L114 and https://github.com/cg563/simple-blackbox-attack/blob/master/run_simba.py#L127 to produce low-frequency perturbations.

We do not set an upper bound on the L2 norm during the attack. You can enforce one manually by stopping the attack once the perturbation norm reaches your upper bound. Since the perturbation directions are all orthogonal, you can derive the norm directly: L2 norm = sqrt(number of pixels changed) * epsilon.
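A minimal sketch of that stopping rule, assuming each accepted step has size epsilon along a unit-length direction orthogonal to the others, and ignoring any clipping to the valid pixel range. The function names and the tolerance `tol` are hypothetical.

```python
import math


def l2_norm_after(k, eps):
    """L2 norm after k accepted steps of size eps along orthonormal directions."""
    return math.sqrt(k) * eps


def max_steps_within_budget(eps, budget, tol=1e-9):
    """Largest number of accepted steps whose total L2 norm stays within the budget."""
    k = 0
    while l2_norm_after(k + 1, eps) <= budget + tol:  # tol absorbs float rounding
        k += 1
    return k
```

With epsilon = 0.2 and a budget of 1.0, as in the question above, this allows 25 accepted changes, since sqrt(25) * 0.2 = 1.0.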

cg563 / simple-blackbox-attack

Only one pixel is perturbed in each iteration? #12