cg563 / simple-blackbox-attack

Code for ICML 2019 paper "Simple Black-box Adversarial Attacks"
MIT License

The query count question? #7

Closed machanic closed 3 years ago

machanic commented 4 years ago

In run_simba.py, queries_k represents the query count.

At https://github.com/cg563/simple-blackbox-attack/blob/master/run_simba.py#L127, one query is made to the target model, but queries_k is not incremented. Why? And why are there two increments, at https://github.com/cg563/simple-blackbox-attack/blob/master/run_simba.py#L117 and https://github.com/cg563/simple-blackbox-attack/blob/master/run_simba.py#L124?

machanic commented 4 years ago

Can you explain why queries = torch.zeros(batch_size, max_iters) and queries[:, k:] = torch.zeros(args.batch_size, max_iters - k)? Why that shape?

cg563 commented 4 years ago

The query count increment at https://github.com/cg563/simple-blackbox-attack/blob/master/run_simba.py#L124 is for the query at https://github.com/cg563/simple-blackbox-attack/blob/master/run_simba.py#L127 in the positive direction, but it is done before the query is made. Sorry for the confusion!

I think you're referring to the code block below for the second question:

    if remaining.sum() == 0:
        adv = (images_batch + trans(expand_vector(x, expand_dims))).clamp(0, 1)
        probs_k = get_probs(model, adv, labels_batch)
        probs[:, k:] = probs_k.unsqueeze(1).repeat(1, max_iters - k)
        succs[:, k:] = torch.ones(args.batch_size, max_iters - k)
        queries[:, k:] = torch.zeros(args.batch_size, max_iters - k)
        break

This part first checks whether all images have been misclassified, in which case it fills in the success and query count tensors before terminating. Since the algorithm was at iteration k when it terminated, the block of zeros that still needs to be filled in has shape (args.batch_size, max_iters - k). Hope this helps.
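
The padding step described above can be sketched without PyTorch. This is only an illustration under stated assumptions, not the repository's code: plain Python lists stand in for the (batch_size, max_iters) tensors, and the helper name pad_after_early_stop is hypothetical.

```python
def pad_after_early_stop(queries, succs, k):
    """Fill columns k.. of the per-iteration logs once every image has succeeded.

    queries and succs are lists of rows: one row per image, one column per
    iteration. Columns from k onward have not been written yet.
    """
    for q_row, s_row in zip(queries, succs):
        remaining_iters = len(q_row) - k
        q_row[k:] = [0] * remaining_iters  # no further queries are issued
        s_row[k:] = [1] * remaining_iters  # the image stays misclassified
    return queries, succs


k = 3  # iteration at which the last image was misclassified
queries = [[1, 2, 1, None, None], [2, 1, 1, None, None]]
succs = [[0, 0, 1, None, None], [0, 1, 1, None, None]]
pad_after_early_stop(queries, succs, k)
print(queries)
print(succs)
```

Each row keeps its full length of max_iters, so logs from different batches can be stacked and averaged per iteration.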

machanic commented 4 years ago

@cg563 I am still confused. In my experiments, SimBA's query count is much higher than that of the Bandits attack.

  1. About the 2-D shape of queries = torch.zeros(batch_size, max_iters): do you mean that each iteration has its own query count? In other words, are the queries counted separately for each iteration?

  2. https://github.com/cg563/simple-blackbox-attack/blob/master/run_simba.py#L91 Why is the query on Line 91 not counted in queries_k? I think a query count is missing there.

  3. How can I bound the epsilon of an L_2 or L_inf norm attack? I modified the function trans to add the L_p bound as follows:

    def trans(self, z, image_size):
        if self.pixel_attack:
            perturbation =  z.cuda()
        else:
            perturbation =  block_idct(z, block_size=image_size).cuda()
        if self.norm == "l2":
            assert perturbation.dim() == 4
            norm_ = torch.clamp(torch.norm(perturbation),min=1e-12).item()
            factor = min(1, self.epsilon / norm_)
            perturbation = perturbation * factor
        elif self.norm == "linf":
            perturbation = torch.clamp(perturbation, -self.epsilon, self.epsilon)
        return perturbation

However, what does the epsilon on Line 109 mean? (https://github.com/cg563/simple-blackbox-attack/blob/master/run_simba.py#L109 ) Does that line conflict with my modification? And is my modified trans function correct?

  4. In my experiments, the query count on CIFAR-10 is much higher than that of the Bandits attack. Why? Is it because CIFAR-10 images have 32x32 resolution? My SimBA parameters are set as follows:

freq_dims: 11
stride: 7
epsilon: 4.6
pixel_attack: False
order: strided
num_iters: 10000

Are these parameters set correctly for 32x32x3 images?
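
One thing worth checking in the modified trans above: torch.norm(perturbation) with no dim argument returns a single norm over the whole 4-D tensor, so with a batch size above 1 all examples would share one scaling factor. A minimal framework-free sketch of a per-example projection follows. The helper names project_l2 and clamp_linf are hypothetical, and nested lists stand in for flattened perturbation tensors.

```python
import math


def project_l2(batch, epsilon):
    """Scale each example so that its own L2 norm is at most epsilon."""
    projected = []
    for example in batch:
        norm = max(math.sqrt(sum(v * v for v in example)), 1e-12)
        factor = min(1.0, epsilon / norm)
        projected.append([v * factor for v in example])
    return projected


def clamp_linf(batch, epsilon):
    """Clip every coordinate to the interval [-epsilon, epsilon]."""
    return [[max(-epsilon, min(epsilon, v)) for v in example] for example in batch]


batch = [[3.0, 4.0], [0.3, 0.4]]  # L2 norms of about 5.0 and 0.5
print(project_l2(batch, 1.0))     # first example is shrunk, second is untouched
print(clamp_linf(batch, 1.0))
```

Whether such a projection is appropriate for SimBA at all is a separate question.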

cg563 commented 4 years ago

  1. Yes, the queries vector stores the count for each iteration, so the total query count is the sum over all iterations (second dimension).

  2. The query in Line 91 should or should not be counted depending on the definition of a black-box query. If the model returns both the top class and target class's probability in one query, then you can technically get the information contained in Line 91 from a previous iteration. However, if the model only returns the target class's probability, then you have no way of knowing whether or not you have succeeded (and hence should early stop). In the code we're assuming the former case, and for simplicity of implementation chose not to fold that call into a previous iteration's model query in Lines 114 and 127.

  3. SimBA naturally should be thought of as an L_2 attack, since that's the norm that the attack can guarantee regardless of the orthonormal basis (pixel or DCT). There shouldn't be a need to bound the L_2 norm, which probably also partly answers question 4. We never tried bounding the L_inf norm -- this certainly wouldn't work for SimBA in pixel space since epsilon is the per-pixel change in this case. It is possible that restricting the L_inf norm will work for DCT basis though.

  4. The setting we used for CIFAR-10 is: freq_dims = 14, stride = 7, epsilon = 0.2, pixel_attack = False, order = strided, num_iters = 0. The number of iterations is set to unlimited since most of the time the attack will succeed very early on, so there's no point in restricting it.
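
Answers 1 and 2 can be illustrated with a small counting sketch. It assumes a per-iteration log like the queries tensor (one row per image). The bill_success_check flag models a stricter accounting in which the success check is charged as its own query in every iteration that issued queries. That convention is an assumption for illustration only, not something the repository implements, and the function name total_queries is hypothetical.

```python
def total_queries(row, bill_success_check=False):
    """Sum one image's per-iteration query counts into a single total."""
    total = sum(row)
    if bill_success_check:
        # Hypothetical stricter convention: one extra query per active iteration.
        total += sum(1 for count in row if count > 0)
    return total


queries = [[1, 2, 1, 0, 0], [2, 1, 1, 0, 0]]  # rows: images, columns: iterations
default_totals = [total_queries(row) for row in queries]
strict_totals = [total_queries(row, bill_success_check=True) for row in queries]
print(default_totals, strict_totals)
```

When comparing against another attack, the same convention should be applied to both methods.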

machanic commented 4 years ago

@cg563 About questions 3 and 4: in other papers, the epsilon of L_2 and L_inf norm attacks is set to 5.0 or 4.6. You wrote that SimBA does not need to bound the epsilon of the L_2 norm attack, but what does Line 109's epsilon mean? And what does epsilon = 0.2 mean in your answer to Q4? How can I compare against other L_2 norm attacks that are bounded with epsilon = 4.6?

cg563 commented 4 years ago

The epsilon = 0.2 set in the code refers to each pixel changing by at most +/-0.2. The perturbation L_2 norm is given by sqrt(number of pixels that changed) * epsilon.
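
A quick numeric check of that formula, as a sketch: if n coordinates of an orthonormal basis each change by +/-epsilon, the L_2 norm is sqrt(n) * epsilon. The function name l2_norm_after and the toy vector are illustrative assumptions.

```python
import math


def l2_norm_after(num_changed, epsilon):
    """L2 norm of a perturbation with num_changed coordinates of size epsilon."""
    return math.sqrt(num_changed) * epsilon


epsilon = 0.2
# Toy perturbation: four coordinates changed by +/-epsilon, two untouched.
perturbation = [epsilon, -epsilon, 0.0, epsilon, 0.0, -epsilon]
direct = math.sqrt(sum(v * v for v in perturbation))
print(direct, l2_norm_after(4, epsilon))

# With epsilon = 0.2, 529 changed coordinates give a norm of about 4.6.
print(l2_norm_after(529, epsilon))
```

So a fixed L_2 budget corresponds to a cap on how many coordinates the attack is allowed to change.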

One way to enforce a strict L_2 norm constraint is to run the attack for as long as possible, and then discard any adversarial image as a failed run if its L_2 norm is too high. This should be rare, since the average L_2 norm is only 3.06 for untargeted SimBA-DCT (Table 1 in the paper).
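
The post-hoc filtering suggested above could look like the following sketch. The data layout (a list of (succeeded, l2_norm) pairs) and the function name success_rate_under_budget are assumptions for illustration, and the numbers are made up.

```python
def success_rate_under_budget(runs, l2_budget):
    """Fraction of runs that succeeded with an L2 norm within the budget.

    runs is a list of (succeeded, l2_norm) pairs, one per attacked image.
    A successful run whose norm exceeds the budget is counted as a failure.
    """
    if not runs:
        return 0.0
    kept = sum(1 for succeeded, l2 in runs if succeeded and l2 <= l2_budget)
    return kept / len(runs)


runs = [(True, 3.1), (True, 5.2), (False, 6.0), (True, 4.6)]  # made-up values
rate = success_rate_under_budget(runs, l2_budget=4.6)
print(rate)
```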

machanic commented 4 years ago

@cg563 Thank you! Do you mean I don't need to modify the code of your trans function as in my Q3? And I don't need to modify any other part of your code; the only thing I can do is run the attack for as long as possible to get a lower L_2 norm bound?

I will read Table 1 of your paper again to figure out how to bound the L_2 norm epsilon and compare with other methods.

cg563 commented 4 years ago

Right, the attack tries to keep the L_2 perturbation as small as possible. You can consider a run as failed if it doesn't get below the perturbation norm you expected. The paragraph under "Budget considerations" in section 3 also explains how the implicit bound on the L_2 norm is derived.

machanic commented 4 years ago

@cg563 Do you mean that SimBA does not support L_inf norm attacks? Does that mean I cannot compare it with other L_inf norm attack methods in my paper, and can only compare SimBA with other state-of-the-art L_2 norm attacks?

cg563 commented 4 years ago

SimBA is not designed as an optimal L_inf norm attack. However, it is still possible to restrict the perturbation L_inf norm manually and the attack still succeeds very often.

I added an option in the code to restrict the L_inf norm. You can run it with the following arguments:

python run_simba.py --data_root <imagenet_root> --num_iters 10000 --freq_dims 224 --linf_bound 0.05

This will run the attack in DCT space with full spectrum at L_inf norm of 0.05. Please keep in mind that the L_inf norm version should only be run in DCT mode (i.e. without the --pixel_attack flag).

machanic commented 4 years ago

@cg563 I tried your newest code, but after the iterations finish, the per-pixel perturbation in the generated adversarial image is 0.2, which is larger than linf_bound = 0.05. So, should I change --epsilon 0.2 to --epsilon 0.05 (the same value as linf_bound) for the L_inf norm attack?

machanic commented 4 years ago

@cg563 I use x, the first variable returned on Line 158, as the perturbation. I noticed that x is not transformed by the self.trans(...) method, which means block_idct is never called to produce x. As a result, x exceeds linf_bound. (https://github.com/cg563/simple-blackbox-attack/blob/master/run_simba.py#L158) How can I fix this? Is x the final adversarial perturbation?

cg563 commented 4 years ago

Sorry, this is a mistake on my part. The returned variable x is the perturbation only for the pixel space attack. The actual adversarial image is the variable named expanded at https://github.com/cg563/simple-blackbox-attack/blob/master/run_simba.py#L151.

To make it easier to load the adversarial images, I changed the save format to directly output the perturbed images in both cases. Let me know if this resolves the problem.
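
To verify the bound, the perturbation should be measured in pixel space, as the difference between the saved adversarial image and the original, rather than on the untransformed x. A minimal sketch with made-up pixel values follows. The function name linf_distance is hypothetical and flat lists stand in for image tensors.

```python
def linf_distance(original, adversarial):
    """Largest absolute per-pixel difference between two flattened images."""
    return max(abs(a - o) for o, a in zip(original, adversarial))


original = [0.50, 0.20, 0.90]     # made-up pixel values in [0, 1]
adversarial = [0.54, 0.17, 0.90]  # the saved, already perturbed image
linf_bound = 0.05
within_bound = linf_distance(original, adversarial) <= linf_bound
print(within_bound)
```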