bethgelab / foolbox

A Python toolbox to create adversarial examples that fool neural networks in PyTorch, TensorFlow, and JAX
https://foolbox.jonasrauber.de
MIT License

Boundary Attack Initialization Failure #541

Closed · pratyushmaini closed this issue 2 years ago

pratyushmaini commented 4 years ago

I am using the Boundary Attack and get the following error, which indicates an initialization issue:

File ".../python3.6/site-packages/foolbox/attacks/blended_noise.py:83: UserWarning: LinearSearchBlendedUniformNoiseAttack failed to draw sufficient random inputs that are adversarial (947 / 1000). f"{self.class.name} failed to draw sufficient random" Traceback (most recent call last): advs, clipped, is_adv = attack(fmodel, images, criterion, epsilons=epsilons_dict[types]) File ".../python3.6/site-packages/foolbox/attacks/base.py", line 410, in call xp = self.run(model, x, criterion, early_stop=early_stop, **kwargs) File ".../python3.6/site-packages/foolbox/attacks/boundary_attack.py", line 129, in run f"init_attack failed for {failed} of {len(is_adv)} inputs" ValueError: init_attack failed for 46.0 of 1000 inputs

Is there a way to circumvent the error, and only get adversarial images for the correctly initialized images?

averyma commented 4 years ago

Similar issue.

zjysteven commented 4 years ago

Same here. I actually encountered this when trying to attack an adversarially trained model. Any suggestions? Thanks! @jonasrauber @wielandbrendel

shoaibahmed commented 4 years ago

Decision-based attacks start their search from points that are already adversarial, and by default they mine these initial points with a separate attack. BoundaryAttack takes an init_attack argument that specifies which attack to use for this initialization (by default, LinearSearchBlendedUniformNoiseAttack). In your case this method was unable to find adversarial starting points for all of the inputs, which is quite likely when there is only a small number of classes. A simple workaround is to pick data points from the dataset that the model predicts as a different class and use them as starting points for the attack.

Just as an example, you can do something very simple like this:

import torch
from foolbox.attacks import BoundaryAttack

# For each class, cache one sample that the model does NOT predict as that class
# (clean_pred holds the model's predictions on the clean dataset X); such a sample
# is already misclassified w.r.t. that class and can serve as a starting point.
cls_samples = {}
for cls in range(num_labels):
    idx = clean_pred != cls
    cls_samples[cls] = X[idx][0]  # just pick the first such example

attack_fn = BoundaryAttack()

for current_X, current_Y in batches:
    # Build one starting point per input, matched to its label.
    starting_points = [cls_samples[int(y)] for y in current_Y]
    starting_points = torch.stack(starting_points, dim=0).to(device)
    advs, _, success = attack_fn(
        fmodel, current_X, current_Y,
        starting_points=starting_points, epsilons=epsilon_list,
    )
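
Alternatively, since the initialization is controlled by the init_attack argument, you can keep automatic initialization but make its search more thorough. A minimal sketch, assuming the foolbox 3.x API and the same fmodel / current_X / current_Y / epsilon_list as above; the directions and steps values are only illustrative:

import foolbox as fb

# Draw more random directions and use more blending steps than the defaults,
# so the initialization is less likely to fail.
init = fb.attacks.LinearSearchBlendedUniformNoiseAttack(directions=5000, steps=1000)
attack_fn = fb.attacks.BoundaryAttack(init_attack=init)

advs, _, success = attack_fn(fmodel, current_X, current_Y, epsilons=epsilon_list)
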
zdhNarsil commented 4 years ago

Same problem. It seems that LinearSearchBlendedUniformNoiseAttack is the culprit. @wielandbrendel could you look into this soon, please?

zdhNarsil commented 4 years ago

> A simple workaround is to pick data points from the dataset that the model predicts as a different class and use them as starting points for the attack. [...]

@shoaibahmed This seems reasonable, but I tried your method and still get the same error. It's really weird!

danushv07 commented 3 years ago

I encountered this error as well. As the message says, the init_attack (here LinearSearchBlendedUniformNoiseAttack) failed to generate adversarial samples for part of the dataset, and the Boundary Attack requires an adversarial starting point for every input. One way I bypassed the error was to first attack the model with LinearSearchBlendedUniformNoiseAttack on its own and make sure all of the generated samples really are adversarial; if they are not, adjust its hyperparameters (distance, directions, steps) until they are. Then pass these samples as the starting_points of the Boundary Attack. Note that increasing these hyperparameters also increases the runtime of the overall attack.
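
A rough sketch of that two-stage approach, assuming the foolbox 3.x API, a model already wrapped as fmodel, and PyTorch tensors images / labels (all names are illustrative and the directions / steps values are untuned):

import foolbox as fb

# Stage 1: run the init attack on its own and keep its outputs.
init_attack = fb.attacks.LinearSearchBlendedUniformNoiseAttack(directions=5000, steps=1000)
candidates = init_attack.run(fmodel, images, labels)

# Make sure every candidate really is misclassified before proceeding;
# if not, increase directions / steps and try again.
is_adv = fmodel(candidates).argmax(-1) != labels
assert is_adv.all(), "init attack still failed for some inputs"

# Stage 2: run the Boundary Attack from these verified starting points.
boundary = fb.attacks.BoundaryAttack()
advs, clipped, success = boundary(fmodel, images, labels, starting_points=candidates, epsilons=None)
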

fotinidelig commented 2 years ago

Simple solution: I implemented a simple fix that temporarily removes the points for which init_attack failed and runs the attack on the rest. It then returns the attack result for those points, and the original, unperturbed input for the failed ones.

Considering that you should always check whether an attack succeeded as a final step anyway, this seemed reasonable to me.

You can check it out here.

I would use either @shoaibahmed's solution or something like the above. Maybe he can tell us whether this would be a good idea for a pull request?
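
For anyone who wants the same behaviour without patching foolbox, here is a rough caller-side sketch of the idea (this is not the linked fix; it assumes the foolbox 3.x API, a wrapped fmodel, and PyTorch tensors images / labels):

import foolbox as fb

# Run the init attack first and record which inputs it managed to initialize.
init_attack = fb.attacks.LinearSearchBlendedUniformNoiseAttack()
candidates = init_attack.run(fmodel, images, labels)
init_ok = fmodel(candidates).argmax(-1) != labels

# Failed points simply fall back to the original, unperturbed inputs.
advs = images.clone()
if init_ok.any():
    boundary = fb.attacks.BoundaryAttack()
    advs[init_ok] = boundary.run(
        fmodel,
        images[init_ok],
        fb.criteria.Misclassification(labels[init_ok]),
        starting_points=candidates[init_ok],
    )

# As noted above, always verify success as the last step.
success = fmodel(advs).argmax(-1) != labels
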