Using eval_kit and meaning of 'clean' attack

openphilanthropy / unrestricted-adversarial-examples

Contest Proposal and infrastructure for the Unrestricted Adversarial Examples Challenge

Apache License 2.0

Using eval_kit and meaning of 'clean' attack #44

Closed sibyjackgrove closed 5 years ago

sibyjackgrove commented 5 years ago

I followed the instructions for the warm-up attacks and used the API as shown below.

import numpy as np
from unrestricted_advex import eval_kit

# model_regular is my trained model, defined earlier; it exposes .predict()
def my_custom_CNN_model(images_batch_nhwc):
    """This model is a valid defense based on a custom CNN."""
    predictions = model_regular.predict(images_batch_nhwc)
    return predictions.astype(np.float32)

# Evaluate the model (this will take ~10 hours on a GPU)
eval_kit.evaluate_bird_or_bicycle_model(my_custom_CNN_model)
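Since the full evaluation runs for hours, it can help to confirm first that the model function returns what the evaluation expects. The sketch below is not part of eval_kit: check_model_fn and constant_model are hypothetical names, and the 224x224 input size, the [0, 1] pixel range and the two-column float32 logits are assumptions based on the bird-or-bicycle task that should be checked against the repository README.

```python
import numpy as np


def check_model_fn(model_fn, batch_size=4, image_size=224, num_classes=2):
    """Feed a random NHWC batch through model_fn and validate the output.

    Hypothetical helper, not part of eval_kit. The input size and the
    number of classes are assumptions about the bird-or-bicycle task.
    """
    rng = np.random.default_rng(0)
    # Assumption: inputs are float32 images in [0, 1], shape (N, H, W, 3).
    images = rng.random((batch_size, image_size, image_size, 3), dtype=np.float32)
    logits = np.asarray(model_fn(images))

    expected_shape = (batch_size, num_classes)
    if logits.shape != expected_shape:
        raise ValueError(f"expected shape {expected_shape}, got {logits.shape}")
    if logits.dtype != np.float32:
        raise ValueError(f"expected float32 output, got {logits.dtype}")
    if not np.all(np.isfinite(logits)):
        raise ValueError("output contains NaN or infinite values")
    return logits


def constant_model(images_batch_nhwc):
    """Stand-in defense that always returns the same logits for every image."""
    batch_size = len(images_batch_nhwc)
    return np.array([[-5.0, 5.0]] * batch_size).astype(np.float32)


if __name__ == "__main__":
    print(check_model_fn(constant_model).shape)
```

Calling check_model_fn(my_custom_CNN_model) before the long run would surface a wrong output shape or dtype within seconds instead of hours.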

Here is part of my output:

However, I am confused by the output. What does the 'clean' attack mean? Is it the accuracy on the normal dataset (without adversarial inputs)? Is there another method to evaluate the model on the adversarial images?

carlini commented 5 years ago

Yes, clean accuracy is the accuracy on the un-modified test set. The results of the other attacks will show below the clean results, in the same table.
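To make the definition concrete, here is a minimal sketch of how clean accuracy could be computed by hand. It is not eval_kit's implementation: clean_accuracy and brightness_model are hypothetical names, and the sketch assumes the model function returns one row of logits per image and that labels are integer class indices.

```python
import numpy as np


def clean_accuracy(model_fn, images_nhwc, labels, batch_size=32):
    """Fraction of unmodified images whose argmax prediction matches the label."""
    labels = np.asarray(labels)
    correct = 0
    for start in range(0, len(images_nhwc), batch_size):
        batch = images_nhwc[start:start + batch_size]
        logits = np.asarray(model_fn(batch))
        predictions = np.argmax(logits, axis=1)
        correct += int(np.sum(predictions == labels[start:start + batch_size]))
    return correct / len(images_nhwc)


def brightness_model(images_batch_nhwc):
    """Toy stand-in model: predicts class 1 for bright images, class 0 otherwise."""
    mean_pixel = images_batch_nhwc.mean(axis=(1, 2, 3))
    logits = np.stack([0.5 - mean_pixel, mean_pixel - 0.5], axis=1)
    return logits.astype(np.float32)


if __name__ == "__main__":
    dark = np.zeros((3, 8, 8, 3), dtype=np.float32)
    bright = np.ones((3, 8, 8, 3), dtype=np.float32)
    images = np.concatenate([dark, bright])
    labels = np.array([0, 0, 0, 1, 1, 1])
    print(clean_accuracy(brightness_model, images, labels))
```

No attack is applied anywhere in this loop, which is why the 'clean' row serves as the baseline the other attack rows are compared against.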

sibyjackgrove commented 5 years ago

Yes, clean accuracy is the accuracy on the un-modified test set. The results of the other attacks will show below the clean results, in the same table.

That doesn't seem to be the case when I use it. I see accuracy for only 'clean'. Please see the screenshot below:

nottombrown commented 5 years ago

Sorry, this was my mistake. I pushed a change that disabled the full suite of attacks. If you update to HEAD and re-run your command, you should see the complete grid rather than only the clean attack.

sibyjackgrove commented 5 years ago

Thanks, I updated and am re-running eval_kit.evaluate_bird_or_bicycle_model(my_custom_CNN_model). Here is what I am getting. The spatial_grid attack seems to be very slow, taking 128 s/iteration. Is this normal, or is it a problem with my model? The model I am using has 1.1M parameters.

sibyjackgrove commented 5 years ago

Now it is taking 372 s/iteration. I am using Google Colab with the GPU accelerator. Please let me know if this is to be expected.
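One way to tell whether the model or the attack dominates the runtime is to time the model function on its own. The sketch below is an illustration only: seconds_per_batch, estimated_iteration_seconds and stub_model are hypothetical names, the 224x224 input size is an assumption, and the idea that the attack calls the model once per transformed copy of a batch is a guess based on the name spatial_grid, not something confirmed in this thread.

```python
import time

import numpy as np


def seconds_per_batch(model_fn, batch_size=32, image_size=224, repeats=3):
    """Average wall-clock time of one forward call on a dummy NHWC batch."""
    images = np.zeros((batch_size, image_size, image_size, 3), dtype=np.float32)
    model_fn(images)  # warm-up call so one-time setup cost is not measured
    start = time.perf_counter()
    for _ in range(repeats):
        model_fn(images)
    return (time.perf_counter() - start) / repeats


def estimated_iteration_seconds(per_batch_seconds, transforms_per_batch):
    """Rough cost of one attack iteration if the model runs once per transform.

    transforms_per_batch is an illustrative parameter; the real grid size
    has to be read from the attack's configuration.
    """
    return per_batch_seconds * transforms_per_batch


def stub_model(images_batch_nhwc):
    """Stand-in for the real model; returns constant logits."""
    return np.zeros((len(images_batch_nhwc), 2), dtype=np.float32)


if __name__ == "__main__":
    per_batch = seconds_per_batch(stub_model)
    print(estimated_iteration_seconds(per_batch, transforms_per_batch=100))
```

If a single forward call is fast but the estimate is still in the hundreds of seconds, the slowdown would come from the number of model calls per iteration rather than from the model itself.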

carlini commented 5 years ago

Yeah, unfortunately we need to improve that, see #46

nottombrown commented 5 years ago

Closing this issue because it is a known problem. Perhaps we should put up a larger warning that the warm-up evaluation is very slow?

sibyjackgrove commented 5 years ago

Yeah, unfortunately we need to improve that, see #46

Is there any update on this? How can I use the attack referenced in #46?

nottombrown commented 5 years ago

@sibyjackgrove - How about we discuss this in issue #46?