huanzhang12 / CROWN-IBP

Certified defense to adversarial examples using CROWN and IBP. Also includes GPU implementation of CROWN verification algorithm (in PyTorch).
https://openreview.net/pdf?id=Skxuk1rFwB
BSD 2-Clause "Simplified" License

Results Interpretation #1

Open EmanueleLM opened 4 years ago

EmanueleLM commented 4 years ago

Hi,

When I run the CROWN verified error evaluation with, say, \epsilon = 0.3, what is the exact meaning of the terms in the output, like Loss, CE loss, etc.? I'd like to estimate or deduce a lower bound with just CROWN on some architectures + data; is that possible with this code?

P.S. The architecture has not been trained with CROWN-IBP; it is just naturally trained (or, at most, adversarially trained).

In other words, given an epsilon radius, I'd like to know whether that n-ball is safe using just CROWN. Is that possible with your code, on a naturally trained architecture that I've built myself?

Thank you, Best.

huanzhang12 commented 4 years ago

Sorry for the late reply. Yes, you can use the code to evaluate your model, no matter how it was trained.

When you run the code example for computing the CROWN verified error, Loss and CE loss are not useful; they are only there to monitor the training process. The metrics that do matter are Err (clean error) and Rob Err (verified error). You should read the numbers in parentheses (they are the mean over the epoch, rather than over a single batch).

It is easy to dump CROWN bounds for some architecture + data, on any network, not necessarily one trained using CROWN-IBP. There is some commented-out code in train.py which shows how to call the bound API to obtain CROWN bounds: https://github.com/huanzhang12/CROWN-IBP/blob/master/train.py#L165

That commented-out code prints the lower and upper bounds for all examples in a batch. You can check whether the lower bound is less than 0 to determine if an example is guaranteed to be safe or not, just as I did when computing the verified error here; a sketch of that check is given below.
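
To make the check concrete, here is a minimal sketch (not the repo's exact code) of how a batch of CROWN margin lower bounds can be turned into a per-example safety certificate and a verified error. `margin_lb` and `verified_stats` are hypothetical names, and the bounds themselves would come from the bound API call shown in the commented-out code mentioned above:

```python
import torch

def verified_stats(margin_lb: torch.Tensor):
    # margin_lb: (batch_size, num_classes - 1) CROWN lower bounds on the margin
    # f_true(x') - f_other(x') over every perturbed input x' in the epsilon-ball.
    # An example is certified only if all of its margin lower bounds are > 0;
    # if any lower bound is < 0, verification fails for that example.
    certified = (margin_lb > 0).all(dim=1)            # per-example certificate
    verified_err = 1.0 - certified.float().mean()     # fraction not certified
    return certified, verified_err.item()

# Example with dummy bounds for a batch of 3 examples (9 margins each):
# certified, err = verified_stats(torch.randn(3, 9))
```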

In order to make the code read your model and data, you can follow the instructions on how to train your own model, except for the last step, where you run eval.py instead of train.py. Don't forget to add the necessary command line arguments, such as "eval_params:method_params:bound_type=crown-full", to enable full CROWN bounds (see the instructions here).

Let me know if there is anything unclear or if you have any further questions.

EmanueleLM commented 4 years ago

No worries, thank you for the detailed reply; I'll try it in the next few days.

carinaczhang commented 4 years ago

> You can check if the lower bound is less than 0 to determine if an example is guaranteed to be safe or not, just like what I did for computing the verified error here.

I am just wondering why you only need to check whether the lower bound is < 0 to guarantee safety. I thought we needed to check whether the perturbation is within the boundary, but I might just be confused about the definition of verification.

carinaczhang commented 4 years ago

How could one print out model parameters for each layer?

EmanueleLM commented 4 years ago

I don't remember whether the supported models are Keras implementations, but if they are, this is enough:

[print(l.weights) for l in model.layers]
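
Since this repo's models are PyTorch nn.Module objects rather than Keras models, here is a minimal sketch of the equivalent (the function name is just for illustration, and `model` is assumed to be the loaded network):

```python
import torch.nn as nn

def print_layer_parameters(model: nn.Module):
    # Iterate over every registered parameter; the name encodes which layer it belongs to.
    for name, param in model.named_parameters():
        print(name, tuple(param.shape))
        print(param.data)

# Usage (assuming `model` is the PyTorch network you loaded for evaluation):
# print_layer_parameters(model)
```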