ryderling / DEEPSEC

DEEPSEC: A Uniform Platform for Security Analysis of Deep Learning Model
MIT License

Discrepancies between tables, text, and code #12

Open carlini opened 5 years ago

carlini commented 5 years ago

Table XIII states that on CIFAR-10 the R+FGSM attack was executed with eps=0.05 and alpha=0.05 whereas the README in the Attack module of the open source code gives eps=0.1 and alpha=0.5. I assume the code is correct and the table is wrong. Table XIII states that the “box” constraint for CWL2 is set to -0.5, 0.5 but in the code the (correct) values of 0.0, 1.0 are used.

Other hyperparameters are completely missing (e.g., Table XIII does not give the number of iterations used for any of the gradient-based attacks). This is especially confusing when the default values differ from the original attack implementations; for example, this code sets the number of binary search steps for CW2 to 5 (and does not state this in the paper) whereas the original code uses the value 10; fortunately, this setting often has only a minimal impact on accuracy.

ryderling commented 5 years ago

Because all of the code was restructured for readability and consistency after we finished the paper, some discrepancies do exist, but most of them do not affect the final results. For example, for epsilon and alpha, both the code and Table XIII are correct: the two notations differ but give the same result. In the paper, 'eps' = 'alpha' = 0.05 are absolute distortion values, whereas in the code eps=0.1 is the total distortion budget and alpha=0.5 is the fraction of that budget spent on the random step, so the absolute size of the random step is 0.1 * 0.5 = 0.05. As for the CWL2 box constraint, we made it the same as in the other attacks (FGSM, BIM, etc.) for consistency.

In general, we reported the hyperparameters that have a large impact on the attack or defense rather than every parameter. As for the number of binary search steps, we suggest you carefully check the default value in CW2_Generation.py (https://github.com/kleincup/DEEPSEC/blob/master/Attacks/CW2_Generation.py#L106). It is 10 binary search steps, not 5 as you mentioned.

carlini commented 5 years ago

If you read the original paper that proposes R+FGSM it defines alpha as the initial step size that's taken randomly, and then (epsilon-alpha) as the gradient step size. So clearly according to this definition the table numbers are incorrect: otherwise the gradient step taken would be wrong. To make consistent notation I would suggest you change this.

I don't understand what you mean by the box constraint is [-0.5, 0.5] on CW2. Setting this to the box constraint would clip actual images that you have which are in [0,1] to a maximum value of solid grey and allow the attack to introduce values of -0.5 which isn't within the typical data.

Again, this is a minor issue compared to the others.

ryderling commented 5 years ago

The Table XIII numbers are correct. Let me explain more clearly. For CIFAR-10, the total distortion budget is epsilon = 0.1. With a ratio of alpha to epsilon of 0.5, the absolute size of the random step is 0.1 * 0.5 = 0.05, and the remaining, non-random part of the budget is 0.1 - 0.05 = 0.05. Table XIII is therefore correct in stating that on CIFAR-10 the R+FGSM attack was executed with eps=0.05 and alpha=0.05. In the code, 'alpha' is actually the ratio of alpha to the budget epsilon, as noted in the code comments. For more details, see: https://github.com/kleincup/DEEPSEC/blob/master/Attacks/RFGSM_Generation.py#L92 https://github.com/kleincup/DEEPSEC/blob/master/Attacks/AttackMethods/RFGSM.py#L48
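The arithmetic in this reply can be written out as a small conversion between the two conventions under discussion. This is only an illustrative sketch: the function name `ratio_to_absolute` and its parameter names are hypothetical and are not taken from the DEEPSEC code.

```python
def ratio_to_absolute(eps_total, alpha_ratio):
    """Convert (total budget, random-step ratio) into absolute step sizes.

    eps_total   -- total distortion budget (eps=0.1 in the code's README)
    alpha_ratio -- fraction of the budget spent on the random step (0.5)
    """
    random_step = eps_total * alpha_ratio    # absolute size of the random step
    gradient_step = eps_total - random_step  # what is left for the gradient step
    return random_step, gradient_step


# CIFAR-10 values quoted in this thread.
random_step, gradient_step = ratio_to_absolute(eps_total=0.1, alpha_ratio=0.5)
```

With these inputs both returned values are 0.05, which is the pair of numbers printed in Table XIII. The disagreement in the thread is about which of these quantities the labels "eps" and "alpha" should refer to.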

As for the box constraints, the paper reports [-0.5, 0.5] because we normalized images into [-0.5, 0.5] when we ran the experiments and wrote the paper. After restructuring the code, we normalize images into [0, 1], so the box constraint became [0, 1].
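The point that the box has to match the image normalization can be illustrated with the tanh change of variables that the CW L2 attack uses to keep pixels inside the box. The sketch below is not the DEEPSEC implementation; `to_box` and `shift_range` are hypothetical helper names, and scalar pixels are used for brevity.

```python
import math


def to_box(w, box_min, box_max):
    """Map an unconstrained value w into [box_min, box_max] via tanh."""
    half_range = (box_max - box_min) / 2.0
    center = (box_max + box_min) / 2.0
    return math.tanh(w) * half_range + center


def shift_range(pixels, old_min, new_min):
    """Shift pixels between two normalizations of equal width."""
    return [p - old_min + new_min for p in pixels]


# The same unconstrained value lands on mid-grey in either convention.
grey_01 = to_box(0.0, 0.0, 1.0)        # box [0, 1]
grey_centered = to_box(0.0, -0.5, 0.5)  # box [-0.5, 0.5]

# Images normalized to [0, 1], re-expressed in the [-0.5, 0.5] convention.
centered = shift_range([0.0, 0.5, 1.0], old_min=0.0, new_min=-0.5)
```

A box of [-0.5, 0.5] is only appropriate when the images themselves are normalized to that range. Applied to images in [0, 1], it would clip at mid-grey, which is the concern raised earlier in the thread.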

carlini commented 5 years ago

Because the DeepSec paper doesn't give a definition of R+FGSM and cites Tramer et al., the only way to interpret what epsilon and alpha mean is by referring to their original paper. Equation 7 in Ensemble Adversarial Training defines

x^{adv} = x' + (\epsilon - \alpha) \cdot sign(\nabla_{x'} J(x', y_{true}))

According to this definition, \epsilon should be equal to 0.1, and \alpha should be equal to 0.05, assuming you would like to first step 0.05 randomly and then step 0.05 again.
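A minimal sketch of the single-step attack in this notation follows, with epsilon as the total budget and alpha as the absolute size of the random step. It assumes a toy linear loss in place of a real model, plain Python lists in place of tensors, and clipping to [0, 1] after each step. The names `rfgsm` and `grad_fn` are hypothetical and do not come from the DEEPSEC code or the original implementation.

```python
import random


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def clip(v, lo, hi):
    return max(lo, min(hi, v))


def rfgsm(x, grad_fn, eps, alpha, rng, lo=0.0, hi=1.0):
    """Single-step R+FGSM with eps = total budget, alpha = random step size."""
    # Random step: x' = x + alpha * sign(N(0, I))
    x_prime = [clip(xi + alpha * sign(rng.gauss(0.0, 1.0)), lo, hi) for xi in x]
    # Gradient step: x_adv = x' + (eps - alpha) * sign(grad J(x', y_true))
    grad = grad_fn(x_prime)
    return [clip(xp + (eps - alpha) * sign(g), lo, hi)
            for xp, g in zip(x_prime, grad)]


# Toy stand-in for a model: J(x) = sum(w_i * x_i), so the gradient is w.
w = [0.3, -1.2, 0.7, -0.1]


def grad_fn(x_prime):
    return w


x = [0.5, 0.5, 0.5, 0.5]
x_adv = rfgsm(x, grad_fn, eps=0.1, alpha=0.05, rng=random.Random(0))
max_change = max(abs(a - b) for a, b in zip(x_adv, x))
```

With eps=0.1 and alpha=0.05, each coordinate moves 0.05 in a random direction and then 0.05 along the gradient sign, so the total change per coordinate never exceeds 0.1.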

ftramer commented 5 years ago

I agree with Nicholas that this discrepancy in notation for R+FGSM is very confusing. Here's my take:

My suggestion for fixing this would be the following:

ryderling commented 5 years ago

Thanks for your suggestion. I will rename the parameter 'alpha' to 'alpha_ratio'.

ryderling commented 5 years ago

I have renamed the parameter 'alpha' to 'alpha_ratio' in https://github.com/kleincup/DEEPSEC/commit/2c67afac0ae966767b6712a51db85f04f4f5c565.