eth-sri / eran

ETH Robustness Analyzer for Deep Neural Networks
Apache License 2.0

Reproduce the refinezono result #77

Closed jiahaubai closed 3 years ago

jiahaubai commented 3 years ago

Hi,

I want to reproduce the RefineZono result from the paper Boosting Robustness Certification of Neural Networks. Specifically, I want the result in the yellow block:

[image: results table from the paper]

and I use the command python3 . --netname ../net/mnist_relu_6_100.tf --epsilon 0.02 --domain refinezono --dataset mnist

After executing it, the result is

...
img 99 Verified 9
progress: 100/None, correct:  99/100, verified: 31/99, unsafe: 0/99,  time: 4.040; 4.671; 462.398
analysis precision  31 /  99

But the paper reports 67% on the 6x100 model. Is there a problem with my command, or any ideas on how to fix it? Thanks!

mnmueller commented 3 years ago

Hello @jiahaubai,

Thank you for your interest in ERAN.

First, I want to recommend comparing to our much more recent results using PRIMA, which is not only faster than RefineZono but also much more accurate (see Figure 10 in the paper linked above) and is the current state-of-the-art on the 6X100 network (called 5x100 in that paper due to its number of hidden layers).

In any case, your command is indeed missing the neuron-wise bound refinement described in the paper (which also explains the much lower average runtime of just 4.6s instead of 194s). To activate this refinement, please run ERAN as follows:

python3 . --netname ../nets/mnist/mnist_relu_6_100.tf --dataset mnist --domain refinezono --epsilon 0.02 --num_test 100 --k 1 --refine_neurons  --n_milp_refine 2  --timeout_milp 10 --timeout_lp 10

Here "refine_neurons" generally activates the LP refinement of neuron-wise bounds and "n_milp_refine" determines the number of layers for which to use MILP instead of LP. Please note, that the exact results you will obtain depend on the available hardware, as a faster machine will be able to obtain tighter neuron-wise bounds before the corresponding timeouts are reached.

For the above settings, on my machine, I get the following result:

progress: 100/100, correct:  99/100, verified: 70/99, unsafe: 0/99,  time: 120.571; 293.167; 29023.496
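
As a rough illustration of what the LP-based neuron-wise bound refinement computes, here is a minimal toy sketch (this is not ERAN's implementation; it assumes numpy and scipy are installed). It tightens the lower bound on one output neuron of a small fully connected net by encoding the ReLU layer with the standard triangle relaxation and solving an LP, instead of propagating plain intervals:

import numpy as np
from scipy.optimize import linprog

# Toy 2-layer net: h = W1 @ x + b1, a = relu(h), y = W2 @ a + b2.
rng = np.random.default_rng(0)
n0, n1 = 4, 3
W1, b1 = rng.normal(size=(n1, n0)), rng.normal(size=n1)
W2, b2 = rng.normal(size=(1, n1)), rng.normal(size=1)
lx, ux = -np.ones(n0), np.ones(n0)                  # box bounds on the input x

# Interval bounds on the pre-activations h (used to build the ReLU relaxation).
mid, rad = (lx + ux) / 2, (ux - lx) / 2
lh = W1 @ mid + b1 - np.abs(W1) @ rad
uh = W1 @ mid + b1 + np.abs(W1) @ rad

# LP variables z = [x (n0 vars), h (n1 vars), a (n1 vars)].
nv = n0 + 2 * n1
A_eq = np.zeros((n1, nv))
A_eq[:, :n0] = W1
A_eq[:, n0:n0 + n1] = -np.eye(n1)                   # encodes W1 @ x - h = -b1
b_eq = -b1

A_ub, b_ub = [], []
for i in range(n1):
    hi, ai = n0 + i, n0 + n1 + i
    if uh[i] <= 0:                                  # stably inactive: a_i = 0 via its bounds
        continue
    row = np.zeros(nv); row[hi], row[ai] = 1, -1    # h_i - a_i <= 0   (a_i >= h_i)
    A_ub.append(row); b_ub.append(0.0)
    if lh[i] >= 0:                                  # stably active: also a_i <= h_i, so a_i = h_i
        row = np.zeros(nv); row[ai], row[hi] = 1, -1
        A_ub.append(row); b_ub.append(0.0)
    else:                                           # crossing: upper line of the triangle relaxation
        s = uh[i] / (uh[i] - lh[i])
        row = np.zeros(nv); row[ai], row[hi] = 1, -s
        A_ub.append(row); b_ub.append(-s * lh[i])

bounds = ([(lx[i], ux[i]) for i in range(n0)]
          + [(lh[i], uh[i]) for i in range(n1)]
          + [(0.0, max(uh[i], 0.0)) for i in range(n1)])

c = np.zeros(nv); c[n0 + n1:] = W2[0]               # minimize W2 @ a (b2 is a constant offset)
res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
print("refined lower bound on y[0]:", res.fun + b2[0])

Maximizing instead of minimizing (negating c) gives the refined upper bound. In ERAN, analogous problems are solved for intermediate neurons subject to timeout_lp/timeout_milp, and the layers selected via n_milp_refine are encoded with exact MILP constraints instead of the LP relaxation.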

Cheers, Mark

jiahaubai commented 3 years ago

Hi Mark,

Thank you for your reply ! After running the command you provide, I get the result

progress: 100/100, correct: 99/100, verified: 66/99, unsafe: 0/99, time: 182.996; 389.625; 38572.918
analysis precision 66 / 99

[Q1] I guess it may be due to hardware differences, so my result is not as good as yours. Sorry, I am still confused about the options "k" and "n_milp_refine": if I make them larger, will the final precision improve? (It looks like "--timeout_milp" is one of the factors that influences the final precision; do "k" and "n_milp_refine" influence it too?)

Could you give me an example on the network below, please? If I choose --k 1 --refine_neurons --n_milp_refine 2, how do they operate on it?

[Q2] I have another problem when executing the command with my own model (which has a MaxPool layer; the architecture is the same as in the Q1 example), and I get this error. Any ideas on how to fix it? Thanks!

I run the command python3 . --netname ../net/MNIST_small.onnx --dataset mnist --domain refinezono --epsilon 0.02 --num_test 100 --k 1 --refine_neurons --timeout_milp 10 --timeout_lp 10 --mean 0.5 0.5 0.5 --std 0.5 0.5 0.5

and the error is

File "./deepzono_nodes.py", line 827, in transformer element = pool_zono(man, True, element, (c_size_t * 3)(h,w,1), (c_size_t * 3)(H, W, C), 0, (c_size_t * 2)(self.stride[0], self.stride[1]), 3, offset+old_length, self.pad_top, self.pad_left, self.pad_bottom, self.pad_right, self.output_shape, self.is_maxpool) TypeError: pool_zono() takes 13 positional arguments but 15 were given

I would like to add:

The command below works fine on the old ERAN version which I cloned on Mar 22, but it produces the same error as above when using the more recent ERAN version (cloned last week):

python3 . --netname ../net/MNIST_small.onnx --dataset mnist --domain deepzono --epsilon 0.02 --mean 0.5 0.5 0.5 --std 0.5 0.5 0.5

Thanks, jiahaubai

mnmueller commented 3 years ago

Hi @jiahaubai,

Your result does indeed suggest that you would have to increase the timeouts a little to compensate for differences in hardware in order to reproduce the results we reported.

I looked into Q2 and that should be resolved if you pull an updated version of ELINA and recompile.

Regarding Q1: k defines the group size for multi-neuron constraints, as described in our recent paper. Increasing k will increase the analysis precision, but it is not recommended for use with the zonotope domain (use refinepoly instead, which is also generally more precise; see the example command below). Both n_milp_refine and refine_neurons are designed for fully connected networks and will have no effect on the network you posted above.
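
For example, a refinepoly run with multi-neuron constraints on your network could look roughly like this (a sketch only: the value 3 for k is purely illustrative, and the remaining flags are copied from your commands above):

python3 . --netname ../net/MNIST_small.onnx --dataset mnist --domain refinepoly --epsilon 0.02 --num_test 100 --k 3 --mean 0.5 0.5 0.5 --std 0.5 0.5 0.5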

Further, I would recommend switching the order of the MaxPool and ReLU layers: the ReLU will then be applied to fewer neurons, which decreases the number of error terms. Under certain conditions, this can also increase analysis precision.
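
As a quick numerical sanity check of why this reordering is safe (an illustration only, not part of ERAN; it assumes PyTorch is installed): since ReLU is monotone and max-pooling only selects among its inputs, applying ReLU after MaxPool computes exactly the same function as applying it before:

import torch
import torch.nn.functional as F

x = torch.randn(1, 8, 28, 28)                              # dummy feature map
relu_then_pool = F.max_pool2d(F.relu(x), kernel_size=2)    # original order
pool_then_relu = F.relu(F.max_pool2d(x, kernel_size=2))    # recommended order
print(torch.equal(relu_then_pool, pool_then_relu))         # True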

Cheers, Mark

jiahaubai commented 3 years ago

Hi Mark,

Thank you for your suggestion on Q2 and your kind instructions on Q1! I am glad that, after updating ELINA, Q2 was solved smoothly.

In addition, I want to try using refinepoly, and I want to confirm that the following command is correct:

python3 . --netname ../net/MNIST_small.onnx --dataset mnist --domain refinepoly --epsilon 0.00 --num_test 100 --timeout_milp 10 --timeout_lp 10 --mean 0.5 0.5 0.5 --std 0.5 0.5 0.5

I hope refinepoly can be used correctly on my ONNX model.

Thank you and your team very much for providing such a great tool to help me verify different models. It is very flexible for the user. I really appreciate it!

Thanks, jiahaubai

mnmueller commented 3 years ago

Hello @jiahaubai,

Both timeout_milp and timeout_lp are used for neuron-wise refinement, which is not active with your settings (and not applicable to the network you posted above), so you can leave those values at their defaults. To set the timeout for the robustness optimization problem, use timeout_final_lp and timeout_final_milp. With your current settings, only an LP problem will be solved, but you can set --partial_milp 1 to encode your last ReLU layer with MILP constraints. Also, be aware that you are using the default PRIMA parameters, defined in your config.py file; an example invocation is sketched below.
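
For instance, a command along these lines should do what you intend (a sketch only: the timeout values are placeholders, epsilon is set back to the 0.02 from your earlier runs, and the exact flag spellings and PRIMA defaults can be checked in config.py):

python3 . --netname ../net/MNIST_small.onnx --dataset mnist --domain refinepoly --epsilon 0.02 --num_test 100 --timeout_final_lp 100 --timeout_final_milp 100 --partial_milp 1 --mean 0.5 0.5 0.5 --std 0.5 0.5 0.5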

Cheers, Mark