Verified-Intelligence / alpha-beta-CROWN

alpha-beta-CROWN: An Efficient, Scalable and GPU Accelerated Neural Network Verifier (winner of VNN-COMP 2021, 2022, 2023, and 2024)

Query on the backward propagation algorithm in the original CROWN #11

Closed: nbdyn closed this issue 1 year ago

nbdyn commented 2 years ago

I have read the paper "Efficient Neural Network Robustness Certification with General Activation Functions" (which proposed CROWN), and I have a question: why do we back-propagate all the way to the first layer and use Λ(0) and Ω(0) to calculate the global bounds? Are the global bounds obtained this way more accurate?

huanzhang12 commented 1 year ago

The backward bound propagation is the basic procedure in CROWN; it guarantees the soundness and correctness of the bounds. As for "more accurate", I am not sure what you mean or which method you are comparing against. It would help if you could clarify your question.

nbdyn commented 1 year ago

Thanks for your reply. I understand that backward bound propagation is the basis of this method. My question is: why do we need to back-propagate all the way to the first layer to get Λ(0) and Ω(0), and then use Theorem 3.2 to calculate fL and fU? Why can't we stop early and back-propagate through only k layers (e.g., only to the second layer, obtaining Λ(1) and Ω(1) and skipping the computation of Λ(0) and Ω(0), which would reduce the time overhead)? Does propagating to the first layer to get Λ(0) and Ω(0) give tighter robustness bounds?

huanzhang12 commented 1 year ago

Yes, propagating to the first layer gives you the tightest bound because the input constraints specified by users are tight.

To compute fL and fU, you need bounds on the input layer if you have Λ(0), Ω(0), or intermediate layer bounds (such as the bounds after the first layer) if you use Λ(1), Ω(1).
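
To make this concrete, here is a minimal numpy sketch of the full backward pass for a toy 2-layer ReLU network f(x) = W2 @ relu(W1 @ x + b1) + b2 under an l_inf input ball. All variable names are illustrative, the intermediate bounds come from plain interval arithmetic, and the relaxation choices are simplified; this is a sketch of the idea, not this repository's implementation:

```python
# Minimal CROWN-style backward propagation sketch (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((4, 3)), rng.standard_normal(4)
W2, b2 = rng.standard_normal((2, 4)), rng.standard_normal(2)
x0, eps = rng.standard_normal(3), 0.1

# Pre-activation bounds of layer 1 via interval arithmetic.
center, radius = W1 @ x0 + b1, np.abs(W1) @ np.full(3, eps)
l, u = center - radius, center + radius

# Linear ReLU relaxation on [l, u]:
# upper line: relu(z) <= a_u * z + c_u ; lower line: relu(z) >= a_l * z.
a_u = np.where(u <= 0, 0.0, np.where(l >= 0, 1.0, u / np.maximum(u - l, 1e-12)))
c_u = np.where((l < 0) & (u > 0), -a_u * l, 0.0)
a_l = np.where(u <= 0, 0.0, np.where(l >= 0, 1.0, (u >= -l).astype(float)))

# Backward pass for the LOWER bound of f: start with Lambda^(1) = W2 at
# the output, then propagate through the ReLU by taking the lower line
# where the coefficient is positive and the upper line where it is negative.
Lam = W2
pos, neg = np.clip(Lam, 0, None), np.clip(Lam, None, 0)
Lam_pre = pos * a_l + neg * a_u        # coefficients w.r.t. pre-activation z
bias = b2 + neg @ c_u                  # accumulated constant term
Lam0 = Lam_pre @ W1                    # Lambda^(0): coefficients w.r.t. input x
bias = bias + Lam_pre @ b1

# Concretize over the l_inf input ball using the dual (l_1) norm, row-wise.
f_L = Lam0 @ x0 + bias - eps * np.abs(Lam0).sum(axis=1)
print("certified lower bound on each output:", f_L)
```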

For intermediate layers, although you can also use their linear coefficients such as Λ(1), Ω(1) to obtain sound bounds, those bounds tend to be looser: intermediate layer bounds are themselves computed from the input constraints, so they are not as tight as the original input constraints.

If your aim is to produce bounds as fast as possible, you can use Λ(1), Ω(1), or the linear coefficients of another intermediate layer, to obtain valid bounds. It is a trade-off between efficiency and bound tightness.
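
Continuing the sketch above, "stopping early" would mean concretizing Λ(1) (here simply W2) against interval bounds of the ReLU output instead of going back to the exact input ball; again, a simplification for illustration only:

```python
# Early stop: concretize Lambda^(1) against interval bounds [h_l, h_u]
# of the ReLU output h = relu(z), rather than the exact input ball.
h_l, h_u = np.maximum(l, 0.0), np.maximum(u, 0.0)
pos, neg = np.clip(W2, 0, None), np.clip(W2, None, 0)
f_L_early = pos @ h_l + neg @ h_u + b2  # sound, but typically looser than f_L
```

Any slack already present in [h_l, h_u] is inherited by f_L_early, which is why the fully back-propagated bound is generally at least as tight.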

nbdyn commented 1 year ago

"they tend to be looser because intermediate layer bounds are computed using input constraints, and they are not as tight as the original input constraints." What does the input constraints mean here? When using Λ(1),Ω(1) they tend to be looser, do you mean that the intermediate layers' upper and lower bounds to be looser? Since the intermediate layers' upper and lower bounds are looser, then the fL and fU also to be looser too?

Besides, I have done some experiments where I propagated only to Λ(1), Ω(1) to shorten the computation time. However, I was surprised to find that in some experiments I obtained a tighter certified lower bound.

My tentative explanation is this: when calculating the certified lower bound via binary search, propagating only to Λ(1), Ω(1) yields looser upper and lower bounds on f than propagating to Λ(0), Ω(0). This raises the probability that the bounds of different classes overlap during the binary search, so the condition in the picture below is unsatisfied more often, which in turn increases the certified lower bound. Is this reasonable?

[image: the inequality from the paper that the binary search checks]

However, I also note that if my explanation above were correct, then propagating to Λ(0), Ω(0) would lose its meaning. So I think there must be some mistake in my experiment or my explanation. Could you help me see what is wrong, or tell me whether such an experimental result is simply impossible?

Thanks a lot!

huanzhang12 commented 1 year ago

The input constraints are the Lp-norm perturbation ball specified for robustness verification.
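
For reference, once the back-propagation reaches the input, the global lower bound is concretized over that ball in closed form via Hölder's inequality (this matches the global-bound corollary in the CROWN paper; notation lightly simplified, with the dual norm taken row-wise):

$$\min_{\|x - x_0\|_p \le \epsilon} \Lambda^{(0)} x + c \;=\; \Lambda^{(0)} x_0 + c - \epsilon \left\|\Lambda^{(0)}\right\|_q, \qquad \frac{1}{p} + \frac{1}{q} = 1.$$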

Intermediate layer bounds are computed from the input constraints, so they are weaker than the input constraints themselves. It is very unlikely that using intermediate layer bounds with Λ(1), Ω(1) produces tighter final bounds. It is most likely a bug.

Binary search is irrelevant here; do not use it for this comparison. When you compare bounds, fix epsilon and compare the resulting lower and upper bounds at that epsilon. Your bug is most likely related to the binary search.
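
For anyone replicating this comparison, a minimal sketch at a fixed epsilon using auto_LiRPA (the bound propagation library underlying alpha-beta-CROWN) might look as follows; the calls are based on auto_LiRPA's documented API, but treat this as a sketch and verify against the current docs:

```python
# Compare lower/upper bounds at one fixed epsilon, with no binary search.
import torch
import torch.nn as nn
from auto_LiRPA import BoundedModule, BoundedTensor
from auto_LiRPA.perturbations import PerturbationLpNorm

# Toy model and input; replace with the model under verification.
model = nn.Sequential(nn.Linear(3, 4), nn.ReLU(), nn.Linear(4, 2))
x0 = torch.randn(1, 3)

bounded = BoundedModule(model, torch.empty_like(x0))
eps = 0.1  # fixed perturbation radius
ptb = PerturbationLpNorm(norm=float("inf"), eps=eps)
x = BoundedTensor(x0, ptb)

# Full backward (CROWN) bounds; compare lb/ub across methods at this eps.
lb, ub = bounded.compute_bounds(x=(x,), method="CROWN")
print("lower:", lb, "upper:", ub)
```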