Thanks for sharing the details of your code. I've amended the title to reflect that you're seeing an incorrect infeasibility status in your extension of this code, rather than in the core library.
I've reproduced the key function of interest below. (The first two are obviously correct since we're dealing with reals). I've also removed the original comments and added comments of my own inline.
function p_relu(x::T, l_n::Real, u_n::Real, x_s::T, l_s::Real, u_s::Real)::JuMP.AffExpr where {T<:JuMPLinearType}
    if u_s < l_s
        Memento.warn(
            MIPVerify.LOGGER,
            "Inconsistent upper and lower bounds: u_s-l_s = $(u_s - l_s) is negative. Attempting to use interval arithmetic bounds instead ...",
        )
        u_s = upper_bound(x_s)
        l_s = lower_bound(x_s)
    end
    if u_s < l_s
        error(
            MIPVerify.LOGGER,
            "Inconsistent upper and lower bounds even after using only interval arithmetic: u_s-l_s = $(u_s - l_s) is negative",
        )
    elseif u_s <= 0
        # rectified value is always 0 as the indicator variable is 0
        return zero(T)
    elseif l_s >= 0
        # rectified value is always x as the indicator variable is 1
        return x
    else
        model = owner_model(x)
        x_rect = @variable(model)
        a = @variable(model, binary = true)
        @constraint(model, x_s <= u_s * a)
        # you appear to be missing a ')' here. perhaps a copy-and-paste error?
        # otherwise I wouldn't expect your original code to compile...
        @constraint(model, x_s >= l_s * (1-a)
        # as far as I can tell, you're getting a (MI)QP here, rather than the MILP
        # I am not certain but it is possible that I've made an assumption somewhere
        # that you're working with LPs rather than QPs.
        # If you'd like to keep it as an MILP, you'd have to use the trick here: https://docs.mosek.com/modeling-cookbook/mio.html#indicator-constraints
        # there might also be better approaches that don't use `a` directly, e.g.
        # http://www.inf.u-szeged.hu/~boglarka/AdvOpt/Modeling_tricks_2.pdf
        # I haven't given this careful thought so both approaches might be equivalent
        @constraint(model, x_rect == x * a)
        # Manually set the bounds for x_rect so they can be used by downstream operations.
        if l_n > 0
            set_lower_bound(x_rect, 0)
        else
            set_lower_bound(x_rect, l_n)
        end
        if u_n > 0
            set_upper_bound(x_rect, u_n)
        else
            set_upper_bound(x_rect, 0)
        end
        return x_rect
    end
end
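To illustrate the kind of reformulation the comment above is pointing at, here is one way the bilinear constraint x_rect == x * a could be replaced by linear constraints, using the standard big-M linearization of the product of a binary variable and a bounded continuous variable. This is only a sketch of the general trick, assuming l_n <= x <= u_n holds; I haven't tried it inside your code:

@constraint(model, x_rect <= x - l_n * (1 - a))
@constraint(model, x_rect >= x - u_n * (1 - a))
@constraint(model, x_rect <= u_n * a)
@constraint(model, x_rect >= l_n * a)

When a = 1 the first two constraints collapse to x_rect == x, and when a = 0 the last two force x_rect == 0 while the first two remain slack.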
I expect that the sigmoid vs. relu piece of things is a red herring, but would be curious if you find something definitive.
Let me know what you find, and thanks for the report!
Thank you very much for your fast response. Indeed, there was a copy-paste typo. I have further reformulated the quadratic constraint by removing the constraint:
@constraint(model, x_rect == x * a)
and adding a variable $d$ such that $x = x_{\text{rect}} - d$, together with the following linear constraints:
d = @variable(model)
@constraint(model, x_rect - d == x)
@constraint(model, x_rect >= a * l_n)
@constraint(model, x_rect <= a * u_n)
@constraint(model, d >= (1-a) * l_n)
@constraint(model, d <= (1-a) * u_n)
However, this did not lead to a different result.
Your formulation might have an error. Substituting a = 0:
x_rect >= 0
x_rect <= 0
x_rect == x + d >= x + l_n
x_rect == x + d <= x + u_n
Since l_n and u_n can both be positive, this would be an infeasible set of constraints.
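If it helps, here is a small standalone script that reproduces the infeasibility of that branch with concrete positive bounds. It assumes the HiGHS.jl solver is installed; any MILP solver would do:

using JuMP, HiGHS

model = Model(HiGHS.Optimizer)
l_n, u_n = 1.0, 2.0                  # both bounds positive, as in the argument above
@variable(model, l_n <= x <= u_n)
@variable(model, x_rect)
@variable(model, d)
@variable(model, a, Bin)
@constraint(model, x_rect - d == x)  # coupling constraint
@constraint(model, x_rect >= a * l_n)
@constraint(model, x_rect <= a * u_n)
@constraint(model, d >= (1 - a) * l_n)
@constraint(model, d <= (1 - a) * u_n)
@constraint(model, a == 0)           # force the problematic branch
optimize!(model)
println(termination_status(model))   # expect INFEASIBLE

With a = 0, the bound constraints force x_rect = 0 and d >= l_n = 1, so x = x_rect - d <= -1, contradicting x >= 1.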
Thank you again for your answer. You are right; I have changed this to:
M = max(abs(l_n),abs(u_n))
d = @variable(model)
@constraint(model, x_rect - d == x)
@constraint(model, x_rect >= -a * M)
@constraint(model, x_rect <= a * M)
@constraint(model, d >= -(1-a) * M)
@constraint(model, d <= (1-a) * M)
I have also noticed that, while computing the bounds for the maximum operation, the problem is also infeasible and the following warning is thrown:
Unexpected solve status INFEASIBLE; using interval_arithmetic to obtain bound
I then tried $M = 100000000$, and the bounds were calculated without the warning being thrown, yet the problem was still infeasible.
I'm not quite sure I understand your approach, but here's a set of parameters that works. It might be helpful to discuss your formulation with a colleague who has more experience with integer programming (I'm a novice) so that the resulting formulation is sharper.
x_rect = x + d
When a = 0, we have
l_n <= x <= u_n
l_n+d <= x_rect <= u_n+d
For x_rect = 0 to remain feasible, we therefore need
l_n+d <= 0 --> d <= -l_n --> d <= (1-a)*(-l_n)
u_n+d >= 0 --> d >= -u_n --> d >= (1-a)*(-u_n)
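Putting your coupling constraint together with these parameters, the full linear set would look something like the following. This is just a sketch, assuming l_n <= x <= u_n and that you want x_rect == x when a = 1 and x_rect == 0 when a = 0; I haven't tested it end-to-end:

d = @variable(model)
@constraint(model, x_rect - d == x)
@constraint(model, x_rect >= a * l_n)      # with a = 0, these two force x_rect = 0
@constraint(model, x_rect <= a * u_n)
@constraint(model, d <= (1 - a) * (-l_n))  # with a = 1, these two force d = 0, i.e. x_rect = x
@constraint(model, d >= (1 - a) * (-u_n))

When a = 0 the d bounds leave enough slack (d can take any value in [-u_n, -l_n]) for every x in [l_n, u_n], which is where the earlier formulation ran into trouble.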
Hello Everyone,
I am trying to use the MIPVerify library for neural network verification against adversarial attacks. My goal is to replace the ReLU function by a trainable mask that depends on the input image. For layer $i$, the activation of this layer is therefore calculated not as $\max(0, w_i^\top \cdot x_i)$ but as $\mathbb{I}_{(w_m^\top \cdot x_0) > 0} \cdot (w_i^\top \cdot x_i)$, where $w_m$ is the weight of an independent network $m$ that is trained to predict the behavior of the ReLU function at layer $i$, and $x_0$ is the original input.
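To make the intended semantics concrete, here is a toy Julia illustration for dense layers (not MIPVerify code; W_i, x_i, W_m, and x_0 are placeholder names matching the formula above):

# Toy illustration of the masked activation described above.
# W_i: weights of layer i, x_i: input to layer i,
# W_m: weights of the independent mask network m, x_0: the original network input.
masked_activation(W_i, x_i, W_m, x_0) = (W_m * x_0 .> 0) .* (W_i * x_i)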
I followed the same approach as the code implementing the ReLU function, changing only the base cases and the constraints. Furthermore, I used two different methods to train the masking network, one using ReLU activation and one with Sigmoid:
However, the models trained using ReLU activation are always INFEASIBLE when I try to find adversarial examples using the $L_\infty$ norm, even with large $\varepsilon$ values such as $0.5$ or even $0.9$. The models trained with sigmoid activation do not exhibit the same behavior, and using PGD I am able to find adversarial examples. I would like to know whether my formulation of the constraints is correct with respect to my goal, and what could be the reason for this behavior.