pulsipher opened this issue 1 month ago
> Alternatively, I suppose I could overload every `add_predictor` method for `InfiniteModel`s, but then I would end up having to copy nearly all the code (only tweaking how the variables are generated).
It's unfortunate. But this would be my suggestion.
Could you give an example of what you would require for the `Affine` and `ReLU` layers?
I am very reticent to start making things abstract before we absolutely need to.
I think this should work now if you do `y = add_predictor(model, NN, x; gray_box = true, reduced_space = true)`?
> I think this should work now if you do `y = add_predictor(model, NN, x; gray_box = true, reduced_space = true)`?
It should work with `reduced_space = true` regardless of whether `gray_box` is used. The issue is getting it to work with full space formulations (especially ones that don't have reduced space alternatives), namely all the different ReLU reformulations such as big-M.
> It's unfortunate. But this would be my suggestion.
This would indeed be unfortunate, since any change made to any of the predictors would have to be made in two places, which has been quite annoying for us on the OMLT team.
> Could you give an example of what you would require for the `Affine` and `ReLU` layers?
For activation functions, I think only a single extension point would be needed for variable generation:
```julia
function add_activation_variable(model::JuMP.AbstractModel, xi, lb, ub; kwargs...)
    # `xi` is unused by this default method, but extensions can inspect it
    # (e.g., to check which infinite parameters the input depends on).
    yi = JuMP.@variable(model; kwargs...)
    _set_bounds_if_finite(yi, lb, ub)
    return yi
end
```
In the InfiniteOpt extension, I would add something of the form:
```julia
function add_activation_variable(model::InfiniteOpt.InfiniteModel, xi, lb, ub; kwargs...)
    prefs = InfiniteOpt.parameter_refs(xi)  # queries the infinite parameters an input may depend on
    if isempty(prefs)  # is the input finite?
        yi = JuMP.@variable(model; kwargs...)
    else
        yi = JuMP.@variable(model; variable_type = InfiniteOpt.Infinite(prefs...), kwargs...)
    end
    _set_bounds_if_finite(yi, lb, ub)  # InfiniteOpt's version of this function (not an extension)
    return yi
end
```
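For context, a small sketch of what `parameter_refs` returns on an `InfiniteModel` (the variable names here are illustrative):

```julia
using InfiniteOpt

model = InfiniteModel()
@infinite_parameter(model, ξ in [0, 1])
@variable(model, x, Infinite(ξ))  # an infinite variable x(ξ)
@variable(model, w)               # an ordinary finite variable

InfiniteOpt.parameter_refs(x)  # (ξ,): nonempty, so `yi` would be tagged Infinite(ξ)
InfiniteOpt.parameter_refs(w)  # ():   empty, so `yi` would be a plain variable
```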
The `add_predictor` methods for `ReLU` and `ReLUBigM` would then be:
```julia
function add_predictor(model::JuMP.AbstractModel, ::ReLU, x::Vector)
    ub = last.(get_variable_bounds.(x))
    y = [
        add_activation_variable(model, x[i], 0, ub[i], base_name = "moai_ReLU[$i]")
        for i in eachindex(x)
    ]
    JuMP.@constraint(model, y .== max.(0, x))
    return y
end
```
```julia
function add_predictor(model::JuMP.AbstractModel, predictor::ReLUBigM, x::Vector)
    m = length(x)
    bounds = get_variable_bounds.(x)
    y = Vector{JuMP.variable_ref_type(model)}(undef, m)
    for i in 1:m
        y[i] = add_activation_variable(model, x[i], 0, last(bounds[i]), base_name = "moai_ReLU[$i]")
        lb, ub = bounds[i]
        z = add_activation_variable(model, x[i], nothing, nothing, binary = true)
        JuMP.@constraint(model, y[i] >= x[i])
        U = min(ub, predictor.M)
        JuMP.@constraint(model, y[i] <= U * z)
        L = min(max(0, -lb), predictor.M)
        JuMP.@constraint(model, y[i] <= x[i] + L * (1 - z))
    end
    return y
end
```
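For reference, the constraints above are the standard big-M encoding of $y_i = \max(0, x_i)$ for $x_i \in [\ell_i, u_i]$, with binary indicator $z_i$:

```math
y_i \ge 0, \qquad y_i \ge x_i, \qquad y_i \le U_i z_i, \qquad y_i \le x_i + L_i (1 - z_i),
```

where $U_i = \min(u_i, M)$ and $L_i = \min(\max(0, -\ell_i), M)$. Fixing $z_i = 0$ forces $y_i = 0$, while $z_i = 1$ forces $y_i = x_i$.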
This API would also readily be compatible with the other activation functions, binary decision trees, and quantiles. Note that this also makes `get_variable_bounds` an extension point, so I could make it work with `InfiniteOpt.GeneralVariableRef`s.
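As a quick usage sketch (assuming a plain `JuMP.Model`, which would hit the default `add_activation_variable` method above; the bounds are illustrative):

```julia
using JuMP, MathOptAI

model = Model()
@variable(model, -1 <= x[1:2] <= 2)
y = add_predictor(model, ReLU(), x)
# Each y[i] is a fresh variable with bounds [0, 2] and the constraint y .== max.(0, x)
```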
So then, the only layer that requires special treatment for full space is `Affine`, since each output is a linear combination of all the inputs, which means that every output becomes infinite if any single input is. An API might look something like:
```julia
# Variable extension point
function add_affine_variable(model::JuMP.AbstractModel, x, lb, ub; kwargs...)
    yi = JuMP.@variable(model; kwargs...)
    _set_bounds_if_finite(yi, lb, ub)
    return yi
end

# Method overloaded in the InfiniteOpt extension
function add_affine_variable(model::InfiniteOpt.InfiniteModel, x, lb, ub; kwargs...)
    prefs = InfiniteOpt.parameter_refs(x)
    if isempty(prefs)  # are all of the inputs finite?
        yi = JuMP.@variable(model; kwargs...)
    else
        yi = JuMP.@variable(model; variable_type = InfiniteOpt.Infinite(prefs...), kwargs...)
    end
    _set_bounds_if_finite(yi, lb, ub)  # InfiniteOpt's version of this function
    return yi
end
```
```julia
function add_predictor(model::JuMP.AbstractModel, predictor::Affine, x::Vector)
    m = size(predictor.A, 1)
    y = Vector{JuMP.variable_ref_type(model)}(undef, m)
    bounds = get_variable_bounds.(x)
    for i in 1:m
        y_lb, y_ub = predictor.b[i], predictor.b[i]
        for j in 1:size(predictor.A, 2)
            a_ij = predictor.A[i, j]
            lb, ub = bounds[j]
            y_ub += a_ij * ifelse(a_ij >= 0, ub, lb)
            y_lb += a_ij * ifelse(a_ij >= 0, lb, ub)
        end
        y[i] = add_affine_variable(model, x, y_lb, y_ub, base_name = "moai_Affine[$i]")
    end
    JuMP.@constraint(model, predictor.A * x .+ predictor.b .== y)
    return y
end
```
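The inner loop is standard interval arithmetic over the affine map $y = Ax + b$:

```math
\underline{y}_i = b_i + \sum_j \min(A_{ij} \ell_j, A_{ij} u_j), \qquad
\overline{y}_i = b_i + \sum_j \max(A_{ij} \ell_j, A_{ij} u_j),
```

where $[\ell_j, u_j]$ are the bounds on $x_j$.

One caveat with the sketch above: it applies `parameter_refs` to the whole input vector `x`. If `parameter_refs` is only defined for scalar variable references, a small helper (hypothetical, not part of InfiniteOpt) could collect the union across the inputs:

```julia
# Hypothetical helper (illustrative only): gather every infinite parameter
# that any input depends on, since a single infinite input makes all of the
# affine outputs infinite.
function _gather_parameter_refs(x::Vector)
    prefs = Any[]
    for xi in x
        for p in InfiniteOpt.parameter_refs(xi)
            p in prefs || push!(prefs, p)
        end
    end
    return Tuple(prefs)
end
```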
So this is actually one reason why I didn't want to support `AbstractModel` in the signatures. It is exactly the laissez-faire issue of https://github.com/lanl-ansi/MathOptAI.jl/blob/main/docs/src/developers/design_principles.md#inputs-are-vectors
Could you use the `ReducedSpace` implementation in some way?
```julia
function add_predictor(model::InfiniteModel, predictor::AbstractPredictor, x::Vector)
    y_expr = add_predictor(model, ReducedSpace(predictor), x)
    y = @variable(model, [1:length(y_expr)])
    @constraint(model, y .== y_expr)
    return y
end
```
> Could you use the `ReducedSpace` implementation in some way?
Yes, but only for layers that support `ReducedSpace`. The core issue here is supporting layers that don't, such as `ReLUBigM`, `ReLUSOS1`, `ReLUQuadratic`, and `BinaryDecisionTree`, which all need to create variables as part of their formulations.

For these, having overloadable functions like `add_activation_variable` and `get_variable_bounds` would provide a way to control how variables are created. This API could be clearly documented as intended only for extension writers of `JuMP.AbstractModel`s, and as something to be used with caution.
Related to #83, `JuMP.AbstractModel`s like InfiniteOpt need to control how variables are generated during reformulation. Consider, for instance, a two-stage stochastic program that uses an NN predictor, where we would like to do:
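```julia
# A sketch of the intended usage (names assumed; `predictor` stands for some
# trained NN wrapped as a MathOptAI predictor, construction elided):
using InfiniteOpt, MathOptAI

model = InfiniteModel()
@infinite_parameter(model, ξ in [0, 1])  # random parameter for the 2nd stage
@variable(model, x[1:2], Infinite(ξ))    # 2nd-stage inputs x(ξ)
y = add_predictor(model, predictor, x)   # we would like y(ξ), but get 1st-stage y
```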
We would want `y` to be `y(ξ)` since the input is `x(ξ)`, but instead 1st-stage variables `y` are returned, because the `Infinite(ξ)` tag is not used in `@variable` behind the scenes. Interestingly, this problem is avoided with reduced space reformulations.

So, what I would need is some way to appropriately add tags to reformulation variables. Unfortunately, this might mean adding an extension point for each `add_predictor` method, since the relationships between the inputs and the generated variables vary. Alternatively, I suppose I could overload every `add_predictor` method for `InfiniteModel`s, but then I would end up having to copy nearly all the code (only tweaking how the variables are generated).