Open odow opened 9 months ago
For vector-outputs, we could potentially do something like this:
A = [1 2; 3 4]
x = [MOI.VariableIndex(1), MOI.VariableIndex(2)]
f(x) = sum(A * x)
| i | node_type | parent | value |
| 1 | SCALAR_CALL | -1 | OP_SUM |
| 2 | RESULT_ARRAY | 1 | (2, 0) |
| 3 | RESULT | 2 | NaN |
| 4 | RESULT | 2 | NaN |
| 5 | VECTOR_CALL | 2 | OP_MUL |
| 6 | ARRAY | 5 | (2, 2) |
| 7 | CONSTANT | 6 | 1 |
| 8 | CONSTANT | 6 | 2 |
| 9 | CONSTANT | 6 | 3 |
| 10 | CONSTANT | 6 | 4 |
| 11 | ARRAY | 5 | (2, 0) |
| 12 | VARIABLE | 11 | 1 |
| 13 | VARIABLE | 11 | 2 |
[1.0, 3.0, 2.0, 4.0]
@rluce, we could have dropped the .values
list if you re-interpret the .value
field of a NODE_VALUE
to be float64
instead of an int64
index into the array of float64
...
That'd simplify things to:
| i | node_type | parent | value |
| 1 | CALL | -1 | OP_SUM |
| 2 | RESULT_ARRAY | 1 | (2, 0) |
| 3 | RESULT | 2 | NaN |
| 4 | RESULT | 2 | NaN |
| 5 | CALL | 2 | OP_MUL |
| 6 | ARRAY | 5 | (2, 2) |
| 7 | CONSTANT | 6 | 1.0 |
| 8 | CONSTANT | 6 | 2.0 |
| 9 | CONSTANT | 6 | 3.0 |
| 10 | CONSTANT | 6 | 4.0 |
| 11 | ARRAY | 5 | (2, 0) |
| 12 | VARIABLE | 11 | 1 |
| 13 | VARIABLE | 11 | 2 |
Gurobi will support array-valued nonlinear expressions and also needs JuMP to compute derivatives? Not sure I understand what the interface is.
Gurobi will support array-valued nonlinear expressions and also needs JuMP to compute derivatives?
No. They're going to do their own thing, and won't support arrays to start with.
But their interface is going to look very similar to the MOI data structure so we were just riffing on some ideas.
It'll look something like
GRBaddnonlinearexpression(
grb,
result_variable::Int,
n_nodes::Int,
n_data::Int,
node_types::Vector{Int},
node_parents::Vector{Int},
node_values::Vector{Int},
data::Vector{Float64},
)
Correct, no vector in- our output planned for Gurobi, but the concept is very tempting 😄 .
But matrices are a bit more complicated. We'd need to encode (rows, cols). One option would be to store the size as a packed value::Int64
Another option could be to put dimensionality information in the value array, viz.
A = [1 2; 3 4]
x = [MOI.VariableIndex(1), MOI.VariableIndex(2)]
f(x) = sum(A * x)
| i | node_type | parent | value |
| 1 | SCALAR_CALL | -1 | OP_SUM |
| 2 | RESULT_ARRAY | 1 | 9 |
| 3 | RESULT | 2 | NaN |
| 4 | RESULT | 2 | NaN |
| 5 | VECTOR_CALL | 2 | OP_MUL |
| 6 | ARRAY | 5 | 7 |
| 7 | CONSTANT | 6 | 1 |
| 8 | CONSTANT | 6 | 2 |
| 9 | CONSTANT | 6 | 3 |
| 10 | CONSTANT | 6 | 4 |
| 11 | ARRAY | 5 | 5 |
| 12 | VARIABLE | 11 | 1 |
| 13 | VARIABLE | 11 | 2 |
[1.0, 3.0, 2.0, 4.0, 2.0, 0.0, 2.0, 2.0, 2.0, 0.0]
with the understanding that an ARRAY
node will always read 2 consecutive values from the value
array. Not pretty, but maybe more consistent with the idea that value
is the place to store static data.
I guess there are multiple options for the size. It probably doesn't really matter which one you go with, so long as it is consistent.
The other option, instead of directly coding RESULT
values, is to have a OP_GET_INDEX
node which can lookup into the array.
| i | OP_GET_INDEX | 2 | 1 |
The benefit of this is that you don't need all the RESULT
rows. but the downside is that on the reverse pass you'd need to allocate storage for the array ops, where as the RESULT
rows could be filled in-place like normal.
Context: Yesterday I had a chat to @rluce about nonlinear expressions, particularly as they relate to Gurobi's upcoming nonlinear interface. We broadly agree on scalar nonlinear functions, and he had some preliminary ideas for vector-inputs.
cc @ccoffrin and @chriscoey for thoughts.
Our ultimate goal is to support examples like the following:
Another key consumer of this would be https://github.com/jump-dev/MiniZinc.jl.
Changes required in JuMP
Array
inGenericNonlinearExpr
and their mapping tomoi_function
. This seems pretty easy.Changes required in MOI
Array
inMOI.(Scalar,Vector)NonlinearFunction
. This seems pretty easy.Array
inMOI.Nonlinear
and be able to compute derivatives, etc.The tricky part is all in 2.
We'd likely need some sort of
Node(NODE_VECTOR, parent, n)
.But matrices are a bit more complicated. We'd need to encode
(rows, cols)
. One option would be to store the size as a packedvalue::Int64
. That'd mean that we couldn't have matrices with side dimension greater thantypemax(Int32)
... but that seems okay.norm(x)
would look something like:log_det(X)
would then look something like:Once you have the data structure, it seems to me that the AD should follow fairly well.
Output arguments
Still absolutely no idea.