jump-dev / MathOptInterface.jl

A data structure for mathematical optimization problems
http://jump.dev/MathOptInterface.jl/
Other
380 stars 86 forks source link

[Nonlinear] detect common subexpressions #2488

Closed odow closed 2 months ago

odow commented 3 months ago

There is tooling in MOI.Nonlinear.ReverseAD to exploit common subexpressions, but we don't actively exploit this when parsing ScalarNonlinearFunction.

I wonder if we could walk the tape somehow to detect and reduce common subexpressions that are at least N nodes long.

It would help: https://github.com/jump-dev/JuMP.jl/issues/3729#issuecomment-2060549713

odow commented 3 months ago

Some possible options:

  1. Create a single unified expression graph (as opposed to the current expression tree). Change quite a lot of how ReverseAD processes functions etc.
  2. Don't do anything. Add a new decision variable with y = f(x) for expressions. This makes things very explicit.
  3. Add something new type/index to JuMP/MOI. But we're breaking the abstraction quite a bit.
  4. Add a univariate :subexpression operator to MOI. So: @expression(model, expr, subexpression(sin(x) + 1)). Solvers would see this during parsing, and could choose to ignore, or store it in a common expression.

I don't have a strong opinion yet. But our current approach requires us to smack the modeler on the hand and tell them they're holding it wrong and that they should do (2): https://github.com/jump-dev/JuMP.jl/issues/3729#issuecomment-2060586587

odow commented 2 months ago

Another example is https://github.com/jump-dev/MathOptInterface.jl/issues/2496

Downsite commented 2 months ago

I can't comment on the effect on AD. However, if this also would concern the form in which problem is given to (global) solvers, I have some thoughts:

odow commented 2 months ago

doing (2) might have

I was imagining that we would ask the user to do this, not JuMP or MathOptIntnerface.

My experience with using DAGs to eliminate common subexpressions is that one can very easily create situations where parsing larger problems becomes very expensive.

Yes, this is one reason why we don't do this at the JuMP level. We very explicitly choose to do the simplest possible thing, even if it means that it isn't the best thing to do inn a bunch of cases.

(4), seems neat, although I do not quite get the difference

Yeah, to be decided. It's really just a syntax decision of how to communicate potential subexpressions to the solver.

odow commented 2 months ago

Closing in favor of https://github.com/jump-dev/JuMP.jl/issues/3738. No need to duplicate our discussions.