SciML / OptimizationBase.jl

The base package for Optimization.jl, containing the structs and basic functions for it.

OptimizationBase v2.0 plan #62

Open Vaibhavdixit02 opened 2 weeks ago

Vaibhavdixit02 commented 2 weeks ago

The current AD implementation here (at least) lacks some features and suffers from some issues:

  1. MOAR!! oracles: #10, jvp and vjp for constraints, #61, #50
  2. Redundant function evaluations: #22
  3. Consistent support for exotic array types: #14, #7
  4. Some more backends: #21, #32, and FastDifferentiation
  5. Defaults
  6. Mixing AD backends when it makes sense

These have remained outstanding because of lower priority and the tedium of having to implement everything for multiple backends. Hence, the emergence of DI is a timely solution for a bunch of these.

So my aim with #54 has been to start fleshing out how these would look. We obviously can't expect DI to be the solution for everything, but some of these are right up its alley, and it gives an excuse to spend time rethinking the implementation.

I plan to address #10, jvp and vjp for constraints, points 2 and 3, #21 and FastDifferentiation, and points 5 and 6 in #54. I hope to get this in by the end of summer at the absolute latest.

Vaibhavdixit02 commented 2 weeks ago

It might also make sense to create a v2.0 branch and do incremental PRs for these, building off #54, targeting that branch.

gdalle commented 2 weeks ago

From my perspective, it would make sense to integrate DifferentiationInterface as early as possible, even if it doesn't do everything yet. The reason is that a large stress test like this would be great for spotting bugs and inefficiencies.

> [!WARNING]
> Not everything is gonna be perfect in DI from the start. But at least if we start using it, we can optimize every implementation in just one place instead of several.

MOAR!! oracles

The Lagrangian Hessian is a recurrent concern. A possible solution is https://github.com/gdalle/DifferentiationInterface.jl/issues/206, but I think it's a bit overkill. https://github.com/gdalle/DifferentiationInterface.jl/issues/311 is more reasonable: it allows passing the multipliers as constant parameters.
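For illustration, here is a minimal sketch of what that could look like with constant contexts: the multipliers are wrapped in `Constant` so they are not differentiated. The objective `f` and constraint `c` are made up for the example, and the exact signatures may shift between DI versions.

```julia
using DifferentiationInterface
using LinearAlgebra: dot
import ForwardDiff

# Made-up objective and equality constraint for the example
f(x) = sum(abs2, x)
c(x) = [x[1] * x[2] - 1.0]

# Lagrangian; the multipliers μ enter as a non-differentiated context
lagrangian(x, μ) = f(x) + dot(μ, c(x))

x = [1.0, 2.0]
μ = [0.5]

# Hessian with respect to x only, with μ held constant
H = hessian(lagrangian, AutoForwardDiff(), x, Constant(μ))
```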

jvp and vjp for constraints

DI has `pushforward` and `pullback`.
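As a sketch, with a made-up constraint function (the tuple-of-tangents calling convention below is from recent DI versions and has varied over time):

```julia
using DifferentiationInterface
import ForwardDiff, Zygote

# Made-up constraint function for the example
cons(x) = [x[1]^2 + x[2]^2 - 1.0, x[1] - x[2]]

x = [0.5, 0.5]
v = [1.0, 0.0]  # input-space direction for the JVP
w = [1.0, 2.0]  # output-space direction for the VJP

# JVP via forward mode, VJP via reverse mode
jvp = only(pushforward(cons, AutoForwardDiff(), x, (v,)))
vjp = only(pullback(cons, AutoZygote(), x, (w,)))
```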

Redundant function evaluations

DI has `value_and_gradient`, and now `value_gradient_and_hessian`.
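For example, the value and the derivatives come out of a single call instead of re-evaluating the function:

```julia
using DifferentiationInterface
import ForwardDiff

rosenbrock(x) = (1 - x[1])^2 + 100 * (x[2] - x[1]^2)^2
x = [0.0, 0.0]

y, g = value_and_gradient(rosenbrock, AutoForwardDiff(), x)
y, g, H = value_gradient_and_hessian(rosenbrock, AutoForwardDiff(), x)
```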

Consistent support for exotic array types

DifferentiationInterfaceTest has a battery of test scenarios involving static arrays. The non-mutating operators are written so that even `SArray` will work (but more testing won't hurt).
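For instance, a non-mutating gradient on an `SVector` input (a made-up example):

```julia
using DifferentiationInterface
using StaticArrays
import ForwardDiff

f(x) = sum(abs2, x)
x = SVector(1.0, 2.0, 3.0)

# Non-mutating operator: the returned gradient is itself an SVector
g = gradient(f, AutoForwardDiff(), x)
```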

As for sparsity, what you really need is SparseConnectivityTracer + DifferentiationInterface. Our sparsity detection is 10-100x faster than Symbolics, and our coloring seems on par with the C++ library ColPack. The last thing I need to do is benchmark against SparseDiffTools.
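Roughly, the pieces compose like this (a sketch with a made-up function; the `AutoSparse` wrapper and its keywords come from ADTypes and may shift between versions):

```julia
using DifferentiationInterface
using SparseConnectivityTracer: TracerSparsityDetector
using SparseMatrixColorings: GreedyColoringAlgorithm
import ForwardDiff

# A dense backend wrapped with sparsity detection + coloring
backend = AutoSparse(
    AutoForwardDiff();
    sparsity_detector = TracerSparsityDetector(),
    coloring_algorithm = GreedyColoringAlgorithm(),
)

f(x) = [x[1]^2, x[2] * x[3], x[3]]
x = rand(3)

prep = prepare_jacobian(f, backend, x)  # detect the pattern and color it once
J = jacobian(f, prep, backend, x)       # sparse Jacobian from a few dense passes
```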

Some more backends

DI does all the work for you there.
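Concretely, switching backends is just switching the ADTypes struct passed to the same operator call:

```julia
using DifferentiationInterface
import ForwardDiff, FiniteDiff

f(x) = sum(abs2, x)
x = rand(3)

# Same call for every backend; DI dispatches on the ADTypes struct
for backend in (AutoForwardDiff(), AutoFiniteDiff())
    @show gradient(f, backend, x)
end
```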

Mixing AD backends when it makes sense

DI has `SecondOrder`.
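A sketch of forward-over-reverse, which is typically the efficient combination for Hessians:

```julia
using DifferentiationInterface
import ForwardDiff, Zygote

# Outer forward mode over inner reverse mode
backend = SecondOrder(AutoForwardDiff(), AutoZygote())

f(x) = sum(abs2, x)
H = hessian(f, backend, rand(3))
```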