[WIP] Fresh attempt at DI integration

Vaibhavdixit02 commented 4 months ago

Checklist

[ ] Appropriate tests were added
[ ] Any code changes were done in a way that does not break public API
[ ] All documentation related to code changes were updated
[ ] The new code follows the contributor guidelines, in particular the SciML Style Guide and COLPRAC.
[ ] Any new documentation only uses public API

Additional context

Add any other context about the problem here.

gdalle commented 4 months ago

You're running DI v0.4.2 here, something is blocking it from updating. Can you add a v0.5.2 compat bound and see what it says?

gdalle commented 4 months ago

The error sounds like a type instability?

gdalle commented 4 months ago

@Vaibhavdixit02 when I replace the function inside hessian! with sum it works, so I would assume the current behavior is due to the way you construct cons_oop with closures.

Vaibhavdixit02 commented 4 months ago

Yup, I figured that. I spent some time reading up on the vector -> vector hessian discussion (and the background) you and Tim Holy were having, hoping to do this more rigorously but I don't see a natural way, asking users to pass a vector of functions is not ideal and feels unnatural to me - this will show up in multiobjective problems as well, even though that's typically majorly derivative free algorithms. I have some WIP locally this is moving slow because of other things right now, I will get back to it soon.

gdalle commented 3 months ago

So do you think you need the vector Hessian?

Vaibhavdixit02 commented 3 months ago

No, not right now, I can work around that (as I have been doing in the implementations that exist here already) but it would be nice to have it in the future

wsmoses commented 3 months ago

@Vaibhavdixit02 @ChrisRackauckas please do not drop the Enzyme extension in favor of DI here.

This will lead to both more codes erring and suboptimal performance as DI will create both unnecessary closures among other issues. I would prefer for user code not to be made to break because of an unnecessary intermediate closure from an interface.

See some comments about this tpapp/LogDensityProblemsAD.jl#29 (review) / tpapp/LogDensityProblemsAD.jl#29 (comment)

Unfortunately when I commented "Closures were one of the reasons imo that AD.jl didn't take off, so long term I think DI.jl would be well served by both not introducing them itself (my understanding is that it does this successfully), but also not forcing users to create the same closures user-side (which would create similar issues)." but the response was "DI is unlikely to support either multiple arguments or activity specification anytime soon"

x/ref https://github.com/gdalle/DifferentiationInterface.jl/issues/307

gdalle commented 3 months ago

I am aware of this limitation of DifferentiationInterface but I'm not sure it's obvious to others how much work would be required for multiple arguments or activity analysis. The reason we're able to detect bugs and suboptimalities in Enzyme before anyone else (see the 100x gradient slowdown from a month ago) is because we have a top tier test suite, which would need to grow by a factor of at least 2 to accommodate this new feature. It's 1k to 2k LOCs at least and a profound rethink, so even if I want to do it, it can't happen overnight.

And since it's only essential for Enzyme, another option would be to provide the operators I need for DI on the Enzyme side. That would make it so I don't have to code every operator suboptimally with my imperfect understanding of your docs.

gdalle commented 3 months ago

Cross referencing for technical details

https://github.com/gdalle/DifferentiationInterface.jl/issues/311

gdalle commented 3 months ago

And I also want to push back on the "unnecessary interface intermediate". DI fills a longstanding need in the ecosystem, especially for packages like Enzyme where the API is very hard to grasp. Not everyone is gonna be able or willing to use Enzyme directly, so I do believe there is value in even a suboptimal DI.

Since I aim to support every backend and not just Enzyme, there's always gonna be a compromise on how far I support each one. I'm gonna look into activities, but I just need you to realize that it's easy to get it out quick and dirty, much harder to do it right

wsmoses commented 3 months ago

Ah I apologize @gdalle I did not mean that to refer to DI as unnecessary (I think it is an amazing package and the work that you and Adrian are doing to make things easier to use is sorely needed).

I meant only to refer to the intermediate closure that DI would generate for the hessian as unnecessary -- strictly in the technical sense (eg wrapping things into a closure isn't needed when you could call a multi argument version without the closure).

I realize in retrospect the poor wording and I sincerely apologize.

wsmoses commented 3 months ago

And I fully agree with you that it is a difficult thing to get multi argument and activities correct.

However, replacing existing code that does not introduce these will create breakage so I don't think it wise for DI to replace existing codes with such support until DI vendors support as well.

For the backends without this need -- or packages without generalization of backend -- it is already a strict improvement and isn't a question that its use is immediately useful (and thus would be good to replace the other backends here, for example).

gdalle commented 3 months ago

Thanks for clarifying, and sorry for overreacting on my end 😊 Maybe this PR can replace the other backends and leave Enzyme alone for the time being?

ChrisRackauckas commented 3 months ago

We have no intentions to change to DI until it matches the performance of what we already have. It's on our radar, but we will not sacrifice features or performance. Instead, we will keep on top of it until DI can do exactly this, and then it's ready.

gdalle commented 3 months ago

Which benchmarks am I expected to pass for this bar to be met? And what happens if I'm faster in some aspects but slower in others?

ChrisRackauckas commented 3 months ago

If it's anywhere close to the performance then we'll take it as we would greatly enjoy the decreased maintanance burden. But it should be able to do the standard constrained and unconstrained tutorials with Zygote, Enzyme, and ForwardDiff and basically match the performance all of the way through. If I'm not mistaken, currently the closures are a bit of a blocker to this.

gdalle commented 3 months ago

If you're talking about the tutorials in the docs, some of them are on toy problems (see e.g. https://docs.sciml.ai/Optimization/stable/tutorials/constraints/). Is there an actual realistic benchmark suite that can be run to track and compare performance?

ChrisRackauckas commented 3 months ago

The toy problems will be the hardest because that will measure pure overhead.

gdalle commented 3 months ago

But it should be able to do the standard constrained and unconstrained tutorials with Zygote, Enzyme, and ForwardDiff and basically match the performance all of the way through.

What I'm suggesting above, following the discussion with Billy, is that DI doesn't necessarily have to overtake all backends at once. It seems reasonable to leave Enzyme aside for now cause that's the one where I would struggle most to milk the last drops of performance.

gdalle commented 3 months ago

The toy problems will be the hardest because that will measure pure overhead.

Okay then. Ping me when the PR is ready for use @Vaibhavdixit02 and I'll review then help benchmark

gdalle commented 2 months ago

Beware of the local sparsity detector: you shouldn't have to use it in general.

julia> using ADTypes: jacobian_sparsity

julia> using SparseConnectivityTracer

julia> function con2_c(res, x)
           res .= [x[1]^2 + x[2]^2, x[2] * sin(x[1]) - x[1]]
       end
con2_c (generic function with 1 method)

julia> jacobian_sparsity(con2_c, zeros(2), zeros(2), TracerSparsityDetector()) # input-independent
2×2 SparseArrays.SparseMatrixCSC{Bool, Int64} with 4 stored entries:
 1  1
 1  1

julia> jacobian_sparsity(con2_c, zeros(2), zeros(2), TracerLocalSparsityDetector()) # input-dependent
2×2 SparseArrays.SparseMatrixCSC{Bool, Int64} with 3 stored entries:
 1  1
 1  ⋅

SciML / OptimizationBase.jl

[WIP] Fresh attempt at DI integration #54

Checklist

Additional context