JuliaDiff / AbstractDifferentiation.jl

An abstract interface for automatic differentiation.
https://juliadiff.org/AbstractDifferentiation.jl/
MIT License

Comparison with DifferentiationInterface.jl #131

Open gdalle opened 6 months ago

gdalle commented 6 months ago

Hi there,

Adrian Hill (@adrhill) and I recently started a new unified differentiation package called DifferentiationInterface.jl:

https://github.com/gdalle/DifferentiationInterface.jl

I'm gonna try to show why we did that, how we proceeded, and what we expect to gain from it. The discussion had started in #129 but did not go very far.

The main idea was to be less generic than AbstractDifferentiation from the start, in order to reduce the surface for errors and focus on testing and performance. This led to severe feature regressions and a complete rewrite, which is why I felt it was best to start a new project.

Whenever I say below that something is not implemented in AbstractDifferentiation, I don't mean it can never be. I just mean it might require significant changes, not unlike the ones Adrian and I proposed.

Design principles

| | AbstractDifferentiation | DifferentiationInterface | Reason | Refs |
|---|---|---|---|---|
| Backend specification | Custom structs | Borrowed from ADTypes.jl | Used by SciML and Lux.jl | #88 |
| Backend requirements | @primitive macro | Method implementation | Easier to understand | #13 |
| Fallback mechanism | Based on Jacobian | Based on pushforward / pullback | Makes more theoretical sense | #123 |
| Fallback implementations | Lots of closures | No closure at all | Type stability | #109 #121 |
| Differentiation order | Any order | First and second order | Enough for all practical purposes | |
| Input / output number | Anything | One in, one out | No need to deal with tuples or concatenate | #53 #65 #99 |
| Input / output types | Anything | AbstractArray or Number | Some backends can't do more anyway + helps simplify | |
| Differentiation cache | Not yet | Yes | Performance with ForwardDiff / ReverseDiff | #41 |
| Mutating functions | Not yet | Yes | Performance with Enzyme and others | #14 |
| Sparsity | Not yet | Yes | Natural with ADTypes and SparseDiffTools | |
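
To make the "Backend requirements" and "Fallback mechanism" rows concrete, here is a minimal sketch of the idea in plain Julia. It is not the API of either package: `AbstractBackend`, `FiniteDiffBackend`, `pushforward` and `jacobian` are hypothetical names, and forward finite differences stand in for a real AD backend. The backend implements a single method (the pushforward), and the Jacobian is a generic fallback built on top of it.

```julia
# Hypothetical names throughout; this mirrors the design idea only,
# not the actual interface of AbstractDifferentiation or DifferentiationInterface.
abstract type AbstractBackend end

# Toy backend: approximates derivatives with forward finite differences.
struct FiniteDiffBackend <: AbstractBackend
    step::Float64
end

# "Method implementation": the only primitive this backend defines.
# Returns an approximation of the directional derivative J(x) * dx.
function pushforward(backend::FiniteDiffBackend, f, x::AbstractVector, dx::AbstractVector)
    h = backend.step
    return (f(x .+ h .* dx) .- f(x)) ./ h
end

# Generic fallback: build the Jacobian column by column by pushing
# forward each basis vector. Works for any backend with a `pushforward` method.
function jacobian(backend::AbstractBackend, f, x::AbstractVector)
    columns = map(eachindex(x)) do j
        e = zero(x)              # basis vector e_j
        e[j] = one(eltype(x))
        pushforward(backend, f, x, e)
    end
    return reduce(hcat, columns)
end

# One array in, one array out.
f(x) = [x[1]^2 + x[2], 3 * x[2]]
J = jacobian(FiniteDiffBackend(1e-6), f, [1.0, 2.0])
```

For this `f`, the exact Jacobian at `[1.0, 2.0]` is `[2 1; 0 3]`, and `J` matches it up to finite-difference error. A reverse-mode backend would presumably define a pullback instead, with the fallback filling the Jacobian row by row.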

Features

| | AbstractDifferentiation | DifferentiationInterface | Reason | Refs |
|---|---|---|---|---|
| Enzyme compatibility | Not yet | Yes | Easier with one array in, one array out | #40 #84 #85 |
| Benchmarking utilities (for users) | Not yet | Yes | Easier with one array in, one array out | #45 |
| Testing utilities (for users) | Not yet | Yes | Easier with one array in, one array out | #45 |

Performance comparison

I'm gonna add AbstractDifferentiation as an extension to DifferentiationInterface, so that we can include its backends in our benchmark suite and compare. The results will be posted here in the coming days.

mohamed82008 commented 6 months ago

Thanks for the summary. I will respond to each point below.

  1. Regarding ADTypes.jl, AbstractDifferentiation.jl predates that package. But we can of course support their AD types with an extension if needed or vice versa.
  2. The @primitive macro is a convenience feature, not a requirement. You can also use method implementation to define the behaviour of a new AD backend.
  3. The fallback is no longer based on Jacobian.
  4. Regarding the use of closures, their use may be reduced and improved in AD.jl. But we need specific examples where closures led to sub-optimal performance or type stability issues. Also, for most backends we now encourage using the native implementation of each AD package for any exported AD.jl function. This means that closure fallbacks only matter when the backend package doesn't provide an implementation of such a function. In this case, we can strive to improve the closure implementation. Specific issues and PRs are welcome.
  5. Regarding being first order only, this is a strict limitation of DI.jl.
  6. Regarding having one vector in and one out, there is no reason why we need to support multiple inputs and multiple outputs in AD.jl if the backend package doesn't. We can start with a single-input, single-output version. In fact, the ForwardDiff extension assumes a single input and a single output. Same response for the types.
  7. Regarding caching and mutating functions, I am open to proposals. Perhaps the implementation in DI.jl can be used to design a caching and mutating API for AD.jl.
  8. Regarding Enzyme support, if DI.jl can support Enzyme then I am open to contributing the features needed for that support back to AD.jl.

gdalle commented 5 months ago

I just updated the comparison to include the new features we added over the past few weeks (mutating functions, second order, testing and benchmarking utilities).