Comparison with DifferentiationInterface.jl

gdalle commented 6 months ago

Hi there,

Adrian Hill (@adrhill) and I recently started a new unified differentiation package called DifferentiationInterface.jl:

https://github.com/gdalle/DifferentiationInterface.jl

I'm gonna try to show why we did that, how we proceeded, and what we expect to gain from it. The discussion had started in #129 but did not go very far.

The main idea was to be less generic than AbstractDifferentiation from the start, in order to reduce the surface for errors and focus on testing and performance. This led to severe feature regressions and a complete rewrite, which is why I felt it was best to start a new project.

Whenever I say below that something is not implemented in AbstractDifferentiation, I don't mean it can never be. I just mean it might require significant changes, not unlike the ones Adrian and I proposed.

Design principles

| | AbstractDifferentiation | DifferentiationInterface | Reason | Refs |
|---|---|---|---|---|
| Backend specification | Custom structs | Borrowed from ADTypes.jl | Used by SciML and Lux.jl | #88 |
| Backend requirements | `@primitive` macro | Method implementation | Easier to understand | #13 |
| Fallback mechanism | Based on `jacobian` | Based on `pushforward` / `pullback` | Makes more theoretical sense | #123 |
| Fallback implementations | Lots of closures | No closure at all | Type stability | #109 #121 |
| Differentiation order | First and second order | First and second order | Enough for all practical purposes | |
| Input / output number | Anything | One in, one out | No need to deal with tuples or concatenate | #53 #65 #99 |
| Input / output types | Anything | `AbstractArray` or `Number` | Some backends can't do more anyway + helps simplify | |
| Differentiation cache | Not yet | Yes | Performance with ForwardDiff / ReverseDiff | #41 |
| Mutating functions | Not yet | Yes | Performance with Enzyme and others | #14 |
| Sparsity | Not yet | Yes | Natural with ADTypes and SparseDiffTools | |
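
To make the "Fallback mechanism" row more concrete, here is a minimal sketch of what a `pushforward`-based fallback can look like. It is written in plain Python rather than Julia, and it is not the code of either package: `Dual`, `pushforward` and `jacobian_from_pushforward` are hypothetical names chosen for illustration, and the toy dual-number class only supports `+` and `*`. The assumption is that a backend only has to provide the pushforward (Jacobian-vector product), and that a generic fallback builds the Jacobian from it one column at a time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Toy dual number: a value plus a tangent (directional derivative)."""
    val: float
    tan: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule for the tangent part.
        return Dual(self.val * other.val,
                    self.tan * other.val + self.val * other.tan)

    __rmul__ = __mul__


def _lift(x):
    """Promote a plain number to a Dual with zero tangent."""
    return x if isinstance(x, Dual) else Dual(float(x))


def pushforward(f, x, dx):
    """Hypothetical backend primitive: derivative of f at x along direction dx."""
    duals = [Dual(xi, dxi) for xi, dxi in zip(x, dx)]
    return [_lift(yi).tan for yi in f(duals)]


def jacobian_from_pushforward(f, x):
    """Generic fallback: one pushforward per basis vector yields one column."""
    n = len(x)
    columns = []
    for j in range(n):
        e_j = [1.0 if i == j else 0.0 for i in range(n)]
        columns.append(pushforward(f, x, e_j))
    m = len(columns[0])
    # Transpose the list of columns into a row-major matrix.
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def f(x):
    # Example function from R^2 to R^3, one array in and one array out.
    return [x[0] * x[1], x[0] + 3 * x[1], x[1] * x[1]]


J = jacobian_from_pushforward(f, [2.0, 5.0])
print(J)  # [[5.0, 2.0], [1.0, 3.0], [0.0, 10.0]]
```

A `pullback`-based fallback would be the mirror image: each pullback call with an output basis vector yields one row of the Jacobian, which is the cheaper direction when there are fewer outputs than inputs.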

Features

| | AbstractDifferentiation | DifferentiationInterface | Reason | Refs |
|---|---|---|---|---|
| Enzyme compatibility | Not yet | Yes | Easier with one array in, one array out | #40 #84 #85 |
| Benchmarking utilities (for users) | Not yet | Yes | Easier with one array in, one array out | #45 |
| Testing utilities (for users) | Not yet | Yes | Easier with one array in, one array out | #45 |

Performance comparison

I'm gonna add AbstractDifferentiation as an extension to DifferentiationInterface, so that we can include its backends in our benchmark suite and compare. The results will be posted here in the coming days.

mohamed82008 commented 6 months ago

Thanks for the summary. I will respond to each point below.

- Regarding ADTypes.jl, AbstractDifferentiation.jl predates that package. But we can of course support their AD types with an extension if needed, or vice versa.
- The `@primitive` macro is a feature, but it is not essential. You can use method implementation to define the behaviour for a new AD backend.
- The fallback is no longer based on the Jacobian.
- Regarding closures, their use may be reduced and improved in AD.jl. But we need specific examples where closures caused sub-optimal performance or type stability issues. Also, for most backends now, we encourage using the native implementation of each AD package for any exported AD.jl function. This means that closure fallbacks are only useful when the backend package doesn't provide an implementation for such a function. In this case, we can strive to improve the closure implementation. Specific issues and PRs are welcome.
- Regarding being first order only, this is a strict limitation of DI.jl.
- Regarding having one vector in and one out, there is no reason why we need to support multiple inputs and multiple outputs in AD.jl if the backend package doesn't. We can start with a single-input, single-output version. In fact, the ForwardDiff extension assumes a single input and a single output. Same response for the types.
- Regarding caching and mutating functions, I am open to proposals. Perhaps the implementation in DI.jl can be used to design a caching and mutating API for AD.jl.
- Regarding Enzyme support, if DI.jl can support Enzyme, then I am open to contributing the features needed for that support back to AD.jl.

gdalle commented 5 months ago

I just updated the comparison to include the new features we added these past few weeks (mutating functions, second order, testing and benchmarking utilities).
