Open oschulz opened 3 years ago
Note we already have DualNumbers.jl. I believe the plan is to move ForwardDiff.Dual there. (Unless there's a need for un-tagged dual numbers...)
JuliaDiff/DualNumbers.jl#45 would be nice ...
I think #45 is definitely the way to go, but ForwardDiff.Dual definitely needs some work to make it easy to use (beginning with pretty printing).
I was looking at https://github.com/JuliaDiff/DualNumbers.jl/issues/45, https://github.com/JuliaDiff/DualNumbers.jl/pull/49 and the source code of DualNumbers.jl, and before hypothetically embarking on such a migration (let's call it FD DualNumber), I have some questions (and observations):
Before you go down this route, be warned that it will probably involve a good bit of work and digging into the implementation details of both packages. I had planned on doing this work myself after v0.6 released to avoid having to continuously update with breakage/depwarn fixes.
We need the change-over to not merely swap out the old implementation for ForwardDiff's, but to ensure that the feature sets of the two implementations are appropriately merged. We'll wish to drop some of the old behaviors, while other behaviors we'll wish to preserve, probably requiring new definitions. For example, there are some primitives defined on DualNumbers.Dual that are not yet defined on ForwardDiff.Dual. There might be more subtle behavioral changes as well.

Things that have to be done (besides just porting over the code):
- [ ] Implement a deprecation layer
- [ ] Implement whatever new functionality we need to appropriately merge the behavior of the two implementations
- [ ] Write new tests for any new definitions
- [ ] Write documentation describing the new interface
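For the deprecation-layer item, Julia's built-in `Base.@deprecate` macro is one possible tool. The following is only a minimal sketch on a toy module: `ToyDuals`, `Dual`, `value`, `partial` and the old name `realpart` are illustrative stand-ins chosen for this example, not the actual API of either package.

```julia
# Toy module standing in for the merged package (all names are hypothetical).
module ToyDuals
    export Dual, value, partial

    struct Dual{T<:Real}
        value::T
        partial::T
    end

    # New-style accessors.
    value(d::Dual) = d.value
    partial(d::Dual) = d.partial

    # Old-style accessor kept as a deprecated alias that forwards to `value`.
    # A warning is shown only when Julia runs with `--depwarn=yes`.
    Base.@deprecate realpart(d::Dual) value(d)
end

using .ToyDuals

d = Dual(1.0, 2.0)
old_result = ToyDuals.realpart(d)  # still works, forwards to value(d)
new_result = value(d)
```

Old call sites keep working during the transition, and users who enable deprecation warnings are pointed at the new name.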
I'd also like my name added to the LICENSE (and I believe @mlubin is also within his rights to request this, but I'll let him speak for himself). I believe doing this requires Theo's permission?
Are there any additional things to be done apart from the list above?
Most of the load time of ForwardDiff is actually due to StaticArrays - that is, I think, only used for the Hessian, Jacobian, etc. functionality, so a package focused on dual-numbers should load very quickly.
Is it necessary for FD DualNumbers to support SpecialFunctions, NaNMath or Calculus?
I think if it's lightweight enough there would be a chance to convince SpecialFunctions, NaNMath, etc. to support it, instead of the other way round.
On the other hand, SpecialFunctions already loads ChainRulesCore.
Supporting ChainRulesCore would open so many doors. StatsFuns, for example, defines a lot of ChainRulesCore.@scalar_rules, but they are pretty much unusable at the moment because ForwardDiff doesn't utilize them.
One possibly crazy idea would be to move the minimal struct definitions to ChainRulesCore, as @scalar_rule could then define methods.
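To make the idea concrete, here is a rough sketch of how a rule written with `@scalar_rule` could drive a dual-number evaluation through `ChainRulesCore.frule`. It assumes ChainRulesCore is installed; `ToyDual`, `softplus` and `apply_frule` are hypothetical names invented for this example, and this is not how ForwardDiff or ChainRulesCore currently work.

```julia
using ChainRulesCore

# Hypothetical minimal dual number; NOT ForwardDiff.Dual.
struct ToyDual{T<:Real}
    value::T
    partial::T
end

# A scalar function and its derivative rule, as a package like StatsFuns might define.
softplus(x::Real) = log1p(exp(x))
@scalar_rule softplus(x) inv(1 + exp(-x))

# Generic bridge (assumption): evaluate a unary function on a ToyDual via its frule.
function apply_frule(f, d::ToyDual)
    result = frule((NoTangent(), d.partial), f, d.value)
    result === nothing && error("no frule defined for $f")
    y, dy = result
    return ToyDual(y, dy)
end

d = apply_frule(softplus, ToyDual(0.0, 1.0))
```

If the dual type lived in (or below) ChainRulesCore, a macro like `@scalar_rule` could in principle emit a method of this shape for each rule, so no per-function bridge would be needed.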
Looking at DualNumbers.jl direct dependencies on github, not all of those have a dependency in their latest version:
One possibly crazy idea would be to move the minimal struct definitions to ChainRulesCore, as @scalar_rule could then define methods.
Coming from you, that's almost an endorsement @mcabbott :-)
Maybe that's not that crazy at all? We don't want ChainRulesCore to become noticeably heavier, of course, now that it's making real inroads throughout the ecosystem - but maybe the cost wouldn't be high? We're currently at (Julia v1.8.0-beta3)
julia> @time_imports using ChainRulesCore
3.1 ms ┌ Compat
58.1 ms ChainRulesCore
If it's just 5 ms more or so, maybe that would be Ok? DualNumbers are quite fundamental after all - or at least will be once there's only one version of them around.
One possibly crazy idea would be to move the minimal struct definitions to ChainRulesCore [...] If it's just 5 ms more or so, maybe that would be Ok?
The package load times do suggest a certain graph of package dependencies (run in sequence in a single session):
julia> @time_imports using ChainRulesCore
3.2 ms ┌ Compat
63.2 ms ChainRulesCore
julia> @time_imports using Calculus
3.3 ms Calculus
julia> @time_imports using NaNMath
1.6 ms NaNMath
julia> @time_imports using SpecialFunctions
0.9 ms ┌ ChangesOfVariables
0.3 ms ┌ OpenLibm_jll
3.0 ms ┌ DocStringExtensions
4.1 ms ┌ IrrationalConstants
0.6 ms ┌ CompilerSupportLibraries_jll
1.4 ms ┌ LogExpFunctions
17.4 ms ┌ Preferences
18.0 ms ┌ JLLWrappers
21.4 ms ┌ OpenSpecFun_jll
131.1 ms SpecialFunctions
julia> @time_imports using DualNumbers
13.7 ms DualNumbers
Especially SpecialFunctions should clearly depend on a dual-numbers package and not the other way round. :-) And ChainRulesCore depending on dual-numbers would seem quite natural as well. And the potential benefits would be huge - we would quickly get a lot more dual-numbers/ForwardDiff-support throughout the ecosystem (especially in the statistics sector - DistributionsAD could just go away completely - but also in many other domains).
DistributionsAD could just go away completely
ForwardDiff is not the main blocker, it's Tracker and ReverseDiff. There are only very few definitions for dual numbers remaining: https://github.com/TuringLang/DistributionsAD.jl/blob/master/src/forwarddiff.jl
DistributionsAD could just go away completely ForwardDiff is not the main blocker, it's Tracker and ReverseDiff.
Ah, sorry, you're right of course. (Full) ChainRulesCore-support in Tracker and ReverseDiff would be so nice ...
Speaking of the statistics domain there's StatsFuns, though, with several @scalar_rules that could make the respective functions ForwardDiff-compatible.
My apologies if this has been suggested before:
While ForwardDiff is not the heaviest of packages in the ecosystem, it's also not exactly lightweight (it takes 1.6 seconds to load on my system). A lightweight package AbstractDualNumbers.jl or ForwardDiffBase.jl (or similar) that just defines something like
abstract type AbstractDualNumber{Tag} <: Real end
and things like
function AbstractDualNumbers.value end
and
function AbstractDualNumbers.partials end
could allow packages to define custom push-forwards without depending on ForwardDiff itself.

I know that there are exciting efforts underway in the Julia-AD-ecosystem for new ADs (e.g. Diffractor), but ForwardDiff is certainly not going away any time soon. A really lightweight way to define push-forwards could reduce the frequency of @require ForwardDiff in the ecosystem quite a bit, and also make it possible to move code from packages like DistributionsAD to Distributions, etc.
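A minimal sketch of how such an interface-only package might be used is given here. The module and function names `AbstractDualNumbers`, `AbstractDualNumber`, `value` and `partials` follow the proposal; everything else (`withparts`, `ToyAD`, `ToyDual`, `MyFunctions`, `softplus`) is a hypothetical stand-in added for illustration, and `ToyDual` carries a single partial only.

```julia
# Hypothetical interface-only package, following the names in the proposal.
module AbstractDualNumbers
    abstract type AbstractDualNumber{Tag} <: Real end
    function value end      # primal part
    function partials end   # derivative part(s)
    # Extra hook assumed for this sketch: build a dual of the same kind as the first argument.
    function withparts end
end

# Stand-in for an AD package such as ForwardDiff; ToyDual is NOT ForwardDiff.Dual.
module ToyAD
    import ..AbstractDualNumbers
    using ..AbstractDualNumbers: AbstractDualNumber

    struct ToyDual{Tag,T<:Real} <: AbstractDualNumber{Tag}
        value::T
        partial::T
    end

    AbstractDualNumbers.value(d::ToyDual) = d.value
    AbstractDualNumbers.partials(d::ToyDual) = d.partial
    AbstractDualNumbers.withparts(::ToyDual{Tag}, v::T, p::T) where {Tag,T<:Real} =
        ToyDual{Tag,T}(v, p)
end

# A downstream package that depends only on the interface, not on the AD package.
module MyFunctions
    import ..AbstractDualNumbers
    using ..AbstractDualNumbers: AbstractDualNumber, value, partials

    softplus(x::Real) = log1p(exp(x))

    # Custom push-forward: d/dx softplus(x) = 1 / (1 + exp(-x)).
    function softplus(d::AbstractDualNumber)
        x = value(d)
        return AbstractDualNumbers.withparts(d, softplus(x), partials(d) / (1 + exp(-x)))
    end
end

d = ToyAD.ToyDual{Nothing,Float64}(0.0, 1.0)
r = MyFunctions.softplus(d)
```

`MyFunctions` never loads `ToyAD`, which is the point of the proposal: the push-forward is written against the abstract type, and any AD package that implements the interface picks it up through dispatch.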