JuliaDiff / ForwardDiff.jl

Forward Mode Automatic Differentiation for Julia

Create lightweight package AbstractDualNumbers or ForwardDiffBase or similar? #518

Open oschulz opened 3 years ago

oschulz commented 3 years ago

My apologies if this has been suggested before:

While ForwardDiff is not the heaviest of packages in the ecosystem, it's also not exactly lightweight (it takes 1.6 seconds to load on my system). A lightweight package AbstractDualNumbers.jl or ForwardDiffBase.jl (or similar) that just defines something like abstract type AbstractDualNumber{Tag} <: Real end, plus function stubs like function AbstractDualNumbers.value end and function AbstractDualNumbers.partials end, could allow packages to define custom push-forwards without depending on ForwardDiff itself.
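A minimal sketch of what such an interface-only package might look like, and how a downstream package could use it. All names here are hypothetical: AbstractDualNumbers is not an existing package, ToyDual merely stands in for ForwardDiff.Dual, and the rebuild hook is an extra assumption beyond the proposal (some way to construct a result from a new value and new partials would be needed).

```julia
# Hypothetical interface-only module; names follow the proposal in this issue.
module AbstractDualNumbers

# Abstract supertype that concrete dual-number types would subtype.
abstract type AbstractDualNumber{Tag} <: Real end

# Function stubs without methods; the implementing package adds the methods.
function value end
function partials end

# Assumption (not part of the proposal): a hook to build a dual number of the
# same kind from a new value and new partials.
function rebuild end

end # module

using .AbstractDualNumbers: AbstractDualNumber, value, partials, rebuild

# Toy implementation with a single partial, standing in for ForwardDiff.Dual.
struct ToyDual{Tag,T<:Real} <: AbstractDualNumber{Tag}
    val::T
    der::T
end

AbstractDualNumbers.value(d::ToyDual) = d.val
AbstractDualNumbers.partials(d::ToyDual) = d.der
function AbstractDualNumbers.rebuild(::ToyDual{Tag}, val::Real, der::Real) where {Tag}
    v, p = promote(val, der)
    return ToyDual{Tag,typeof(v)}(v, p)
end

# A downstream package only needs the interface to define a custom push-forward.
mysquare(x::Real) = x * x
function mysquare(d::AbstractDualNumber)
    x = value(d)
    return rebuild(d, x * x, 2x * partials(d))  # d/dx x^2 = 2x
end

d = ToyDual{:demo,Float64}(3.0, 1.0)
r = mysquare(d)
```

In this sketch the mysquare method for AbstractDualNumber never mentions ToyDual, which is the point of the proposal: the rule is written against the lightweight interface, and whichever package owns the concrete dual type supplies value, partials and the constructor hook.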

I know that there are exciting efforts underway in the Julia-AD-ecosystem for new ADs (e.g. Diffractor), but ForwardDiff is certainly not going away any time soon. A really lightweight way to define push-forwards could reduce the frequency of @require ForwardDiff in the ecosystem quite a bit, and also make it possible to move code from packages like DistributionsAD to Distributions, etc.

dlfivefifty commented 3 years ago

Note we already have DualNumbers.jl. I believe the plan is to move ForwardDiff.Dual to there. (Unless there's a need for un-tagged dual numbers...)

hyrodium commented 2 years ago

x-ref https://github.com/JuliaDiff/DualNumbers.jl/issues/45

oschulz commented 2 years ago

JuliaDiff/DualNumbers.jl#45 would be nice ...

dlfivefifty commented 2 years ago

I think #45 is definitely the way to go, but ForwardDiff.Dual definitely needs some work to make it easy to use (beginning with pretty printing).

longemen3000 commented 2 years ago

I was looking at https://github.com/JuliaDiff/DualNumbers.jl/issues/45, https://github.com/JuliaDiff/DualNumbers.jl/pull/49 and the source code of DualNumbers.jl, and before hypothetically embarking on such a migration (let's call it FD DualNumber), I have some questions (and observations):

Are there any additional things to be done apart from the list above?

oschulz commented 2 years ago

Most of the load time of ForwardDiff is actually due to StaticArrays, which is, I think, only used for the Hessian, Jacobian, etc. functionality. So a package focused on dual numbers should load very quickly.

oschulz commented 2 years ago

> It is necessary for FD DualNumbers to support SpecialFunctions, NaNMath or Calculus

I think if it's lightweight enough there would be a chance to convince SpecialFunctions, NaNMath, etc. to support it, instead of the other way round.

oschulz commented 2 years ago

> On the other part, SpecialFunctions already loads ChainRulesCore

Supporting ChainRulesCore would open so many doors. StatsFuns, for example, defines a lot of ChainRulesCore.@scalar_rule rules, but they are pretty much unusable at the moment because ForwardDiff doesn't utilize them.

mcabbott commented 2 years ago

One possibly crazy idea would be to move the minimal struct definitions to ChainRulesCore, as @scalar_rule could then define methods.
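To illustrate the idea, here is a self-contained toy sketch in which a single scalar rule also generates a method on a dual-number type. It does not use the real ChainRulesCore or ForwardDiff APIs: ToyDual and @toy_scalar_rule are hypothetical stand-ins, and the actual @scalar_rule macro has a different form and does more.

```julia
# Toy dual number with one partial; a stand-in, not ForwardDiff.Dual.
struct ToyDual{T<:Real} <: Real
    value::T
    partial::T
end

# Hypothetical macro: from a rule written as `f(x) dfdx`, generate a method of
# `f` for ToyDual that propagates the derivative (a forward-mode push-forward).
macro toy_scalar_rule(call, deriv)
    f = call.args[1]  # function name, e.g. :softplus
    x = call.args[2]  # argument name used inside the derivative expression
    return esc(quote
        function $f(dual::ToyDual)
            $x = dual.value
            return ToyDual($f($x), $deriv * dual.partial)
        end
    end)
end

# Two ordinary scalar functions; the derivative of softplus is logistic.
softplus(x::Real) = log1p(exp(x))
logistic(x::Real) = inv(1 + exp(-x))

# One rule definition; the dual-number method comes for free.
@toy_scalar_rule softplus(x) logistic(x)

result = softplus(ToyDual(0.0, 1.0))
```

If the struct definition lived next to the rule macro, as suggested, every package that already writes scalar rules would get dual-number methods like this one without an extra dependency.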

longemen3000 commented 2 years ago

Looking at DualNumbers.jl's direct dependents on GitHub, not all of them still have the dependency in their latest version:

oschulz commented 2 years ago

> One possibly crazy idea would be to move the minimal struct definitions to ChainRulesCore, as @scalar_rule could then define methods.

Coming from you, that's almost an endorsement @mcabbott :-)

Maybe that's not that crazy at all? We don't want ChainRulesCore to become noticeably heavier, of course, now that it's making real inroads throughout the ecosystem, but maybe the cost wouldn't be high? We're currently at (Julia v1.8.0-beta3):

julia> @time_imports using ChainRulesCore
      3.1 ms  ┌ Compat
     58.1 ms  ChainRulesCore

If it's just 5 ms more or so, maybe that would be OK? Dual numbers are quite fundamental after all, or at least will be once there's only one version of them around.

oschulz commented 2 years ago

> One possibly crazy idea would be to move the minimal struct definitions to ChainRulesCore [...] If it's just 5 ms more or so, maybe that would be Ok?

The package load times do suggest a certain graph of package dependencies (run in sequence in a single session):

julia> @time_imports using ChainRulesCore
      3.2 ms  ┌ Compat
     63.2 ms  ChainRulesCore

julia> @time_imports using Calculus
      3.3 ms  Calculus

julia> @time_imports using NaNMath
      1.6 ms  NaNMath

julia> @time_imports using SpecialFunctions
      0.9 ms  ┌ ChangesOfVariables
      0.3 ms  ┌ OpenLibm_jll
      3.0 ms  ┌ DocStringExtensions
      4.1 ms  ┌ IrrationalConstants
      0.6 ms  ┌ CompilerSupportLibraries_jll
      1.4 ms  ┌ LogExpFunctions
     17.4 ms      ┌ Preferences
     18.0 ms    ┌ JLLWrappers
     21.4 ms  ┌ OpenSpecFun_jll
    131.1 ms  SpecialFunctions

julia> @time_imports using DualNumbers
     13.7 ms  DualNumbers

SpecialFunctions especially should clearly depend on a dual-numbers package and not the other way round. :-) And ChainRulesCore depending on dual numbers would seem quite natural as well. The potential benefits would be huge: we would quickly get a lot more dual-numbers/ForwardDiff support throughout the ecosystem (especially in the statistics sector, where DistributionsAD could just go away completely, but also in many other domains).

devmotion commented 2 years ago

> DistributionsAD could just go away completely

ForwardDiff is not the main blocker, it's Tracker and ReverseDiff. There are only very few definitions for dual numbers remaining: https://github.com/TuringLang/DistributionsAD.jl/blob/master/src/forwarddiff.jl

oschulz commented 2 years ago

> > DistributionsAD could just go away completely
>
> ForwardDiff is not the main blocker, it's Tracker and ReverseDiff.

Ah, sorry, you're right of course. (Full) ChainRulesCore-support in Tracker and ReverseDiff would be so nice ...

Speaking of the statistics domain, there's StatsFuns, though, with several @scalar_rule definitions that could make the respective functions ForwardDiff-compatible.