Open gdalle opened 8 months ago
To elaborate on this:
As far as I understand, the `@primitive` macro is used on pullbacks/pushforwards from individual backends to generate the following `AD.jacobian` functions:
Forward-mode AD: https://github.com/JuliaDiff/AbstractDifferentiation.jl/blob/211b67528c5ed91971bb524c57adb63837163367/src/AbstractDifferentiation.jl#L600-L634
Reverse-mode AD: https://github.com/JuliaDiff/AbstractDifferentiation.jl/blob/211b67528c5ed91971bb524c57adb63837163367/src/AbstractDifferentiation.jl#L636-L663
These functions compute full Jacobians by evaluating the pullbacks/pushforwards on the standard basis (`identity_like`).
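To make the "evaluate on the standard basis" idea concrete, here is a minimal sketch that does not use AbstractDifferentiation.jl at all. The function `f`, its hand-derived pullback, and the helper names `toy_value_and_pullback` and `jacobian_from_pullback` are all hypothetical and only serve as illustration:

```julia
using LinearAlgebra

# Hypothetical toy function R^2 -> R^2 (not part of any package).
f(x) = [x[1] * x[2], sin(x[1])]

# Hand-derived pullback for f: maps a cotangent v to J(x)' * v.
function toy_value_and_pullback(x)
    y = f(x)
    pullback(v) = [x[2] * v[1] + cos(x[1]) * v[2], x[1] * v[1]]
    return y, pullback
end

# Build the full Jacobian by applying the pullback to each standard basis vector.
function jacobian_from_pullback(x)
    y, pullback = toy_value_and_pullback(x)
    n = length(y)
    basis = Matrix{Float64}(I, n, n)
    # pullback(e_i) is the i-th row of the Jacobian.
    rows = [pullback(basis[:, i]) for i in 1:n]
    return permutedims(reduce(hcat, rows))
end

J = jacobian_from_pullback([2.0, 3.0])
```

Each row costs one pullback call, so this is only worthwhile when the full Jacobian is actually needed.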
By default, the fallback `jacobian` function is empty (maybe this should be replaced by a `NotImplementedError`).
As shown in the implementer guide, this `jacobian` function is the fallback at the core of most functions exported by AbstractDifferentiation.
Taking reverse-mode AD as an example, the function dependency graph of `value_and_pullback_function` would look as follows:

- `value_and_pullback_function` calls `jacobian`
- `jacobian` is an empty function

Now, when a reverse-mode AD backend is loaded, `value_and_pullback_function` is defined for the backend, and `@primitive` is called on it, the function dependency graph is inverted:

- `value_and_pullback_function` calls the backend
- `jacobian` calls `value_and_pullback_function`
The second behaviour is desired, as we wouldn't want to compute a full Jacobian just to compute a VJP when we can instead evaluate the pullback directly.
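The two dependency graphs can be reproduced with plain multiple dispatch. The following is a rough sketch with made-up names (`AbstractToyBackend`, `toy_jacobian`, `toy_pullback_function`, and both backend structs are hypothetical, and this is not how AbstractDifferentiation.jl is actually written):

```julia
abstract type AbstractToyBackend end

# "Empty" function: no methods until a backend adds one.
function toy_jacobian end

# Generic fallback: derive the pullback from a full Jacobian.
function toy_pullback_function(ba::AbstractToyBackend, f, x)
    J = toy_jacobian(ba, f, x)
    return v -> J' * v
end

# Backend 1 only provides a Jacobian (forward finite differences),
# so its pullback goes through the fallback above.
struct JacobianOnlyBackend <: AbstractToyBackend end
function toy_jacobian(::JacobianOnlyBackend, f, x; h=1e-6)
    y = f(x)
    cols = [(f(x + h * ((1:length(x)) .== j)) - y) / h for j in 1:length(x)]
    return reduce(hcat, cols)
end

# Backend 2 provides a native pullback (hand-derived, for `sq` only).
struct NativePullbackBackend <: AbstractToyBackend end
sq(x) = x .^ 2
# This more specific method wins over the fallback: no Jacobian is built.
toy_pullback_function(::NativePullbackBackend, ::typeof(sq), x) = v -> 2 .* x .* v

# Inverted graph: the Jacobian is now derived from the pullback.
function toy_jacobian(ba::NativePullbackBackend, f, x)
    pb = toy_pullback_function(ba, f, x)
    n = length(f(x))
    return permutedims(reduce(hcat, [pb(float.((1:n) .== i)) for i in 1:n]))
end
```

Which function sits at the bottom of the graph is decided purely by which methods the backend defines, which is the hidden control flow described above.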
The fact that the function dependency graph is flipped was very confusing to me at first. A lot of hidden control flow is added via package extensions and the `@primitive` macro, which currently isn't documented in the implementer guide.
Why is `AD.jacobian` so central to AbstractDifferentiation.jl and why does it have to be generated via a macro? Can't it be implemented in a more generic way by making sure pullbacks and pushforward wrappers have consistent output types?
The only advantage I currently see is to allow users to
but those sound like things that should usually be avoided.
Why isn't AbstractDifferentiation.jl built around two primitives, `value_and_pullback_function` and `value_and_pushforward`[^1], with more liberal use of dispatch on the `AbstractReverseMode` and `AbstractForwardMode` types?
[^1]: Ideally with in-place mutating variants.
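
A rough sketch of what such a macro-free design could look like. Everything here is defined locally for illustration: the abstract types only borrow the names used above, the argument order of the two primitives is an assumption, and the toy backends carry hand-derived rules for a single function `g`:

```julia
using LinearAlgebra

# Local stand-ins, not the types exported by AbstractDifferentiation.jl.
abstract type AbstractBackend end
abstract type AbstractForwardMode <: AbstractBackend end
abstract type AbstractReverseMode <: AbstractBackend end

# The two primitives; a backend implements the one matching its mode.
function value_and_pullback_function end
function value_and_pushforward end

# Reverse mode: one pullback call per output (rows of the Jacobian).
function jacobian(ba::AbstractReverseMode, f, x)
    y, pb = value_and_pullback_function(ba, f, x)
    E = Matrix{eltype(y)}(I, length(y), length(y))
    return permutedims(reduce(hcat, [pb(E[:, i]) for i in 1:length(y)]))
end

# Forward mode: one pushforward call per input (columns of the Jacobian).
function jacobian(ba::AbstractForwardMode, f, x)
    E = Matrix{eltype(x)}(I, length(x), length(x))
    cols = [last(value_and_pushforward(ba, f, x, E[:, j])) for j in 1:length(x)]
    return reduce(hcat, cols)
end

# Hypothetical toy backends with hand-derived rules for g only.
g(x) = [x[1] * x[2], sin(x[1])]
struct ToyReverse <: AbstractReverseMode end
struct ToyForward <: AbstractForwardMode end

function value_and_pullback_function(::ToyReverse, ::typeof(g), x)
    return g(x), v -> [x[2] * v[1] + cos(x[1]) * v[2], x[1] * v[1]]
end

function value_and_pushforward(::ToyForward, ::typeof(g), x, dx)
    return g(x), [x[2] * dx[1] + x[1] * dx[2], cos(x[1]) * dx[1]]
end
```

Both `jacobian(ToyReverse(), g, x)` and `jacobian(ToyForward(), g, x)` then come from ordinary methods selected by the backend's mode, with no macro involved.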
Duplicate of https://github.com/JuliaDiff/AbstractDifferentiation.jl/issues/13, or at least https://github.com/JuliaDiff/AbstractDifferentiation.jl/issues/13#issuecomment-1664642912 and the following discussion?
> Why is `AD.jacobian` so central to AbstractDifferentiation.jl

> Why isn't AbstractDifferentiation.jl built around two primitives `value_and_pullback_function` and `value_and_pushforward`
Historical reasons, mainly that the original author had a strong enough understanding of the calculus involved but not such a strong understanding of autodiff or Julia abstractions, IIRC, and that the priority was on getting something out that worked and was usable. It should be.
This issue is my fault. Feel free to remove the macro if it makes things simpler.
BTW, regarding
> Why is `AD.jacobian` so central to AbstractDifferentiation.jl and why does it have to be generated via a macro? Can't it be implemented in a more generic way by making sure pullbacks and pushforward wrappers have consistent output types?
https://github.com/JuliaDiff/AbstractDifferentiation.jl/pull/95 trimmed down the macro; it can now only be used to implement the Jacobian based on a `pushforward_function` or a `value_and_pullback_function`. Support for ReverseDiff and FiniteDifferences is already implemented without the macro, and e.g. ForwardDiff uses the automatically constructed `jacobian` function only for functions with multiple arguments (the single-argument version just calls `ForwardDiff.jacobian`).
As I mentioned in https://github.com/JuliaDiff/AbstractDifferentiation.jl/issues/13#issuecomment-1994191014 and https://github.com/JuliaDiff/AbstractDifferentiation.jl/issues/123#issuecomment-1880412967, I am ok with removing the macro. It is currently a thin wrapper over a pushforward or pullback definition. Feel free to open a PR.
I was chatting with @adrhill and he suggested that the macro `@primitive` could be discarded if each backend simply implemented some methods from AbstractDifferentiation, mostly `jacobian` and a `pushforward` or `pullback`. Thoughts?