Closed. prbzrg closed this issue 8 months ago.
I'm not sure there's anything to be fixed here. In forward mode you usually get both the primal and the dual together, whereas in reverse mode you get the value and the pullback function (see, e.g., ForwardDiff or ChainRulesCore.frule vs. Zygote or ChainRulesCore.rrule).
IMO, having a backend-agnostic interface means we shouldn't rely on backend APIs or mirror them. Moreover, every other value_and_AD API in AbstractDifferentiation.jl returns the value and AD.
And why do we return the same value every time?
IMO the names are possibly confusing but not wrong per se: value_and_X_function can be read both as (value_and_X)_function and value_and_(X_function). For forward mode we use the former interpretation and for reverse mode the latter.
AFAICT this difference between forward and reverse mode is not based on a specific package but a conceptual difference: In forward mode, you can (and usually do) compute both the primal value and the partial derivatives in a single forward pass whereas in reverse mode you compute the primal value in the forward pass, construct the pullback functions as you go, and then compute the derivatives in the backward pass.
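To make the conceptual difference concrete, here is a minimal hand-rolled sketch in Python. It is not the API of AbstractDifferentiation.jl, ForwardDiff, or Zygote; the `Dual` class, the `*_dual` and `*_rev` helpers, and the example function f(x) = sin(x)^2 are all assumptions made for illustration. Forward mode carries the primal and the tangent through a single pass, while reverse mode runs a forward pass that returns the value and builds a pullback closure to be called later.

```python
import math
from dataclasses import dataclass


@dataclass
class Dual:
    """Forward mode: a primal value carried together with its tangent."""
    primal: float
    tangent: float


def sin_dual(d: Dual) -> Dual:
    # Primal and tangent are updated side by side in the same step.
    return Dual(math.sin(d.primal), math.cos(d.primal) * d.tangent)


def square_dual(d: Dual) -> Dual:
    return Dual(d.primal ** 2, 2.0 * d.primal * d.tangent)


def forward_mode(x: float, v: float) -> Dual:
    """f(x) = sin(x)^2: value and directional derivative from one pass."""
    return square_dual(sin_dual(Dual(x, v)))


def sin_rev(x: float):
    # Reverse-mode rule: return the value and a pullback for this step.
    return math.sin(x), lambda ybar: math.cos(x) * ybar


def square_rev(x: float):
    return x ** 2, lambda ybar: 2.0 * x * ybar


def reverse_mode(x: float):
    """Forward pass computes the value and collects pullbacks as it goes."""
    s, pb_sin = sin_rev(x)
    y, pb_square = square_rev(s)

    def pullback(ybar: float) -> float:
        # Backward pass: apply the collected pullbacks in reverse order.
        return pb_sin(pb_square(ybar))

    return y, pullback
```

In this sketch `forward_mode(x, v)` needs the tangent `v` up front and hands back value and derivative together, whereas `reverse_mode(x)` returns the value immediately and defers the derivative until `pullback` is called with a cotangent.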
Thanks for your response.
The docstring is misleading then, see #124
That might be my fault from the recent docstring overhaul
value_and_pushforward_function returns a function that returns (value, pf):
https://github.com/JuliaDiff/AbstractDifferentiation.jl/blob/211b67528c5ed91971bb524c57adb63837163367/src/AbstractDifferentiation.jl#L259
but value_and_pullback_function returns (value, pullback_function):
https://github.com/JuliaDiff/AbstractDifferentiation.jl/blob/211b67528c5ed91971bb524c57adb63837163367/src/AbstractDifferentiation.jl#L319
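The two return shapes can be sketched side by side. The following Python snippet only borrows the two function names from the thread; the signatures are hypothetical (scalar input, with a hand-written derivative `df` passed in place of an AD backend) and do not reproduce the Julia implementation linked above.

```python
import math


def f(x: float) -> float:
    return math.sin(x)


def df(x: float) -> float:
    # Hand-written derivative standing in for an AD backend (assumption).
    return math.cos(x)


def value_and_pushforward_function(f, df, x):
    """Read as (value_and_pushforward)_function: returns one function."""
    def value_and_pushforward(v):
        # The value is recomputed and returned on every call.
        return f(x), df(x) * v
    return value_and_pushforward


def value_and_pullback_function(f, df, x):
    """Read as value_and_(pullback_function): returns a pair."""
    value = f(x)

    def pullback(ybar):
        return df(x) * ybar

    return value, pullback
```

With these definitions, `value_and_pushforward_function(f, df, x)` gives a callable whose every call returns the same value alongside a different pushforward, while `value_and_pullback_function(f, df, x)` returns the value once together with a pullback that returns only derivatives.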