JuliaDiff / AbstractDifferentiation.jl

An abstract interface for automatic differentiation.
https://juliadiff.org/AbstractDifferentiation.jl/
MIT License
135 stars 18 forks source link

Use `Zygote.jacobian` etc. #128

Closed devmotion closed 7 months ago

devmotion commented 7 months ago

Zygote already has an interface for pullbacks, jacobians, etc., hence IMO we should use it in AD.jacobian etc. as I guess users (arguably correctly) expect that AD.jacobian performs the same computations and is as efficient as Zygote.jacobian. That is, I think AbstractDifferentiation should not treat Zygote as "just" an example of ChainRules AD backend but rather as a separate entity as - I think - we should not redefine jacobian etc. for Zygote based on the ChainRules pullback with Zygote.ZygoteRuleConfig.

Fixes #54.

codecov[bot] commented 7 months ago

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (afec712) 82.32% compared to head (202c86f) 82.54%.

Files Patch % Lines
ext/AbstractDifferentiationZygoteExt.jl 90.90% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #128 +/- ## ========================================== + Coverage 82.32% 82.54% +0.22% ========================================== Files 8 8 Lines 413 424 +11 ========================================== + Hits 340 350 +10 - Misses 73 74 +1 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

mohamed82008 commented 7 months ago

IIRC, one fundamental difference is that Zygote.jacobian preallocates the Jacobian matrix. This is not good for higher order derivatives with reverse mode because it is mutating.

devmotion commented 7 months ago

My feeling is that this is a problem that should be addressed in Zygote (possibly they should use the default implementation in AbstractDifferentiation based on value_and_pullback_function but that's their decision). Based on the number of open (and closed) issues about AbstractDifferentiation taking different code paths, and how it leads to sometimes different behaviour and sometimes different (typically worse) performance, I think if a function such as jacobian, hessian etc. exists in an AD backend it should be used by AbstractDifferentiation.

gdalle commented 7 months ago

I agree with this take, and I push it further here: https://github.com/JuliaDiff/AbstractDifferentiation.jl/issues/129

mohamed82008 commented 7 months ago

Zygote.hessian is arguably even more questionable because it actually uses ForwardDiff for the second derivative.

devmotion commented 7 months ago

My view is the same there as well: If Zygote.hessian is broken/bad, an issue should be opened in Zygote and possibly they might want to switch to the hessian definition for pullbacks in AbstractDifferentiation. But we should not fix AD backends in AbstractDifferentiation.

devmotion commented 7 months ago

Bump 🙂

For other AD backends such as ForwardDiff, ReverseDiff, and Tracker actually we already use jacobian, hessian, etc. in downstream packages when applicable.