Hey all.
As it stands, calling AD.derivative for the FiniteDifferences and ForwardDiff backends first calculates the full Jacobian and then flattens it into the derivative. In some cases, say a single-input function, this is significantly slower:
using FiniteDifferences, BenchmarkTools
import AbstractDifferentiation as AD

fdm = central_fdm(2, 1; adapt=0)
fd = AD.FiniteDifferencesBackend(fdm)

with_AD(x) = AD.derivative(fd, sin, x)        # goes through the Jacobian machinery
without_AD(x) = fdm(sin, x)                   # calls the finite-difference method directly
blame_the_jacobian(x) = jacobian(fdm, sin, x) # the Jacobian call on its own

@benchmark with_AD(1.0)
@benchmark without_AD(1.0)
@benchmark blame_the_jacobian(1.0)
This is also the case for other, less contrived examples, such as small neural networks with a single input. What are the reasons for not implementing the derivative directly? Something along the lines of:
function AD.derivative(ba::AD.FiniteDifferencesBackend, f, xs...)
    return (ba.method(f, xs...),)
end
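A similar shortcut seems possible for the ForwardDiff backend when the input is a single scalar; a minimal sketch, assuming the backend type is AD.ForwardDiffBackend and that dispatching on a scalar input is acceptable (this is not the current implementation, just an illustration):

using ForwardDiff
import AbstractDifferentiation as AD

# Sketch only: for a single scalar input, call ForwardDiff.derivative
# directly instead of building and flattening a Jacobian.
function AD.derivative(::AD.ForwardDiffBackend, f, x::Number)
    return (ForwardDiff.derivative(f, x),)
end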