JuliaDiff / ForwardDiff.jl

Forward Mode Automatic Differentiation for Julia

confused by this root finding example #595

Open tcovert opened 2 years ago

tcovert commented 2 years ago

I have some code that calls a root-finding step, via the Roots package, that I'd like to be able to differentiate. Whether ForwardDiff "works" in this setting appears to depend on some subtlety that I don't quite understand.

For an MWE, consider finding the (unique) root of the cubic function g0(x, k) = (x-k)^3, which has a root at x = k. Let g1(k) = find_zero(x -> g0(x, k), [-10.0, +10.0]) and similarly let g2(k) = find_zero(x -> g0(x, k), [-1.0 * k, +1.0 * k]). Note that for values of k between -10 and +10, g1(k) == g2(k). In this range, the only difference between g1 and g2 is that the root-finding step in g2 is "aware" of the value of k, via the bounds I have provided, whereas in g1 it is ignorant of it. For values of k in this range, the derivatives of g1 and g2 should be identical and equal to 1.0. However, ForwardDiff reports that the derivative of g1 is zero, while it finds the derivative of g2 to be 1.0:

using Roots, ForwardDiff, FiniteDiff
g0(x,k) = (x-k)^3
g1(k) = find_zero(x -> g0(x,k), [-10.0, +10.0])
g2(k) = find_zero(x -> g0(x,k), [-1.0 * k, +1.0 * k])

Julia reports that the two functions return the same values and vary with the input in the expected way:

julia> g1(3)
3.0

julia> g2(3)
3.0

julia> g1(4)
4.0

julia> g2(4)
4.0

However, the ForwardDiff derivatives are different:

julia> ForwardDiff.derivative(g1, 3.0)
0.0

julia> ForwardDiff.derivative(g2, 3.0)
1.0

Note that FiniteDiff works fine here (up to numerical approximation):

julia> FiniteDiff.finite_difference_derivative(g1, 3.0)
0.9999999999991082

julia> FiniteDiff.finite_difference_derivative(g2, 3.0)
0.9999999999991082

Thanks in advance for any suggestions the ForwardDiff team has in understanding this difference.

devmotion commented 2 years ago

Maybe that's actually a problem with Roots and the same problem as described in https://github.com/JuliaMath/Roots.jl/issues/314 (ForwardDiff derivative depends on the number of iterations)?
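
One way a discrepancy like this can arise is sketched below. It assumes a hand-written bisection routine: the `bisect`, `h1` and `h2` names are hypothetical and this is not the algorithm Roots uses internally, so treat it as an illustration rather than a diagnosis. When the bracket consists of plain Float64 values, every midpoint, and therefore the returned root, is also a plain Float64 that carries no derivative information about k. When the bracket is built from k, the returned value is a dual number derived from k.

```julia
using ForwardDiff

g0(x, k) = (x - k)^3

# Hypothetical hand-written bisection, for illustration only.
# This is NOT the implementation used by Roots.find_zero.
function bisect(f, a, b; iters = 200)
    fa, fb = f(a), f(b)
    # Compare primal values only, so the check behaves the same for
    # Float64 inputs and for ForwardDiff dual numbers.
    ForwardDiff.value(fa) == 0 && return a
    ForwardDiff.value(fb) == 0 && return b
    for _ in 1:iters
        m = (a + b) / 2
        fm = f(m)
        ForwardDiff.value(fm) == 0 && return m
        if (fm > 0) == (fa > 0)
            a, fa = m, fm   # root lies in [m, b]
        else
            b = m           # root lies in [a, m]
        end
    end
    return (a + b) / 2
end

# Bracket does not depend on k: the result is always a plain Float64.
h1(k) = bisect(x -> g0(x, k), -10.0, 10.0)

# Bracket is built from k: the result inherits k's dual part.
h2(k) = bisect(x -> g0(x, k), -k, k)

d1 = ForwardDiff.derivative(h1, 3.0)   # 0.0
d2 = ForwardDiff.derivative(h2, 3.0)   # 1.0
```

In this sketch, `h2` returns early because the upper bracket endpoint `k` is itself the root, so the dual number for `k` is passed straight through. `h1` never returns anything that depends on `k` in a way ForwardDiff can see, even though the primal value converges to `k`.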