JuliaDiff / FiniteDifferences.jl

High accuracy derivatives, estimated via numerical finite differences (formerly FDM.jl)
MIT License

Error when trying to further differentiate using Zygote #176

Open btx0424 opened 3 years ago

btx0424 commented 3 years ago

Summary: It seems that a finite difference estimate can't be further differentiated by Zygote.

Example: If I do

dif(x) = central_fdm(2, 1)(sin, x)

and then call

dif'(0)

I get the following error message:

setindex!(::StaticArrays.SVector{3, Float64}, value, ::Int) is not defined.

error(::String)@error.jl:33
setindex!(::StaticArrays.SVector{3, Float64}, ::Float64, ::Int64)@indexing.jl:3
#637@array.jl:318[inlined]
#2710#back@adjoint.jl:59[inlined]
Pullback@methods.jl:385[inlined]
(::typeof(∂(_estimate_magnitudes)))(::Tuple{Float64, Nothing})@interface2.jl:0
Pullback@methods.jl:365[inlined]
(::typeof(∂(estimate_step)))(::Tuple{Float64, Nothing})@interface2.jl:0
Pullback@methods.jl:193[inlined]
(::typeof(∂(λ)))(::Float64)@interface2.jl:0
Pullback@Other: 1[inlined]
(::typeof(∂(dif)))(::Float64)@interface2.jl:0
(::Zygote.var"#41#42"{typeof(∂(dif))})(::Float64)@interface.jl:41
gradient(::Function, ::Int64)@interface.jl:59
(::Zygote.var"#43#44"{typeof(Main.workspace50.dif)})(::Int64)@interface.jl:62
top-level scope@Local: 1[inlined]

The actual use case is that I'm trying to train a NN, where I need the gradient of a loss function w.r.t. the network's parameters, and the loss involves finite differences computed by central_fdm. Is this possible?

simeonschaub commented 3 years ago

Can you just use it the other way, i.e. FiniteDifferences over Zygote instead of Zygote over FiniteDifferences? You could actually define a custom adjoint for FiniteDifferenceMethod which does this automatically for cases like this.
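
For the scalar example in this issue, the "FiniteDifferences over Zygote" ordering might look roughly like the sketch below. The names dsin and d2sin are hypothetical helpers chosen for illustration; only central_fdm and Zygote.gradient come from the packages.

```julia
using FiniteDifferences, Zygote

# Inner derivative via AD: Zygote differentiates sin directly.
dsin(x) = Zygote.gradient(sin, x)[1]

# Outer derivative via finite differences of the AD derivative.
# Zygote never has to trace through the step size estimation here.
fdm = central_fdm(5, 1)
d2sin(x) = fdm(dsin, x)

d2sin(1.0)  # expected to be close to -sin(1.0)
```

Because the finite difference method only evaluates dsin at a handful of points, step size adaptation can stay enabled in this ordering.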

wesselb commented 3 years ago

Step size adaptation does not appear to work well with Zygote. Unfortunately, I'm not familiar enough with the inner workings of Zygote to see what precisely breaks down. Perhaps @willtebbutt or @oxinabox could provide some insight. From the error message, it appears that a pullback attempts to modify a StaticArray in-place, which won't work.
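
To illustrate the last point in isolation (this is only a sketch of the StaticArrays behaviour, not of what Zygote does internally): an SVector is immutable, so an in-place write fails, whereas the mutable MVector accepts it.

```julia
using StaticArrays

v = SVector(1.0, 2.0, 3.0)

# In-place assignment on an immutable SVector throws an error.
svector_write_failed = try
    v[1] = 0.0
    false
catch err
    true
end

# The mutable counterpart MVector supports setindex!.
m = MVector(1.0, 2.0, 3.0)
m[1] = 0.0
```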

Once you turn off step size adaptation, things seem to work:

julia> fdm = central_fdm(5, 1; adapt=0);

julia> cos_(x) = fdm(sin, x)
cos_ (generic function with 1 method)

julia> cos_'(1)
-0.8414709848079269

julia> -sin(1)
-0.8414709848078965

I would be careful with AD-ing through finite difference estimates, though. I second @simeonschaub's suggestion of taking finite differences of gradients computed by AD.
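
As a rough illustration of why some care is needed, take the simplest second-order central difference with step size $h$ (an assumption made for readability; the methods above use more grid points):

```latex
D_h f(x) = \frac{f(x+h) - f(x-h)}{2h} = f'(x) + \frac{h^2}{6} f'''(x) + O(h^4).
```

If $h$ is held fixed, differentiating this estimate with respect to $x$ simply gives a finite difference estimate of $f''$:

```latex
\frac{d}{dx} D_h f(x) = \frac{f'(x+h) - f'(x-h)}{2h} = f''(x) + \frac{h^2}{6} f''''(x) + O(h^4).
```

If instead the step is adapted to the input, $h = h(x)$, the chain rule adds a term that comes from the step size heuristic rather than from $f$:

```latex
\frac{d}{dx} D_{h(x)} f(x) = \frac{f'(x+h) - f'(x-h)}{2h} + h'(x) \, \frac{\partial}{\partial h} D_h f(x).
```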

EDIT: If we wanted to make this work, we could do something like

using FiniteDifferences, Zygote

function (m::FiniteDifferences.AdaptedFiniteDifferenceMethod)(f::TF, x::Real) where TF<:Function
    x = float(x)  # Assume that converting to float is desired, if it isn't already.
    # Estimate the step as usual, but drop its gradient so that Zygote treats it as a constant.
    step = Zygote.dropgrad(first(FiniteDifferences.estimate_step(m, f, x)))
    return m(f, x, step)
end

Then

julia> fdm = central_fdm(5, 1; adapt=1);

julia> cos_(x) = fdm(sin, x)
cos_ (generic function with 1 method)

julia> cos_'(1)
-0.8414709848078701

julia> -sin(1)
-0.8414709848078965