SciML / DelayDiffEq.jl

Delay differential equation (DDE) solvers in Julia for the SciML scientific machine learning ecosystem. Covers neutral and retarded delay differential equations, and differential-algebraic equations.

Incorrect jacobian with Zygote + ReverseDiffAdjoint #238

Open devmotion opened 2 years ago

devmotion commented 2 years ago

The following example shows that, as in the example in the tests, ForwardDiff and finite differencing (FiniteDifferences here, FiniteDiff in the tests) produce similar gradients and Jacobians. However, the Jacobian computed by Zygote differs significantly:

using DelayDiffEq
using SciMLSensitivity
using ForwardDiff
using FiniteDifferences
using Zygote

function sensitivity()
    # Define the same LV equation, but including a delay parameter
    function delay_lotka_volterra!(du, u, h, p, t)
        x, y = u
        α, β, δ, γ = p
        du[1] = dx = (α - β * y) * h(p, t - 0.1)[1]
        du[2] = dy = (δ * x - γ) * y
        nothing
    end

    # Initial parameters
    p = [2.2, 1.0, 2.0, 0.4]

    # Define the history function: the state is [1.0, 1.0] for all t <= 0
    # (only the first component is used by the delayed term)
    h(p, t) = ones(2)

    # Initial conditions
    u0 = [1.0, 1.0]

    # Define the problem as a delay differential equation
    prob_dde = DDEProblem(delay_lotka_volterra!, u0, h, (0.0, 1.0),
                          constant_lags = (0.1,))

    function predict_dde(p)
        return Array(solve(prob_dde, MethodOfSteps(Tsit5()),
                           p = p, saveat = 0.1,
                           sensealg = ReverseDiffAdjoint()))
    end

    J1 = ForwardDiff.jacobian(predict_dde, p)
    J2 = Zygote.jacobian(predict_dde, p)[1]
    J3 = FiniteDifferences.jacobian(FiniteDifferences.central_fdm(5, 1), predict_dde, p)[1]

    @show isapprox(J1, J2; rtol=1e-3)
    @show isapprox(J1, J3; rtol=1e-3)
    @show isapprox(J2, J3; rtol=1e-3)

    @show J1[4, 1]
    @show J2[4, 1]
    @show J3[4, 1]

    return J1, J2, J3
end
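
As a reading aid for the `[4, 1]` entries printed by the function above: assuming `Array(sol)` is a 2 × 11 matrix (two states, saved at t = 0.0, 0.1, ..., 1.0) and that the Jacobian rows follow its column-major flattening, row 4 corresponds to the second state at t = 0.1, and column 1 to the first parameter. A small sketch of that index arithmetic in Base Julia (the names `n_states`, `ts`, and `idx` are illustrative and not part of the report):

```julia
# Assumed shapes, taken from the example: 2 state components and
# saveat = 0.1 on the time span (0.0, 1.0), including the initial point.
n_states = 2
ts = 0.0:0.1:1.0                      # 11 save points

# Array(sol) would be n_states × length(ts); Jacobian rows are assumed to
# follow the column-major (vec) ordering of that matrix.
idx = CartesianIndices((n_states, length(ts)))

row = 4
component, k = Tuple(idx[row])        # (state index, save-point index)
println("row $row -> state $component at t = $(ts[k])")
```

Under these assumptions, `J[4, 1]` is the sensitivity of `y` at t = 0.1 with respect to `α`.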

Output:

julia> sensitivity();
isapprox(J1, J2; rtol = 0.001) = false
isapprox(J1, J3; rtol = 0.001) = true
isapprox(J2, J3; rtol = 0.001) = false
J1[4, 1] = 0.011848062676608925
J2[4, 1] = 0.111469841895221
J3[4, 1] = 0.011848057817131118

ChrisRackauckas commented 2 years ago

Run it at low tolerances?

devmotion commented 2 years ago

Using e.g. abstol=1e-14, reltol=1e-14 doesn't seem to help; the output is the same. (Naively, I would also assume that the tolerance shouldn't affect the approximation of the value at t = 0.1 too much.)
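
A minimal sketch of how the tolerances mentioned above can be passed to `solve`. It only compares the primal solutions at default and tight tolerances, not the Jacobians, and is meant as an illustration rather than the exact script that was run. Unlike the original example, the parameters are stored in the problem here, and the names `sol_default`, `sol_tight`, and `max_diff` are made up for this sketch:

```julia
using DelayDiffEq

# Same delayed Lotka-Volterra right-hand side as in the report.
function delay_lotka_volterra!(du, u, h, p, t)
    x, y = u
    α, β, δ, γ = p
    du[1] = (α - β * y) * h(p, t - 0.1)[1]
    du[2] = (δ * x - γ) * y
    return nothing
end

h(p, t) = ones(2)                      # constant history for t <= 0
p = [2.2, 1.0, 2.0, 0.4]
prob = DDEProblem(delay_lotka_volterra!, [1.0, 1.0], h, (0.0, 1.0), p;
                  constant_lags = (0.1,))
alg = MethodOfSteps(Tsit5())

# Solve once with the default tolerances and once with tight ones.
sol_default = solve(prob, alg; saveat = 0.1)
sol_tight = solve(prob, alg; saveat = 0.1, abstol = 1e-14, reltol = 1e-14)

# Largest difference between the two saved trajectories.
max_diff = maximum(abs, Array(sol_default) .- Array(sol_tight))
@show max_diff
```

If `max_diff` is small, that supports the expectation that solver tolerance alone is unlikely to explain a factor-of-ten discrepancy in a Jacobian entry.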

devmotion commented 2 years ago

The problem seems to be ReverseDiffAdjoint. Without it I get:

julia> sensitivity();
isapprox(J1, J2; rtol = 0.001) = true
isapprox(J1, J3; rtol = 0.001) = true
isapprox(J2, J3; rtol = 0.001) = true
J1[4, 1] = 0.011848062676608925
J2[4, 1] = 0.011848062676608925
J3[4, 1] = 0.011848057817131118

ChrisRackauckas commented 2 years ago

Without an explicit sensealg, this will fall back to ForwardDiffSensitivity, which isn't exactly the same as ForwardDiff on the solver but is fairly close. So yeah, it sounds like ReverseDiff is losing something somewhere, maybe due to aliasing.
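
To make the fallback explicit, one could request `ForwardDiffSensitivity()` directly and compare against ForwardDiff through the solver. The following is a sketch under the assumption that the installed SciMLSensitivity version accepts this `sensealg` for a `DDEProblem`, as the comment above suggests. The function name `predict_fds` and the variables `J_forward` and `J_zygote` are illustrative:

```julia
using DelayDiffEq, SciMLSensitivity, ForwardDiff, Zygote

# Same delayed Lotka-Volterra right-hand side as in the report.
function delay_lotka_volterra!(du, u, h, p, t)
    x, y = u
    α, β, δ, γ = p
    du[1] = (α - β * y) * h(p, t - 0.1)[1]
    du[2] = (δ * x - γ) * y
    return nothing
end

h(p, t) = ones(2)                      # constant history for t <= 0
p = [2.2, 1.0, 2.0, 0.4]
prob = DDEProblem(delay_lotka_volterra!, [1.0, 1.0], h, (0.0, 1.0);
                  constant_lags = (0.1,))

# Same prediction function as in the report, but with the sensitivity
# algorithm set explicitly instead of relying on the default choice.
function predict_fds(p)
    return Array(solve(prob, MethodOfSteps(Tsit5());
                       p = p, saveat = 0.1,
                       sensealg = ForwardDiffSensitivity()))
end

J_forward = ForwardDiff.jacobian(predict_fds, p)
J_zygote = Zygote.jacobian(predict_fds, p)[1]

@show isapprox(J_forward, J_zygote; rtol = 1e-3)
@show J_forward[4, 1]
@show J_zygote[4, 1]
```

If the two Jacobians agree here but not with `ReverseDiffAdjoint()`, that would point at the ReverseDiff path rather than at Zygote or the DDE solver itself.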