JuliaDiff / ReverseDiff.jl

Reverse Mode Automatic Differentiation for Julia

Incorrect results when nesting ReverseDiff inside ForwardDiff #168

Open ikirill opened 3 years ago

ikirill commented 3 years ago

I'm not sure if this is even supported, but I think it should throw an error instead of returning zero gradients.

The first example is an old bug (#45), the second shows that nesting ReverseDiff inside ForwardDiff returns zero, and the third shows that nesting the other way around (ForwardDiff inside ReverseDiff) gives the correct result.

julia> let
           D(f, x) = ReverseDiff.gradient(x->f(x[1]), [x])[1]
           u1 = D(x -> x * D(y -> x * y, 3), 5) # 3
           u2 = D(x -> x * D(y -> y * x, 3), 5) # 5
           @show u1, u2
       end
(u1, u2) = (3, 5)
(3, 5)

julia> let
           Dr(f, x) = ReverseDiff.gradient(x->f(x[1]), [x])[1]
           Df(f, x) = ForwardDiff.gradient(x->f(x[1]), [x])[1]
           u1 = Df(x -> x * Dr(y -> x * y, 3), 5) 
           u2 = Df(x -> x * Dr(y -> y * x, 3), 5) 
           @show u1, u2
       end
(u1, u2) = (0, 0)
(0, 0)

julia> let
           Dr(f, x) = ReverseDiff.gradient(x->f(x[1]), [x])[1]
           Df(f, x) = ForwardDiff.gradient(x->f(x[1]), [x])[1]
           u1 = Dr(x -> x * Df(y -> x * y, 3), 5) 
           u2 = Dr(x -> x * Df(y -> y * x, 3), 5) 
           @show u1, u2
       end
(u1, u2) = (10, 10)
(10, 10)
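
For reference, all three examples differentiate the same function: the inner derivative of `x * y` with respect to `y` is `x`, so the outer function is `x^2` and its derivative at `x = 5` is 10. A minimal sketch that checks this without either AD package, using nested central finite differences (the helper `fd` is hypothetical and not part of ForwardDiff or ReverseDiff):

```julia
# Hypothetical helper: central finite difference, not part of either AD package.
fd(f, x; h=1e-4) = (f(x + h) - f(x - h)) / (2h)

# Same nested expressions as in the examples above, with fd in place of D.
u1 = fd(x -> x * fd(y -> x * y, 3.0), 5.0)
u2 = fd(x -> x * fd(y -> y * x, 3.0), 5.0)

# Both should be close to 10, up to finite-difference error.
@show u1, u2
```

Only the third example (ForwardDiff nested inside ReverseDiff) agrees with this value.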

(test3) pkg> status
Status `~/Sandboxes/Julia-Misc/test3/Project.toml`
  [f6369f11] ForwardDiff v0.10.16 `~/.julia/dev/ForwardDiff`
  [37e2e3b7] ReverseDiff v1.5.0 `~/.julia/dev/ReverseDiff`

julia> versioninfo()
Julia Version 1.5.3
Commit 788b2c77c1 (2020-11-09 13:37 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: AMD Ryzen 7 3700X 8-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, znver2)

yangky11 commented 3 years ago

I have a similar issue. This is my test case:

julia> ForwardDiff.derivative(x -> x * ForwardDiff.derivative(y -> x + y, 1), 1)   # OK
1

julia> ReverseDiff.gradient(x -> x[1] * ReverseDiff.gradient(y -> x[1] + y[1], [1]), [1])   # OK
1-element Vector{Int64}:
 1

julia> ReverseDiff.gradient(x -> x[1] * ForwardDiff.derivative(y -> x[1] + y, 1), [1])   # OK
1-element Vector{Int64}:
 1

julia> ForwardDiff.derivative(x -> x * ReverseDiff.gradient(y -> x + y[1], [1]), 1)   # BUG
1-element Vector{Int64}:
 0
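
The expected value in all four cases is 1: the inner derivative of `x + y` with respect to `y` is 1, so the outer function is just `x`. A rough sketch of the same check with nested central finite differences, assuming a hypothetical helper `fd` that is not part of either package:

```julia
# Hypothetical helper: central finite difference, not part of either AD package.
fd(f, x; h=1e-4) = (f(x + h) - f(x - h)) / (2h)

# Same nested expression as in the test case above, with fd for both derivatives.
d = fd(x -> x * fd(y -> x + y, 1.0), 1.0)

# Should be close to 1, up to finite-difference error.
@show d
```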