FluxML / Zygote.jl

21st century AD
https://fluxml.ai/Zygote.jl/

second-order gradient throws foreigncall error #1271

Open baedan opened 2 years ago

baedan commented 2 years ago

Zygote v0.6.41, Julia 1.7.3. MWE:

```julia
using Zygote

α, β = randn(2, 2), randn(2, 2)

g(v) = map(eachcol(v), eachcol(β)) do x, y
    sum(x.*x.*y)
end |> sum

# this fails
gradient(α) do k
    sum(gradient(g, k)[1])
end

# this works?
gradient(α) do k
    sum(gradient(k) do v
        map(eachcol(v), eachcol(β)) do x, y
            sum(x.*x.*y)
        end |> sum
    end[1])
end
```
error

  ERROR: Can't differentiate foreigncall expression.
  You might want to check the Zygote limitations documentation.
  https://fluxml.ai/Zygote.jl/dev/limitations.html

  Stacktrace:
    [1] error(s::String)
      @ Base ./error.jl:33
    [2] Pullback
      @ ./iddict.jl:102 [inlined]
    [3] (::typeof(∂(get)))(Δ::Nothing)
      @ Zygote ~/.julia/packages/Zygote/IoW2g/src/compiler/interface2.jl:0
    [4] Pullback
      @ ~/.julia/packages/Zygote/IoW2g/src/lib/lib.jl:68 [inlined]
    [5] (::typeof(∂(accum_global)))(Δ::Nothing)
      @ Zygote ~/.julia/packages/Zygote/IoW2g/src/compiler/interface2.jl:0
    [6] Pullback
      @ ~/.julia/packages/Zygote/IoW2g/src/lib/lib.jl:79 [inlined]
    [7] (::typeof(∂(λ)))(Δ::Nothing)
      @ Zygote ~/.julia/packages/Zygote/IoW2g/src/compiler/interface2.jl:0
    [8] Pullback
      @ ~/.julia/packages/ZygoteRules/AIbCs/src/adjoint.jl:67 [inlined]
    [9] (::typeof(∂(λ)))(Δ::Nothing)
      @ Zygote ~/.julia/packages/Zygote/IoW2g/src/compiler/interface2.jl:0
   [10] Pullback
      @ ./REPL[6]:1 [inlined]
   [11] (::typeof(∂(λ)))(Δ::Tuple{Nothing, FillArrays.Fill{Float64, 2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}})
      @ Zygote ~/.julia/packages/Zygote/IoW2g/src/compiler/interface2.jl:0
   [12] Pullback
      @ ~/.julia/packages/Zygote/IoW2g/src/compiler/interface.jl:41 [inlined]
   [13] (::typeof(∂(λ)))(Δ::Tuple{FillArrays.Fill{Float64, 2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}})
      @ Zygote ~/.julia/packages/Zygote/IoW2g/src/compiler/interface2.jl:0
   [14] Pullback
      @ ~/.julia/packages/Zygote/IoW2g/src/compiler/interface.jl:76 [inlined]
   [15] (::typeof(∂(gradient)))(Δ::Tuple{FillArrays.Fill{Float64, 2, Tuple{Base.OneTo{Int64}, Base.OneTo{Int64}}}})
      @ Zygote ~/.julia/packages/Zygote/IoW2g/src/compiler/interface2.jl:0
   [16] Pullback
      @ ./REPL[7]:2 [inlined]
   [17] (::Zygote.var"#60#61"{typeof(∂(#3))})(Δ::Float64)
      @ Zygote ~/.julia/packages/Zygote/IoW2g/src/compiler/interface.jl:41
   [18] gradient(f::Function, args::Matrix{Float64})
      @ Zygote ~/.julia/packages/Zygote/IoW2g/src/compiler/interface.jl:76
   [19] top-level scope
      @ REPL[7]:1
  

somehow this only works with an anonymous function. am i missing something obvious?

mcabbott commented 2 years ago

The first example works for me if I put it in a let block -- global variables are weird. But the second never seems to. Zygote v0.6.41, Julia nightly.

```julia
julia> let α, β = randn(2, 2), randn(2, 2)
           g(v) = map(eachcol(v), eachcol(β)) do x, y
               sum(x.*x.*y)
           end |> sum

           # this fails
           gradient(α) do k
               sum(gradient(g, k)[1])
           end
       end
([-0.8996823616803952 3.0850708182728805; -1.4354663842221136 0.17587965044232523],)

julia> let
           gradient(α) do k
               sum(gradient(k) do v
                   map(eachcol(v), eachcol(β)) do x, y
                       sum(x.*x.*y)
                   end |> sum
               end[1])
           end
       end
ERROR: Can't differentiate foreigncall expression.

(@v1.9) pkg> st Zygote ChainRules ChainRulesCore
Status `~/.julia/environments/v1.9/Project.toml`
⌃ [082447d4] ChainRules v1.39.0
  [d360d2e6] ChainRulesCore v1.15.3
  [e88e6eb3] Zygote v0.6.41
Info Packages marked with ⌃ have new versions available

julia> versioninfo()
Julia Version 1.9.0-DEV.985
Commit aabd819ae6* (2022-07-13 22:44 UTC)
Platform Info:
  OS: macOS (arm64-apple-darwin21.1.0)
  CPU: 8 × Apple M1
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.5 (ORCJIT, apple-m1)
  Threads: 4 on 4 virtual cores
Environment:
```

baedan commented 2 years ago

oh, that's super weird. i would've guessed having it in a let block would only make it more likely to work.
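
For checking whichever variant does run: since g(v) is the sum of v.^2 .* β, its gradient is 2 .* v .* β, and the gradient of sum(∇g(k)) with respect to k should therefore be 2 .* β. Below is a minimal sketch that confirms this reference value with central finite differences in plain Julia, without Zygote. The helper `fd_gradient`, the explicit `β` argument to `g`, and the fixed matrix values are assumptions made for illustration and do not come from the thread.

```julia
# Fixed illustrative inputs (assumed values, not taken from the issue).
α = [1.0 2.0; 3.0 4.0]
β = [0.5 -1.0; 2.0 0.25]

# Same computation as g in the MWE, but with β passed explicitly
# instead of being read from a global variable.
g(v, β) = sum(map((x, y) -> sum(x .* x .* y), eachcol(v), eachcol(β)))

# Hypothetical helper: central finite-difference gradient of a scalar
# function f at x. Because g is quadratic, central differences have no
# truncation error here, so a fairly large step h keeps roundoff small.
function fd_gradient(f, x; h = 1e-3)
    grad = similar(x, Float64)
    for i in eachindex(x)
        xp = copy(x); xp[i] += h
        xm = copy(x); xm[i] -= h
        grad[i] = (f(xp) - f(xm)) / (2h)
    end
    return grad
end

inner(k) = sum(fd_gradient(v -> g(v, β), k))  # k -> sum(∇g(k))
outer = fd_gradient(inner, α)                 # numerical second-order gradient

# The analytic reference is 2 .* β, independent of α.
@assert isapprox(outer, 2 .* β; atol = 1e-6)
```

A Zygote result for the nested `gradient` call can be compared against `2 .* β` in the same way.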