EnzymeAD / Enzyme.jl

Julia bindings for the Enzyme automatic differentiator
https://enzyme.mit.edu
MIT License

Enzyme Crashing, MWE. #1653

Open dleather opened 1 month ago

dleather commented 1 month ago

I'm running Enzyme 0.12.23, EnzymeCore 0.7.7, LinearAlgebra 1.5.0, and SparseArrays 1.10.0 on Julia 1.10.4 on a Windows machine. Enzyme crashes without fail on the MWE below, whereas Zygote differentiates the same function quickly. I've already tried replacing all calls to I() with their explicit matrix form, as well as making Λ non-sparse.

using Enzyme, SparseArrays, LinearAlgebra

function compute_L1(Σ::Matrix{T}, Λ::AbstractSparseMatrix{T}) where T <: Real
    # LHS coefficient from Proposition 3.2
    #   L₁ = [Σ ⊗ (Iₙ² + Λₙ)] ⊗ [vec(Iₙ) ⊗ Iₙ]
    N = size(Σ, 1)
    return kron(Σ, I + Λ) * kron(vec(I(N)), I(N))
end

Λ =  spzeros(4, 4)
Λ[1, 1] = 1.0
Λ[3, 2] = 1.0
Λ[2, 3] = 1.0
Λ[4, 4] = 1.0

Δt = 0.25

function f(θ)
    σ_z = θ[1]
    θ_z = θ[2]
    Ω = [sqrt(((σ_z^2)/(2.0 * θ_z))*(1-exp(-2*θ_z*Δt))) 0.0; 0.0 0.0]
    Σ = Ω * Ω'
    L1 = compute_L1(Σ, Λ)
    return L1[1]
end

θ = [1.0, 0.5]
dθ = zero(θ)  # zero-initialized shadow; Enzyme accumulates the gradient into it
f(θ)
Enzyme.autodiff(Reverse, f, Active, Duplicated(θ, dθ))

wsmoses commented 1 month ago

How does Enzyme crash, can you post the log?

wsmoses commented 1 month ago

Also, FYI: Λ and Δt are non-const globals, so every use of them inside f is type unstable. Even without an error, that will make both your original code and its derivatives slow. You could pass them in as arguments to f and/or mark the globals as const.
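To illustrate the point about globals, here is a minimal sketch (not from the thread) comparing what Julia infers for a function that reads a non-const global, a const global, or an argument. The names Δt_global, Δt_const, g_unstable, g_const, and g_arg are hypothetical and chosen only for this example.

```julia
# Hypothetical names for illustration; none of these appear in the MWE.
Δt_global = 0.25        # non-const global: its type could change at any time
const Δt_const = 0.25   # const global: the compiler can rely on Float64

g_unstable(x) = x * Δt_global   # reads the non-const global
g_const(x)    = x * Δt_const    # reads the const global
g_arg(x, Δt)  = x * Δt          # takes the value as an argument

# Base.return_types lists the inferred return type of each matching method.
println(only(Base.return_types(g_unstable, (Float64,))))        # Any
println(only(Base.return_types(g_const, (Float64,))))           # Float64
println(only(Base.return_types(g_arg, (Float64, Float64))))     # Float64
```

An inferred return type of Any is the symptom of the instability; either of the other two forms lets the compiler (and Enzyme) see concrete types.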

dleather commented 1 month ago

How do I see the log? The REPL crashes and the only message I see is:

The terminal process "C:\Users\davle\AppData\Local\Programs\Julia-1.10.4\bin\julia.exe '-i', '--banner=no', '--project=C:\Users\davle.julia\environments\v1.10', 'c:\Users\davle.vscode\extensions\julialang.language-julia-1.83.2\scripts\terminalserver\terminalserver.jl', '\.\pipe\vsc-jl-repl-7953c62c-ff44-4ad5-9cc4-a14f065b6128', '\.\pipe\vsc-jl-cr-0aea7085-fd31-40a3-81a3-9f8306afd45b', 'USE_REVISE=true', 'USE_PLOTPANE=true', 'USE_PROGRESS=true', 'ENABLE_SHELL_INTEGRATION=true', 'DEBUG_MODE=false'" terminated with exit code: -1073741571.
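As a side note, the exit code in that message can be decoded. The sketch below assumes the value is a Windows NTSTATUS code reported as a signed 32-bit integer; under that assumption it reinterprets the bits as unsigned and prints them in hexadecimal. The variable names are illustrative only.

```julia
# Assumption: the exit code is a Windows NTSTATUS value shown as a signed Int32.
exit_code = Int32(-1073741571)

# Reinterpret the same 32 bits as an unsigned integer.
ntstatus = reinterpret(UInt32, exit_code)

# Render as hexadecimal for comparison against the NTSTATUS tables.
hex = string(ntstatus, base = 16)
println("0x", uppercase(hex))  # prints 0xC00000FD
```

0xC00000FD is listed as STATUS_STACK_OVERFLOW in the Windows NTSTATUS tables, which would be consistent with runaway recursion exhausting the stack rather than an ordinary Julia exception.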

wsmoses commented 1 month ago

Running on my Linux box: ah, I see you have the type-unstable vector constructor issue (xref https://github.com/EnzymeAD/Enzyme.jl/issues/1134). Julia 1.10 beta3 introduced a change to SparseArrays that causes an infinite recursion here. We should handle this better, but in the interim it should be resolvable by either making things type stable or not using the array syntactic sugar.

The relevant suggested changes are here:

using Enzyme, SparseArrays, LinearAlgebra

function compute_L1(Σ::Matrix{T}, Λ::AbstractSparseMatrix{T}) where T <: Real
    # LHS coefficient from Proposition 3.2
    #   L₁ = [Σ ⊗ (Iₙ² + Λₙ)] ⊗ [vec(Iₙ) ⊗ Iₙ]
    N = size(Σ, 1)
    return kron(Σ, I + Λ) * kron(vec(I(N)), I(N))
end

Λ =  spzeros(4, 4)
Λ[1, 1] = 1.0
Λ[3, 2] = 1.0
Λ[2, 3] = 1.0
Λ[4, 4] = 1.0

Δt = 0.25

function f(θ, Λ, Δt)
    σ_z = θ[1]
    θ_z = θ[2]
    Ω = zeros(2,2)
    Ω[1,1] = sqrt(((σ_z^2)/(2.0 * θ_z))*(1-exp(-2*θ_z*Δt)))
    Σ = Ω * Ω'
    L1 = compute_L1(Σ, Λ)
    return L1[1]
end

θ = [1.0, 0.5]
dθ = zero(θ)  # zero-initialized shadow; Enzyme accumulates the gradient into it
f(θ, Λ, Δt)
Enzyme.autodiff(Reverse, f, Active, Duplicated(θ, dθ), Const(Λ), Const(Δt))

This then hits an unrelated oddity in array push/pop, which I can look at later.

dleather commented 1 month ago

Really appreciate you looking into it! It's been running for two hours on my machine once I fixed the type stability.

wsmoses commented 1 month ago

Oh really, that shouldn't happen (and isn't what I see)?

I get this on the code I pasted above (on my Mac laptop):

julia> Enzyme.autodiff(Reverse, f, Active, Duplicated(θ, dθ), Const(Λ2), Const(Δt))
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (true, Vector{Float64}) in {} addrspace(10)* %6
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (false, nothing) in   %getfield23 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(10)* %getfield_addr22 unordered, align 8, !dbg !263, !tbaa !266, !alias.scope !268, !noalias !269, !nonnull !200, !dereferenceable !233, !align !234
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (false, nothing) in   %getfield21 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(10)* %getfield_addr20 unordered, align 8, !dbg !486, !tbaa !266, !alias.scope !268, !noalias !269, !nonnull !200, !dereferenceable !233, !align !234
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (false, nothing) in   %getfield53 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(10)* %getfield_addr52 unordered, align 8, !dbg !464, !tbaa !214, !alias.scope !216, !noalias !265, !nonnull !200, !dereferenceable !231, !align !232
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (false, nothing) in   %getfield57 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(10)* %getfield_addr56 unordered, align 8, !dbg !313, !tbaa !214, !alias.scope !216, !noalias !265, !nonnull !200, !dereferenceable !231, !align !232
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (true, Vector{Float64}) in   %getfield3 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %getfield_addr2 unordered, align 8, !dbg !281, !tbaa !205, !alias.scope !214, !noalias !217, !nonnull !200, !dereferenceable !222, !align !223
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (true, Vector{Float64}) in   %getfield7 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(11)* %getfield_addr6 unordered, align 8, !dbg !258, !tbaa !205, !alias.scope !214, !noalias !217, !nonnull !200, !dereferenceable !222, !align !223
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (false, nothing) in   %getfield71 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(10)* %getfield_addr70 unordered, align 8, !dbg !364, !tbaa !214, !alias.scope !216, !noalias !265, !nonnull !200, !dereferenceable !231, !align !232
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
┌ Warning: TODO reverse jl_array_del_end zero-set used memset rather than runtime type of (false, nothing) in   %getfield76 = load atomic {} addrspace(10)*, {} addrspace(10)* addrspace(10)* %getfield_addr75 unordered, align 8, !dbg !398, !tbaa !214, !alias.scope !216, !noalias !265, !nonnull !200, !dereferenceable !231, !align !232
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:59
ERROR: BoundsError: attempt to access 6-element Vector{Int64} at index [0]
Stacktrace:
  [1] _noshapecheck_map
    @ ./essentials.jl:0
  [2] map
    @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/SparseArrays/src/higherorderfns.jl:1187 [inlined]
  [3] +
    @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/SparseArrays/src/sparsematrix.jl:2242 [inlined]
  [4] +
    @ /Applications/Julia-1.10.app/Contents/Resources/julia/share/julia/stdlib/v1.10/SparseArrays/src/sparsematrix.jl:4277 [inlined]
  [5] compute_L1
    @ ./REPL[93]:5
  [6] f
    @ ./REPL[119]:7 [inlined]
  [7] f
    @ ./REPL[119]:0 [inlined]
  [8] diffejulia_f_10258_inner_1wrap
    @ ./REPL[119]:0
  [9] macro expansion
    @ ~/git/Enzyme.jl/src/compiler.jl:6633 [inlined]
 [10] enzyme_call
    @ ~/git/Enzyme.jl/src/compiler.jl:6233 [inlined]
 [11] CombinedAdjointThunk
    @ ~/git/Enzyme.jl/src/compiler.jl:6110 [inlined]
 [12] autodiff
    @ ~/git/Enzyme.jl/src/Enzyme.jl:314 [inlined]
 [13] autodiff(::ReverseMode{false, FFIABI, false}, ::typeof(f), ::Type{Active}, ::Duplicated{Vector{Float64}}, ::Const{SparseMatrixCSC{Float64, Int64}}, ::Const{Float64})
    @ Enzyme ~/git/Enzyme.jl/src/Enzyme.jl:326
 [14] top-level scope
    @ REPL[123]:1

dleather commented 1 month ago

I'm getting the same error as you with the posted code, after maybe 30 seconds. Not sure what is happening with the indexing...