EnzymeAD / Enzyme.jl

Julia bindings for the Enzyme automatic differentiator
https://enzyme.mit.edu
MIT License

illegal insertion UNREACHABLE when calling `gradient` #1101

Closed sefffal closed 11 months ago

sefffal commented 1 year ago

Hello all, and thanks for the promising package.

I'm experimenting with replacing ForwardDiff with Enzyme in my package Octofitter.jl. Enzyme looks very promising for some of the calls, but I've hit a crash when testing Forward mode on the following function.

I thought this would be a great candidate for Enzyme since the whole call is type-stable and allocation free.

The following is a minimized reproducer:

using Octofitter
using Enzyme

astrom = AstrometryLikelihood(
    (;epoch=10000.0, ra=1000.0, dec=10000.0, σ_ra=2., σ_dec=2.)
)

θ = [
    0.2856978736861535
    0.8847341435477368
    0.13910432450152022
    0.6577331062668983
    0.12889657307964053
    0.32619565913751547
    0.1469804980641587
    0.5803309921596362
    0.11890637163597373
    0.6202103908225023
    0.09897829840794059
    0.7746048679679703
    0.48246261305071714
    0.24070205541561773
    0.4098660116448919
    0.22710010821130477
    0.13272110117111424
    0.31404683226619123
    0.2381845778481425
    0.7083486446249487
    0.7306283738160548
    0.9478148700160369
    0.31419630429898804
]

# Extracted from a RuntimeGenerated function, apologies for the mess.
# This just maps an array into a named tuple.
# Not the source of the error
arr2nt2 = eval(:((arr,)->begin
    #= /Users/thompsonw/.julia/dev/Octofitter/src/variables.jl:464 =#
    #= /Users/thompsonw/.julia/dev/Octofitter/src/variables.jl:465 =#
    l = 23
    #= /Users/thompsonw/.julia/dev/Octofitter/src/variables.jl:466 =#

    #= /Users/thompsonw/.julia/dev/Octofitter/src/variables.jl:470 =#
    sys = (; $(Expr(:(=), :M, :(arr[1]))), $(Expr(:(=), :plx, :(arr[2]))), $(Expr(:(=), :pmra, :(arr[3]))), $(Expr(:(=), :pmdec, :(arr[4]))), $(Expr(:(=), :rv0_1, :(arr[5]))), $(Expr(:(=), :rv0_2, :(arr[6]))), $(Expr(:(=), :rv0_3, :(arr[7]))), $(Expr(:(=), :rv0_4, :(arr[8]))), $(Expr(:(=), :jitter_1, :(arr[9]))), $(Expr(:(=), :jitter_2, :(arr[10]))), $(Expr(:(=), :jitter_3, :(arr[11]))), $(Expr(:(=), :jitter_4, :(arr[12]))), $(Expr(:(=), :rv0_5, :(arr[13]))), $(Expr(:(=), :jitter_5, :(arr[14]))), $(Expr(:(=), :rv0_6, :(arr[15]))), $(Expr(:(=), :jitter_6, :(arr[16]))))
    #= /Users/thompsonw/.julia/dev/Octofitter/src/variables.jl:472 =#
    #= /Users/thompsonw/.julia/dev/Octofitter/src/variables.jl:474 =#
    pln = (; $(Expr(:(=), :b, quote
    #= /Users/thompsonw/.julia/dev/Octofitter/src/variables.jl:453 =#
    planet = (; $(Expr(:(=), :a, :(arr[17]))), $(Expr(:(=), :τy, :(arr[18]))), $(Expr(:(=), :τx, :(arr[19]))), $(Expr(:(=), :mass, :(arr[20]))), $(Expr(:(=), :i, :(arr[21]))), $(Expr(:(=), :Ωy, :(arr[22]))), $(Expr(:(=), :Ωx, :(arr[23]))))
    #= /Users/thompsonw/.julia/dev/Octofitter/src/variables.jl:454 =#
    planet = (; planet..., e = 0.)
    planet = (; planet..., ω = 0.)
    planet = (; planet..., τ = atan(arr[18],arr[19])/2pi)
    planet = (; planet..., Ω = atan(arr[22],arr[23]))
    #= /Users/thompsonw/.julia/dev/Octofitter/src/variables.jl:455 =#
    planet
end)))
    #= /Users/thompsonw/.julia/dev/Octofitter/src/variables.jl:476 =#
    sys_res_pln = (; sys..., planets = pln)
    #= /Users/thompsonw/.julia/dev/Octofitter/src/variables.jl:477 =#
    return sys_res_pln
end))

# Octofitter.ln_like(::Astrometry, ....) is the source of the error. See source code https://github.com/sefffal/Octofitter.jl/blob/2535660a2ea0c47319ceee9d26bdff2bc86ec6d0/src/likelihoods/relative-astrometry.jl#L169
f(θ) = Octofitter.ln_like(astrom, arr2nt2(θ), orbit(;arr2nt2(θ)...,arr2nt2(θ).planets.b...))

Enzyme.gradient(Forward, f, θ)

#==

inserting into : {[1]:Integer, [2]:Integer, [3]:Integer, [4]:Integer, [5]:Integer, [6]:Integer, [7]:Integer} with [-1] of Float@double
illegal insertion
UNREACHABLE executed at /workspace/srcdir/Enzyme/enzyme/Enzyme/TypeAnalysis/TypeTree.h:276!

[74302] signal (6): Abort trap: 6
in expression starting at REPL[51]:1
__pthread_kill at /usr/lib/system/libsystem_kernel.dylib (unknown line)
Allocations: 60989671 (Pool: 60927753; Big: 61918); GC: 87
zsh: abort      julia

==#

Tested on macOS (M2), Julia 1.9.3, and Enzyme v0.11.9.

In case it helps, Reverse results in this message instead:

Enzyme.gradient(Reverse, f, θ)
ERROR: Duplicated Returns not yet handled
Stacktrace:
 [1] autodiff
   @ ~/.julia/packages/Enzyme/5wFGb/src/Enzyme.jl:205 [inlined]
 [2] autodiff(mode::ReverseMode{false, FFIABI}, f::Const{typeof(f)}, args::Duplicated{Vector{Float64}})
   @ Enzyme ~/.julia/packages/Enzyme/5wFGb/src/Enzyme.jl:236
 [3] autodiff
   @ ~/.julia/packages/Enzyme/5wFGb/src/Enzyme.jl:222 [inlined]
 [4] gradient(#unused#::ReverseMode{false, FFIABI}, f::Function, x::Vector{Float64})
   @ Enzyme ~/.julia/packages/Enzyme/5wFGb/src/Enzyme.jl:811
 [5] top-level scope
   @ REPL[9]:1

sefffal commented 1 year ago

Actually, my MWE might have introduced a type instability. I will update.

sefffal commented 1 year ago

I solved my issue by fixing the type instability, but I'll leave this open because Enzyme crashes Julia here instead of raising an error message.
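
For anyone landing here with the same crash, here is a minimal sketch of how a type instability can be spotted before handing a function to Enzyme. The functions `unstable` and `stable` are hypothetical toy examples and are not taken from Octofitter; the checks only use `Test.@inferred` and `Base.return_types` from Julia itself, and they say nothing about what Enzyme does internally.

```julia
using Test

# Hypothetical toy function: for a Float64 input one branch returns the
# Float64 `x` and the other the Int literal `0`, so the inferred return
# type is a Union rather than a single concrete type.
unstable(x) = x > 0 ? x : 0

# Type-stable variant: `zero(x)` has the same type as `x`.
stable(x) = x > 0 ? x : zero(x)

# `@inferred` throws an error if the inferred return type does not match
# the type of the value actually returned.
@inferred stable(1.5)

# `Base.return_types` lists the inferred return types for the given argument
# types; a non-concrete result is a hint that the call is type-unstable.
stable_ok   = isconcretetype(only(Base.return_types(stable, (Float64,))))
unstable_ok = isconcretetype(only(Base.return_types(unstable, (Float64,))))
```

Running `@code_warntype f(θ)` in the REPL on the actual function gives a more detailed view of the same information.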

wsmoses commented 11 months ago

An improved error message should be added by the latest jll bump: https://github.com/EnzymeAD/Enzyme.jl/commit/38007ced021e021a25982b7e16a65686ead0dcf3