EnzymeAD / Enzyme.jl

Julia bindings for the Enzyme automatic differentiator
https://enzyme.mit.edu
MIT License
443 stars 62 forks source link

@threads breaks Enzyme? #1266

Closed gdalle closed 8 months ago

gdalle commented 8 months ago

I just found out that simply adding @threads to a mutating loop may break Enzyme, with the following warning:

ERROR: Enzyme execution failed.
Mismatched activity...
You may be using a constant variable as temporary storage for active memory (https://enzyme.mit.edu/julia/stable/#Activity-of-temporary-storage).
If not, please open an issue, and either rewrite this variable to not be conditionally active or use Enzyme.API.runtimeActivity!(true) as a workaround for now

EDIT: better MWE below in https://github.com/EnzymeAD/Enzyme.jl/issues/1266#issuecomment-1925314145

For context, the goal is to compute and differentiate the loglikelihood of a Hidden Markov Model on several sequences, with respect to its parameters. The procedure is embarrassingly parallel over sequences, which justifies multithreading. Here's an MWE: **Setup** ```julia using Pkg Pkg.activate(temp=true) Pkg.add("Distributions") Pkg.add("Enzyme") Pkg.add(url="https://github.com/gdalle/HiddenMarkovModels.jl", rev="8bf278a") using Base.Threads, Distributions, Enzyme, HiddenMarkovModels hmm = HMM([0.8, 0.2], [0.7 0.3; 0.3 0.7], Normal.([-1.0, 1.0])) obs_seq = randn(10) control_seq = fill(nothing, length(obs_seq)) seq_ends = [length(obs_seq)] function loglikelihood!(storage, hmm, obs_seq, control_seq, seq_ends) for k in eachindex(seq_ends) t1, t2 = seq_limits(seq_ends, k) storage.logL[k] = HiddenMarkovModels.forward!(storage, hmm, obs_seq, control_seq, t1, t2) end return sum(storage.logL) end function loglikelihood_threaded!(storage, hmm, obs_seq, control_seq, seq_ends) @threads for k in eachindex(seq_ends) t1, t2 = seq_limits(seq_ends, k) storage.logL[k] = HiddenMarkovModels.forward!(storage, hmm, obs_seq, control_seq, t1, t2) end return sum(storage.logL) end storage = HiddenMarkovModels.initialize_forward(hmm, obs_seq, control_seq; seq_ends) ``` **Results** ```julia julia> loglikelihood!(storage, hmm, obs_seq, control_seq, seq_ends) -13.827674763015526 julia> loglikelihood_threaded!(storage, hmm, obs_seq, control_seq, seq_ends) -13.827674763015526 julia> Enzyme.autodiff( Reverse, loglikelihood!, Active, Duplicated(storage, Enzyme.make_zero(storage)), Duplicated(hmm, Enzyme.make_zero(hmm)), Const(obs_seq), Const(control_seq), Const(seq_ends), ) ┌ Warning: Using fallback BLAS replacements, performance may be degraded └ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/U36Ed/src/utils.jl:59 ((nothing, nothing, nothing, nothing, nothing),) julia> Enzyme.autodiff( Reverse, loglikelihood_threaded!, Active, Duplicated(storage, Enzyme.make_zero(storage)), Duplicated(hmm, Enzyme.make_zero(hmm)), Const(obs_seq), Const(control_seq), Const(seq_ends), ) ┌ Warning: Using fallback BLAS replacements, performance may be degraded └ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/U36Ed/src/utils.jl:59 ERROR: Enzyme execution failed. Mismatched activity for: store {} addrspace(10)* %2, {} addrspace(10)** %.fca.0.2.gep, align 8, !dbg !268, !noalias !269 const val: {} addrspace(10)* %2 Type tree: {[-1]:Pointer, [-1,0]:Pointer, [-1,0,-1]:Float@double, [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer} You may be using a constant variable as temporary storage for active memory (https://enzyme.mit.edu/julia/stable/#Activity-of-temporary-storage). If not, please open an issue, and either rewrite this variable to not be conditionally active or use Enzyme.API.runtimeActivity!(true) as a workaround for now Stacktrace: [1] macro expansion @ ./threadingconstructs.jl:219 [2] loglikelihood_threaded! @ ~/Downloads/enzyme_threads.jl:25 [3] loglikelihood_threaded! @ ~/Downloads/enzyme_threads.jl:0 Stacktrace: [1] throwerr(cstr::Cstring) @ Enzyme.Compiler ~/.julia/packages/Enzyme/KJgKj/src/compiler.jl:1319 [2] macro expansion @ ./threadingconstructs.jl:219 [inlined] [3] loglikelihood_threaded! @ ~/Downloads/enzyme_threads.jl:25 [inlined] [4] loglikelihood_threaded! @ ~/Downloads/enzyme_threads.jl:0 [inlined] [5] diffejulia_loglikelihood_threaded__9908_inner_1wrap @ ~/Downloads/enzyme_threads.jl:0 [6] macro expansion @ Enzyme.Compiler ~/.julia/packages/Enzyme/KJgKj/src/compiler.jl:5310 [inlined] [7] enzyme_call(::Val{…}, ::Ptr{…}, ::Type{…}, ::Type{…}, ::Val{…}, ::Type{…}, ::Type{…}, ::Const{…}, ::Type{…}, ::Duplicated{…}, ::Duplicated{…}, ::Const{…}, ::Const{…}, ::Const{…}, ::Float64) @ Enzyme.Compiler ~/.julia/packages/Enzyme/KJgKj/src/compiler.jl:4988 [8] (::Enzyme.Compiler.CombinedAdjointThunk{…})(::Const{…}, ::Duplicated{…}, ::Vararg{…}) @ Enzyme.Compiler ~/.julia/packages/Enzyme/KJgKj/src/compiler.jl:4930 [9] autodiff(::ReverseMode{…}, ::Const{…}, ::Type{…}, ::Duplicated{…}, ::Vararg{…}) @ Enzyme ~/.julia/packages/Enzyme/KJgKj/src/Enzyme.jl:215 [10] autodiff(::ReverseMode{…}, ::typeof(loglikelihood_threaded!), ::Type, ::Duplicated{…}, ::Vararg{…}) @ Enzyme ~/.julia/packages/Enzyme/KJgKj/src/Enzyme.jl:224 [11] top-level scope @ ~/Downloads/enzyme_threads.jl:48 Some type information was truncated. Use `show(err)` to see complete types. ```
wsmoses commented 8 months ago

Are you able to update the MWE to have the threads in question in the snippet, and ideally without the external package?

On Sat, Feb 3, 2024, 9:37 AM Guillaume Dalle @.***> wrote:

I just found out that simply adding @threads to a mutating loop may break Enzyme, with the following warning:

ERROR: Enzyme execution failed. Mismatched activity... You may be using a constant variable as temporary storage for active memory (https://enzyme.mit.edu/julia/stable/#Activity-of-temporary-storage). If not, please open an issue, and either rewrite this variable to not be conditionally active or use Enzyme.API.runtimeActivity!(true) as a workaround for now

For context, the goal is to compute and differentiate the loglikelihood of a Hidden Markov Model on several sequences, with respect to its parameters. The procedure is embarrassingly parallel over sequences, which justifies multithreading. Here's an MWE:

Setup

using Pkg Pkg.activate(temp=true)

Pkg.add("Distributions") Pkg.add("Enzyme") Pkg.add(url="https://github.com/gdalle/HiddenMarkovModels.jl", rev="8bf278a") using Base.Threads, Distributions, Enzyme, HiddenMarkovModels

hmm = HMM([0.8, 0.2], [0.7 0.3; 0.3 0.7], Normal.([-1.0, 1.0]))

obs_seq = randn(10) control_seq = fill(nothing, length(obs_seq)) seq_ends = [length(obs_seq)] function loglikelihood!(storage, hmm, obs_seq, control_seq, seq_ends) for k in eachindex(seq_ends) t1, t2 = seq_limits(seq_ends, k) storage.logL[k] = HiddenMarkovModels.forward!(storage, hmm, obs_seq, control_seq, t1, t2) end return sum(storage.logL)end function loglikelihood_threaded!(storage, hmm, obs_seq, control_seq, seq_ends) @threads for k in eachindex(seq_ends) t1, t2 = seq_limits(seq_ends, k) storage.logL[k] = HiddenMarkovModels.forward!(storage, hmm, obs_seq, control_seq, t1, t2) end return sum(storage.logL)end

storage = HiddenMarkovModels.initialize_forward(hmm, obs_seq, control_seq; seq_ends)

Results

julia> loglikelihood!(storage, hmm, obs_seq, control_seq, seq_ends)-13.827674763015526

julia> loglikelihood_threaded!(storage, hmm, obs_seq, control_seq, seq_ends)-13.827674763015526

julia> Enzyme.autodiff( Reverse, loglikelihood!, Active, Duplicated(storage, Enzyme.make_zero(storage)), Duplicated(hmm, Enzyme.make_zero(hmm)), Const(obs_seq), Const(control_seq), Const(seq_ends), ) ┌ Warning: Using fallback BLAS replacements, performance may be degraded └ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/U36Ed/src/utils.jl:59 ((nothing, nothing, nothing, nothing, nothing),)

julia> Enzyme.autodiff( Reverse, loglikelihood_threaded!, Active, Duplicated(storage, Enzyme.make_zero(storage)), Duplicated(hmm, Enzyme.make_zero(hmm)), Const(obs_seq), Const(control_seq), Const(seq_ends), ) ┌ Warning: Using fallback BLAS replacements, performance may be degraded └ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/U36Ed/src/utils.jl:59 ERROR: Enzyme execution failed. Mismatched activity for: store {} addrspace(10)* %2, {} addrspace(10) %.fca.0.2.gep, align 8, !dbg !268, !noalias !269 const val: {} addrspace(10) %2 Type tree: {[-1]:Pointer, [-1,0]:Pointer, @., [-1,8]:Integer, [-1,9]:Integer, [-1,10]:Integer, [-1,11]:Integer, [-1,12]:Integer, [-1,13]:Integer, [-1,14]:Integer, [-1,15]:Integer, [-1,16]:Integer, [-1,17]:Integer, [-1,18]:Integer, [-1,19]:Integer, [-1,20]:Integer, [-1,21]:Integer, [-1,22]:Integer, [-1,23]:Integer, [-1,24]:Integer, [-1,25]:Integer, [-1,26]:Integer, [-1,27]:Integer, [-1,28]:Integer, [-1,29]:Integer, [-1,30]:Integer, [-1,31]:Integer, [-1,32]:Integer, [-1,33]:Integer, [-1,34]:Integer, [-1,35]:Integer, [-1,36]:Integer, [-1,37]:Integer, [-1,38]:Integer, [-1,39]:Integer} You may be using a constant variable as temporary storage for active memory (https://enzyme.mit.edu/julia/stable/#Activity-of-temporary-storage). If not, please open an issue, and either rewrite this variable to not be conditionally active or use Enzyme.API.runtimeActivity!(true) as a workaround for now

Stacktrace: [1] macro expansion @ ./threadingconstructs.jl:219 [2] loglikelihood_threaded! @ ~/Downloads/enzyme_threads.jl:25 [3] loglikelihood_threaded! @ ~/Downloads/enzyme_threads.jl:0

Stacktrace: [1] throwerr(cstr::Cstring) @ Enzyme.Compiler ~/.julia/packages/Enzyme/KJgKj/src/compiler.jl:1319 [2] macro expansion @ ./threadingconstructs.jl:219 [inlined] [3] loglikelihood_threaded! @ ~/Downloads/enzyme_threads.jl:25 [inlined] [4] loglikelihood_threaded! @ ~/Downloads/enzyme_threads.jl:0 [inlined] [5] diffejulia_loglikelihood_threaded__9908_inner_1wrap @ ~/Downloads/enzyme_threads.jl:0 [6] macro expansion @ Enzyme.Compiler ~/.julia/packages/Enzyme/KJgKj/src/compiler.jl:5310 [inlined] [7] enzyme_call(::Val{…}, ::Ptr{…}, ::Type{…}, ::Type{…}, ::Val{…}, ::Type{…}, ::Type{…}, ::Const{…}, ::Type{…}, ::Duplicated{…}, ::Duplicated{…}, ::Const{…}, ::Const{…}, ::Const{…}, ::Float64) @ Enzyme.Compiler ~/.julia/packages/Enzyme/KJgKj/src/compiler.jl:4988 [8] (::Enzyme.Compiler.CombinedAdjointThunk{…})(::Const{…}, ::Duplicated{…}, ::Vararg{…}) @ Enzyme.Compiler ~/.julia/packages/Enzyme/KJgKj/src/compiler.jl:4930 [9] autodiff(::ReverseMode{…}, ::Const{…}, ::Type{…}, ::Duplicated{…}, ::Vararg{…}) @ Enzyme ~/.julia/packages/Enzyme/KJgKj/src/Enzyme.jl:215 [10] autodiff(::ReverseMode{…}, ::typeof(loglikelihood_threaded!), ::Type, ::Duplicated{…}, ::Vararg{…}) @ Enzyme ~/.julia/packages/Enzyme/KJgKj/src/Enzyme.jl:224 [11] top-level scope @ ~/Downloads/enzyme_threads.jl:48 Some type information was truncated. Use show(err) to see complete types.

— Reply to this email directly, view it on GitHub https://github.com/EnzymeAD/Enzyme.jl/issues/1266, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJTUXAZIY5WO3MP6ULU2WTYRXZLNAVCNFSM6AAAAABCXZ6Z3WVHI2DSMVQWIX3LMV43ASLTON2WKOZSGEYTMMZVGYYDGMA . You are receiving this because you are subscribed to this thread.Message ID: @.***>

gdalle commented 8 months ago

The threads are in the snippet, everything else I call is single-threaded. Removing the external package is a bit more work, I tried but ran into another enzyme issue

wsmoses commented 8 months ago

The other issue would also be good to identify/isolate.

gdalle commented 8 months ago

My bad the other issue was on my end. I fixed it but didn't succeed in reproducing the bug with a simpler setup. That being said, the MWE above is fully reproducible and showcases the impact of adding / removing the @threads.

wsmoses commented 8 months ago

Can you provide your Enzyme and Julia version. From the looks of it you may not be in the latest release since I see the BLAS warning. I wouldn't expect updating to resolve this, but worth checking first

gdalle commented 8 months ago

Hold on I got one:


using Base.Threads, Enzyme

indices = [1]
storage = rand(3)

function f!(storage, indices)
    for k in indices
        storage[k] = 0.0
    end
    return sum(storage)
end

function f_threaded!(storage, indices)
    @threads for k in indices
        storage[k] = 0.0
    end
    return sum(storage)
end

# this works
autodiff(
    Reverse,
    f!,
    Active,
    Duplicated(storage, make_zero(storage)),
    Const(indices),
)

# this fails
autodiff(
    Reverse,
    f_threaded!,
    Active,
    Duplicated(storage, make_zero(storage)),
    Const(indices),
)

# this works again
autodiff(
    Reverse,
    f_threaded!,
    Active,
    Duplicated(storage, make_zero(storage)),
    Duplicated(indices, make_zero(indices)),
)

For some reason, the @threads macro seems to make Enzyme believe the indices argument must be differentiated, even though they are integers and marked Const

gdalle commented 8 months ago

I'm on Enzyme v0.11.4 and here's my Julia

julia> versioninfo()
Julia Version 1.10.0
Commit 3120989f39b (2023-12-25 18:01 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: macOS (x86_64-apple-darwin22.4.0)
  CPU: 4 × Intel(R) Core(TM) i7-7567U CPU @ 3.50GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, skylake)
  Threads: 5 on 4 virtual cores
Environment:
  JULIA_EDITOR = code
wsmoses commented 8 months ago

Yeah that's a really old version of Enzyme

wsmoses commented 8 months ago

And @gdalle the issue is that indices is type unstable. What if you had it as const indices = ...

gdalle commented 8 months ago

Sorry bad copy-paste I'm on v0.11.14, the latest one

wsmoses commented 8 months ago

In your original examples the same applies to seq_ends

wsmoses commented 8 months ago

Oh never mind the global shadows a parameter of the same name (sorry am on mobile,)

gdalle commented 8 months ago

If it helps, same behavior here:


using Base.Threads, Enzyme

function f!(sto, ind)
    for k in ind
        sto[k] = 0.0
    end
    return sum(sto)
end

function f_threaded!(sto, ind)
    @threads for k in ind
        sto[k] = 0.0
    end
    return sum(sto)
end

indices = [1]
storage = rand(2)

# this works
autodiff(Reverse, f!, Active, Duplicated(storage, make_zero(storage)), Const(indices))

# this fails
autodiff(
    Reverse, f_threaded!, Active, Duplicated(storage, make_zero(storage)), Const(indices)
)

# this works again
autodiff(
    Reverse,
    f_threaded!,
    Active,
    Duplicated(storage, make_zero(storage)),
    Duplicated(indices, make_zero(indices)),
)
gdalle commented 8 months ago

By the way there's no rush of any kind, you don't have to solve every problem on earth from your phone ;) I already really appreciate your ever present and helpful attitude!

wsmoses commented 8 months ago

I believe this should be fixed by https://github.com/EnzymeAD/Enzyme.jl/pull/1212 . Please reopen if it persists.

gdalle commented 7 months ago

Indeed this is fixed on main, would you be so kind as to tag a release?

wsmoses commented 7 months ago

Hopefully soon, I want to have us land solutions for https://github.com/EnzymeAD/Enzyme.jl/issues/1269 and a couple of other things first.

Our release cadence is explicitly a bit slower than some other Julia packages (which may do every commit), partially because fixing some input codes may break or mark as unsupported others. We want to generally increase the coverage of julia functions on every release, so sometimes we will delay if a fix for some code can cause issue for another code and/or throw an unsupported error.

wsmoses commented 7 months ago

Released