Closed Moelf closed 2 years ago
What version/commit of Enzyme / Enzyme_jll are you on? You look to be using a combo that has a mismatch?
I did ]add Enzyme#main
[7da242da] Enzyme v0.9.4
[7cc45869] Enzyme_jll v0.0.30+1
Yeah #main requires a custom jll right now and will not work on the previous jll. Can you use the latest release?
OK, it doesn't crash anymore, but it still seems to give the wrong result:
julia> g! = (dx, αs) -> autodiff(f, Duplicated(αs, dx))
#1 (generic function with 1 method)
julia> optimize(f, g!, ones(2), LBFGS(); inplace=true) |> Optim.minimizer
2-element Vector{Float64}:
1.0
1.0
julia> g = let dx = zeros(2)
           αs -> (autodiff(f, Duplicated(αs, dx)); dx)
       end
#3 (generic function with 1 method)
julia> optimize(f, g, ones(2), LBFGS(); inplace=false) |> Optim.minimizer
┌ Warning: Failed to achieve finite new evaluation point, using alpha=0
└ @ LineSearches ~/.julia/packages/LineSearches/Ki4c5/src/hagerzhang.jl:148
2-element Vector{Float64}:
1.1493167686587399e8
4.689119509344906
Can you make a version of this without Optim that just has an input to autodiff and an incorrect result?
Note that in reverse mode, Duplicated will += the derivative into the shadow, so the result will be wrong if your shadow is not already zeroed.
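A minimal sketch of that accumulation, using a hand-written gradient as a stand-in for Enzyme. The name accumulate_grad! is hypothetical and not part of Enzyme; it only mimics the += behaviour on the shadow buffer, for the same f as the Optim example in this thread (written in closed form).

```julia
# Closed form of the test function from this thread:
# dot(fill((3 - sin(x1))^2, 3), fill(1 + (x2 - 3)^4, 3))
f(x) = 3 * (3 - sin(x[1]))^2 * (1 + (x[2] - 3)^4)

# Hypothetical stand-in for a reverse-mode call: it ADDS the gradient
# of f at x into dx instead of overwriting dx.
function accumulate_grad!(dx, x)
    a = 3 - sin(x[1])
    b = 1 + (x[2] - 3)^4
    dx[1] += 3 * 2 * a * (-cos(x[1])) * b
    dx[2] += 3 * a^2 * 4 * (x[2] - 3)^3
    return nothing
end

x = ones(2)
dx = zeros(2)

accumulate_grad!(dx, x)
once = copy(dx)          # correct gradient: the shadow started at zero

accumulate_grad!(dx, x)
twice = copy(dx)         # shadow was not reset: holds the gradient twice

dx .= 0                  # zero the shadow before reusing it
accumulate_grad!(dx, x)
reset = copy(dx)         # correct gradient again
```

If a gradient callback reuses one buffer across calls, as g! and g do above, zeroing it first (dx .= 0) or allocating a fresh buffer per call avoids the stale contribution.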
ok actually, my original problem is this:
julia> using LiteHF, Enzyme
julia> RR = @time build_pyhf(load_pyhfjson("/home/akako/.julia/dev/LiteHF/test/pyhfjson/sample.json"));
6.642286 seconds (22.35 M allocations: 1006.076 MiB, 5.98% gc time, 99.87% compilation time)
julia> LL(x) = -RR.LogLikelihood(x)
LL (generic function with 1 method)
julia> g = let dx = zeros(2)
           αs -> (autodiff(LL, Duplicated(αs, dx)); dx)
       end
julia> RR.prior_inits
2-element Vector{Float64}:
1.0
0.0
julia> g(RR.prior_inits)
ERROR: MethodError: no method matching unsafe_convert(::Type{Ptr{LLVM.API.LLVMOpaqueType}}, ::Nothing)
Closest candidates are:
unsafe_convert(::Type{Ptr{T}}, ::SharedArrays.SharedArray{T}) where T at /usr/share/julia/stdlib/v1.7/SharedArrays/src/SharedArrays.jl:361
unsafe_convert(::Type{Ptr{T}}, ::SharedArrays.SharedArray) where T at /usr/share/julia/stdlib/v1.7/SharedArrays/src/SharedArrays.jl:362
unsafe_convert(::Type{Ptr{T}}, ::Adjoint{<:Real, <:AbstractVecOrMat}) where T at /usr/share/julia/stdlib/v1.7/LinearAlgebra/src/adjtrans.jl:197
...
Stacktrace:
[1] EnzymeGradientUtilsSubTransferHelper(gutils::Ptr{Nothing}, mode::Enzyme.API.CDerivativeMode, secretty::Nothing, intrinsic::UInt32, dstAlign::Int64, srcAlign::Int64, offset::Int64, dstConstant::Bool, origdst::LLVM.LoadInst, srcConstant::Bool, origsrc::LLVM.LoadInst, length::LLVM.MulInst, isVolatile::LLVM.ConstantInt, MTI::LLVM.CallInst, allowForward::Bool, shadowsLookedUp::Bool)
LiteHF: https://github.com/JuliaHEP/LiteHF.jl
I didn't find a way to reduce it, so I'm just gonna post it here
@wsmoses I recall you looked at this briefly, before you started travelling?
@Moelf can you test #308? And it would be great to have a reproducer that doesn't require an external file.
build_pyhf(load_pyhfjson(joinpath(dirname(pathof(LiteHF)), "..", "test/pyhfjson/sample.json")));
ERROR: SystemError: opening file "/home/vchuravy/.julia/packages/LiteHF/Vk433/src/../test/pyhfjson/sample.json": No such file or directory
Also note: you probably want const RR = ..., or pass in RR as an argument to the function LL, to avoid the type instability.
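A sketch of those two options, assuming a hypothetical Model struct as a stand-in for whatever build_pyhf returns. Model, MODEL, and nll are illustrative names, not LiteHF's API.

```julia
# Hypothetical stand-in for the object returned by build_pyhf.
struct Model{F}
    LogLikelihood::F
end

# Option 1: a const global. The compiler knows the concrete type of
# MODEL inside nll, so the call is type-stable.
const MODEL = Model(x -> -sum(abs2, x))
nll(x) = -MODEL.LogLikelihood(x)

# Option 2: pass the model as an argument. The method specializes on
# the concrete type of m, so no global lookup is involved.
nll(m, x) = -m.LogLikelihood(x)

x0 = [1.0, 2.0]
v1 = nll(x0)         # via the const global
v2 = nll(MODEL, x0)  # via the explicit argument
```

With a non-const global, the type of the binding can change at any time, so every use inside the function has to be dispatched dynamically. Either option above removes that.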
I don't have the custom _jll, I think?
Sorry about the file, can you try using a different file like multi_channel.json?
You don't need the custom jll anymore :)
well
ERROR: Unsatisfiable requirements detected for package Enzyme_jll [7cc45869]:
Enzyme_jll [7cc45869] log:
├─possible versions are: 0.0.1-0.0.30 or uninstalled
└─restricted to versions 0.0.31 by Enzyme [7da242da] — no versions left
└─Enzyme [7da242da] log:
That means your registry is outdated... https://github.com/JuliaRegistries/General/commit/d37800ef9c014e970f8f9f2b6f7d58b4cb128ef4 that was registered two days ago.
still getting the same error
julia> using LiteHF, Enzyme
julia> const RR = @time build_pyhf(load_pyhfjson("/home/akako/.julia/dev/LiteHF/test/pyhfjson/multi_channel.json"));
3.421716 seconds (7.80 M allocations: 406.006 MiB, 7.53% gc time, 99.66% compilation time)
julia> LL(x) = -RR.LogLikelihood(x)
LL (generic function with 1 method)
julia> g = let dx = zeros(2)
           αs -> (autodiff(LL, Duplicated(αs, dx)); dx)
       end
#1 (generic function with 1 method)
julia> g(RR.inits)
ERROR: MethodError: no method matching unsafe_convert(::Type{Ptr{LLVM.API.LLVMOpaqueType}}, ::Nothing)
Closest candidates are:
unsafe_convert(::Type{Ptr{T}}, ::StaticArrays.SizedArray) where T at ~/.julia/packages/StaticArrays/58yy1/src/SizedArray.jl:127
unsafe_convert(::Type{Ptr{T}}, ::LinearAlgebra.Transpose{<:Any, <:AbstractVecOrMat}) where T at /usr/share/julia/stdlib/v1.8/LinearAlgebra/src/adjtrans.jl:199
unsafe_convert(::Type{Ptr{T}}, ::Base.RefValue{SA}) where {S, T, D, L, SA<:StaticArrays.SArray{S, T, D, L}} at ~/.julia/packages/StaticArrays/58yy1/src/SArray.jl:125
...
Stacktrace:
[1] EnzymeGradientUtilsSubTransferHelper(gutils::Ptr{Nothing}, mode::Enzyme.API.CDerivativeMode, secretty::Nothing, intrinsic::UInt32, dstAlign::Int64, srcAlign::Int64, offset::Int64, dstConstant::Bool, origdst::LLVM.LoadInst, srcConstant::Bool, origsrc::LLVM.LoadInst, length::LLVM.MulInst, isVolatile::LLVM.ConstantInt, MTI::LLVM.CallInst, allowForward::Bool, shadowsLookedUp::Bool)
@ Enzyme.API ~/.julia/dev/Enzyme/src/api.jl:206
What's the full backtrace?
I am now getting:
%"'ip_phi5" = phi {} addrspace(10)*
julia: /workspace/srcdir/Enzyme/enzyme/Enzyme/CacheUtility.cpp:76: virtual void CacheUtility::erase(llvm::Instruction*): Assertion `I->use_empty()' failed.
signal (6): Aborted
in expression starting at REPL[10]:1
__pthread_kill_implementation at /usr/bin/../lib/libc.so.6 (unknown line)
raise at /usr/bin/../lib/libc.so.6 (unknown line)
abort at /usr/bin/../lib/libc.so.6 (unknown line)
__assert_fail_base.cold at /usr/bin/../lib/libc.so.6 (unknown line)
__assert_fail at /usr/bin/../lib/libc.so.6 (unknown line)
erase at /workspace/srcdir/Enzyme/enzyme/Enzyme/CacheUtility.cpp:76
erase at /workspace/srcdir/Enzyme/enzyme/Enzyme/GradientUtils.h:1022
visitCallInst at /workspace/srcdir/Enzyme/enzyme/Enzyme/AdjointGenerator.h:8338
delegateCallInst at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/InstVisitor.h:302 [inlined]
visitCall at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/Instruction.def:209 [inlined]
visit at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/Instruction.def:209
visit at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/InstVisitor.h:112 [inlined]
CreateAugmentedPrimal at /workspace/srcdir/Enzyme/enzyme/Enzyme/EnzymeLogic.cpp:2017
visitCallInst at /workspace/srcdir/Enzyme/enzyme/Enzyme/AdjointGenerator.h:11238
delegateCallInst at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/InstVisitor.h:302 [inlined]
visitCall at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/Instruction.def:209 [inlined]
visit at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/Instruction.def:209
visit at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/InstVisitor.h:112 [inlined]
CreateAugmentedPrimal at /workspace/srcdir/Enzyme/enzyme/Enzyme/EnzymeLogic.cpp:2017
visitCallInst at /workspace/srcdir/Enzyme/enzyme/Enzyme/AdjointGenerator.h:11238
delegateCallInst at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/InstVisitor.h:302 [inlined]
visitCall at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/Instruction.def:209 [inlined]
visit at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/Instruction.def:209
visit at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/InstVisitor.h:112 [inlined]
CreateAugmentedPrimal at /workspace/srcdir/Enzyme/enzyme/Enzyme/EnzymeLogic.cpp:2017
visitCallInst at /workspace/srcdir/Enzyme/enzyme/Enzyme/AdjointGenerator.h:11238
delegateCallInst at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/InstVisitor.h:302 [inlined]
visitCall at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/Instruction.def:209 [inlined]
visit at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/Instruction.def:209
visit at /opt/x86_64-linux-gnu/x86_64-linux-gnu/sys-root/usr/local/include/llvm/IR/InstVisitor.h:112 [inlined]
CreatePrimalAndGradient at /workspace/srcdir/Enzyme/enzyme/Enzyme/EnzymeLogic.cpp:3656
EnzymeCreatePrimalAndGradient at /workspace/srcdir/Enzyme/enzyme/Enzyme/CApi.cpp:438
EnzymeCreatePrimalAndGradient at /home/vchuravy/src/Enzyme/src/api.jl:111
enzyme! at /home/vchuravy/src/Enzyme/src/compiler.jl:3162
Which is defined as progress :)
] add https://github.com/JuliaHEP/LiteHF.jl
] add Enzyme#main
using LiteHF, Enzyme
const RR = build_pyhf(load_pyhfjson(joinpath(dirname(pathof(LiteHF)), "..", "test/pyhfjson/multi_channel.json")));
julia> LL(x) = -RR.LogLikelihood(x)
LL (generic function with 1 method)
julia> g = let dx = zeros(2)
           αs -> (autodiff(LL, Duplicated(αs, dx)); dx)
       end
julia> g(RR.inits)
that was on #308, now I basically get the same thing, except it overruns my terminal buffer.....
On latest main and jll, I get the following:
using Optim, Enzyme, LinearAlgebra

function f(x)
    y1 = zeros(eltype(x), 3)
    y2 = ones(eltype(x), 3)
    y1 .+= (3 - sin(x[1]))^2
    y2 .+= (x[2] - 3)^4
    dot(y1, y2)
end

@show optimize(f, ones(2), NelderMead()) |> Optim.minimizer
@show optimize(f, ones(2), LBFGS()) |> Optim.minimizer
@show optimize(f, ones(2), LBFGS(); autodiff=:forward) |> Optim.minimizer

g! = (dx, αs) -> autodiff(f, Duplicated(αs, dx))
@show optimize(f, g!, ones(2), LBFGS(); inplace=true) |> Optim.minimizer

function g_fix!(dx, as)
    dx .= 0
    autodiff(f, Duplicated(as, dx))
end
@show optimize(f, g_fix!, ones(2), LBFGS(); inplace=true) |> Optim.minimizer

g = let dx = zeros(2)
    αs -> (autodiff(f, Duplicated(αs, dx)); dx)
end
@show optimize(f, g, ones(2), LBFGS(); inplace=false) |> Optim.minimizer

function g_fix(as)
    dx = zeros(2)
    autodiff(f, Duplicated(as, dx))
    dx
end
@show optimize(f, g_fix, ones(2), LBFGS(); inplace=false) |> Optim.minimizer
wmoses@beast:~/git/Enzyme.jl (rand) $ ./julia-1.7.2/bin/julia --project what.jl
optimize(f, ones(2), NelderMead()) |> Optim.minimizer = [1.5707884242800443, 2.99892521883371]
optimize(f, ones(2), LBFGS()) |> Optim.minimizer = [1.5707963268056853, 3.000075121117243]
optimize(f, ones(2), LBFGS(); autodiff = :forward) |> Optim.minimizer = [1.5707963270758434, 3.0000354995684293]
warning: Linking two modules of different target triples: 'bcloader' is 'x86_64-unknown-linux-gnu' whereas 'text' is 'x86_64-pc-linux-gnu'
warning: Linking two modules of different target triples: 'bcloader' is 'x86_64-unknown-linux-gnu' whereas 'text' is 'x86_64-pc-linux-gnu'
┌ Warning: Using fallback BLAS replacements, performance may be degraded
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/XyxTy/src/utils.jl:35
optimize(f, g!, ones(2), LBFGS(); inplace = true) |> Optim.minimizer = [1.0, 1.0]
optimize(f, g_fix!, ones(2), LBFGS(); inplace = true) |> Optim.minimizer = [1.5707963270758434, 3.000035499568429]
┌ Warning: Failed to achieve finite new evaluation point, using alpha=0
└ @ LineSearches ~/.julia/packages/LineSearches/Ki4c5/src/hagerzhang.jl:148
optimize(f, g, ones(2), LBFGS(); inplace = false) |> Optim.minimizer = [1.1493167686587399e8, 4.689119509344906]
optimize(f, g_fix, ones(2), LBFGS(); inplace = false) |> Optim.minimizer = [1.5707963270758434, 3.000035499568429]
Namely the g_fix versions I wrote appear to succeed. I'm not sure why these alternate versions fail, perhaps Optim runs multiple things in parallel and as a result there's a race?
In any case, this appears to be an Optim usage issue?
So what could cause the inplace version to fail? I don't understand what the warning message is saying exactly; what is bcloader?
@Moelf try again on main. For some reason g_fix! now appears to work with the jll bump, so there might've been a weird aliasing bug that was fixed.
Please reopen if you still see this.
MWE:
and non-inplace version seems to give wrong result: