Enzyme could not find shadow for value | with range

roflmaostc commented 3 months ago

Hi,

with Enzyme v0.12.25

This works:

# ╔═╡ 964afc32-481e-11ef-0ce7-ddc5430a25d2
using ImageShow, DifferentiationInterface, Enzyme, Optim, ImageIO

# ╔═╡ ea745c16-6359-4a18-add0-e14fff376399
using ComponentArrays

# ╔═╡ 93359818-c0f6-4e58-b721-ceb97fe4e712
gauss(x::T, y::T, σ, μ1, μ2) where T = 1 / (σ*√(2 * T(π))) * exp(- ((x - μ1)^2 + (y-μ2)^2)/ (2 * σ))

# ╔═╡ f54dcc0b-a0ee-4222-8fcb-11157a302dd0
gauss(x::T, σ, μ1, μ2) where T = 1 / (σ*√(2 * T(π))) * exp(- ((x - μ1)^2)/ (2 * σ))

# ╔═╡ 1774e002-4f39-41ec-898f-3fce76b94159
x = collect(range(-10, 10, 100))

# ╔═╡ 5723d16b-4b4c-43e2-9a15-c8f47b2bef7b
y = range(-10, 10, 100)'

# ╔═╡ 4e197a4c-559e-4f37-9585-fdadf0ba5212
measurement = gauss.(x, y, 1.2, 2.1, -3.1);

# ╔═╡ 07e333d1-adc4-4cb2-af12-73659225875c
simshow(measurement)

# ╔═╡ 2d78f7fb-0ea6-445e-a8a8-cacc32c10ed6
function create_fg!(measurement, x;  N=2)
    buffer = copy(measurement)

    f = let measurement=measurement
        function f(pp)
            buffer .= 0
            buffer .-= gauss.(x, y, pp[1], pp[2], pp[3])
            #buffer .+= measurement
            return sum(abs2, buffer)
        end
    end

    fg! = let f=f, backend=AutoEnzyme()
        function fg!(F, G, p)
            if G !== nothing
                dp = make_zero(p)
                pp = copy(p)
                #y = Enzyme.autodiff(Enzyme.ReverseWithPrimal, Duplicated(f, make_zero(f)), Duplicated(pp, dp), Const(measurement))
                y = Enzyme.autodiff(Enzyme.ReverseWithPrimal, Duplicated(f, make_zero(f)), Duplicated(pp, dp))
                G .= dp
                if F !== nothing
                    return y[2]
                end
            end
            if F !== nothing
                return f(p)
            end
        end
    end

    f, fg!
end

# ╔═╡ d65c0d01-47ba-40b4-9e74-d15116ff8bff
f2, fg2! = create_fg!(measurement, x, N=length(params0) ÷ 2)

# ╔═╡ 6d21d27e-f4cf-47d7-ba67-aae0cdfcaed3
params0 = [1.0, 1,-3 ]

# ╔═╡ ec5a93b3-9e4c-454b-a885-4f866076476d
@time f2(params0)

# ╔═╡ 5d009f77-2548-40e8-bc1c-7f7b8df4697e
grad0 = one.(params0)

# ╔═╡ 8ab69a15-fc93-4592-b36e-083be7089961
@time fg2!(2, grad0,  params0)

# ╔═╡ c03562a6-5d3a-4c09-8859-12a7098db562
grad0

This does not. The only line I change is x = collect(range(-10, 10, 100)), which becomes x = (range(-10, 10, 100)), so the function now uses a range instead of a vector.

# ╔═╡ 964afc32-481e-11ef-0ce7-ddc5430a25d2
using ImageShow, DifferentiationInterface, Enzyme, Optim, ImageIO

# ╔═╡ ea745c16-6359-4a18-add0-e14fff376399
using ComponentArrays

# ╔═╡ 93359818-c0f6-4e58-b721-ceb97fe4e712
gauss(x::T, y::T, σ, μ1, μ2) where T = 1 / (σ*√(2 * T(π))) * exp(- ((x - μ1)^2 + (y-μ2)^2)/ (2 * σ))

# ╔═╡ f54dcc0b-a0ee-4222-8fcb-11157a302dd0
gauss(x::T, σ, μ1, μ2) where T = 1 / (σ*√(2 * T(π))) * exp(- ((x - μ1)^2)/ (2 * σ))

# ╔═╡ 1774e002-4f39-41ec-898f-3fce76b94159
x = (range(-10, 10, 100))

# ╔═╡ 5723d16b-4b4c-43e2-9a15-c8f47b2bef7b
y = range(-10, 10, 100)'

# ╔═╡ 4e197a4c-559e-4f37-9585-fdadf0ba5212
measurement = gauss.(x, y, 1.2, 2.1, -3.1);

# ╔═╡ 07e333d1-adc4-4cb2-af12-73659225875c
simshow(measurement)

# ╔═╡ 2d78f7fb-0ea6-445e-a8a8-cacc32c10ed6
function create_fg!(measurement, x;  N=2)
    buffer = copy(measurement)

    f = let measurement=measurement
        function f(pp)
            buffer .= 0
            buffer .-= gauss.(x, y, pp[1], pp[2], pp[3])
            #buffer .+= measurement
            return sum(abs2, buffer)
        end
    end

    fg! = let f=f, backend=AutoEnzyme()
        function fg!(F, G, p)
            if G !== nothing
                dp = make_zero(p)
                pp = copy(p)
                #y = Enzyme.autodiff(Enzyme.ReverseWithPrimal, Duplicated(f, make_zero(f)), Duplicated(pp, dp), Const(measurement))
                y = Enzyme.autodiff(Enzyme.ReverseWithPrimal, Duplicated(f, make_zero(f)), Duplicated(pp, dp))
                G .= dp
                if F !== nothing
                    return y[2]
                end
            end
            if F !== nothing
                return f(p)
            end
        end
    end

    f, fg!
end

# ╔═╡ d65c0d01-47ba-40b4-9e74-d15116ff8bff
f2, fg2! = create_fg!(measurement, x, N=length(params0) ÷ 2)

# ╔═╡ 6d21d27e-f4cf-47d7-ba67-aae0cdfcaed3
params0 = [1.0, 1,-3 ]

# ╔═╡ ec5a93b3-9e4c-454b-a885-4f866076476d
@time f2(params0)

# ╔═╡ 5d009f77-2548-40e8-bc1c-7f7b8df4697e
grad0 = one.(params0)

# ╔═╡ 8ab69a15-fc93-4592-b36e-083be7089961
@time fg2!(2, grad0,  params0)

# ╔═╡ c03562a6-5d3a-4c09-8859-12a7098db562
grad0

The error is:

Enzyme execution failed.

Enzyme could not find shadow for value

Current scope:

preprocess_julia_broadcasted_10135_inner.1{ { {} addrspace(10)*, { { { [2 x double], [2 x double], i64, i64 }, [1 x { [2 x double], [2 x double], i64, i64 }], double, double, double } } } } ({} addrspace(10)*, { { { [2 x double], [2 x double], i64, i64 }, [1 x { [2 x double], [2 x double], i64, i64 }], double, double, double } })

cannot find shadow for { { { [2 x double], [2 x double], i64, i64 }, [1 x { [2 x double], [2 x double], i64, i64 }], double, double, double } } %1

Stack trace

Here is what happened, the most recent locations are first:

    throwerr(cstr::Cstring) @ compiler.jl:1696
    broadcasted @ broadcast.jl:1349
    broadcasted @ broadcast.jl:1347
    broadcasted @ broadcast.jl:0
    augmented_julia_broadcasted_10135_inner_1wrap @ broadcast.jl:0
    macro expansion @ compiler.jl:6673
    enzyme_call(::Val{false}, ::Ptr{Nothing}, ::Type{Enzyme.Compiler.AugmentedForwardThunk}, ::Val{1}, ::Val{true}, ::Type{Tuple{EnzymeCore.Const{typeof(-)}, EnzymeCore.Duplicated{Matrix{Float64}}, EnzymeCore.Active{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2}, Nothing, typeof(Main.var"workspace#19".gauss), Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, LinearAlgebra.Adjoint{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, Float64, Float64, Float64}}}}}, ::Type{EnzymeCore.MixedDuplicated{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2}, Nothing, typeof(-), Tuple{Matrix{Float64}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2}, Nothing, typeof(Main.var"workspace#19".gauss), Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, LinearAlgebra.Adjoint{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, Float64, Float64, Float64}}}}}}, ::EnzymeCore.Const{typeof(Base.Broadcast.broadcasted)}, ::Type{Nothing}, ::EnzymeCore.Const{typeof(-)}, ::EnzymeCore.Duplicated{Matrix{Float64}}, ::EnzymeCore.Active{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2}, Nothing, typeof(Main.var"workspace#19".gauss), Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, LinearAlgebra.Adjoint{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, Float64, Float64, Float64}}}) @ compiler.jl:6273
    (::Enzyme.Compiler.AugmentedForwardThunk{Ptr{Nothing}, EnzymeCore.Const{typeof(Base.Broadcast.broadcasted)}, EnzymeCore.MixedDuplicated{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2}, Nothing, typeof(-), Tuple{Matrix{Float64}, Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2}, Nothing, typeof(Main.var"workspace#19".gauss), Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, LinearAlgebra.Adjoint{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, Float64, Float64, Float64}}}}}, Tuple{EnzymeCore.Const{typeof(-)}, EnzymeCore.Duplicated{Matrix{Float64}}, EnzymeCore.Active{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2}, Nothing, typeof(Main.var"workspace#19".gauss), Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, LinearAlgebra.Adjoint{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, Float64, Float64, Float64}}}}, 1, true, Nothing})(::EnzymeCore.Const{typeof(Base.Broadcast.broadcasted)}, ::EnzymeCore.Const{typeof(-)}, ::Vararg{Any}) @ compiler.jl:6161
    runtime_generic_augfwd(activity::Type{Val{(false, false, true, true)}}, width::Val{1}, ModifiedBetween::Val{(true, true, true, true)}, RT::Val{@NamedTuple{1, 2, 3}}, f::typeof(Base.Broadcast.broadcasted), df::Nothing, primal_1::typeof(-), shadow_1_1::Nothing, primal_2::Matrix{Float64}, shadow_2_1::Matrix{Float64}, primal_3::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2}, Nothing, typeof(Main.var"workspace#19".gauss), Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, LinearAlgebra.Adjoint{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, Float64, Float64, Float64}}, shadow_3_1::Base.RefValue{Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{2}, Nothing, typeof(Main.var"workspace#19".gauss), Tuple{StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}, LinearAlgebra.Adjoint{Float64, StepRangeLen{Float64, Base.TwicePrecision{Float64}, Base.TwicePrecision{Float64}, Int64}}, Float64, Float64, Float64}}}) @ jitrules.jl:313
    f @ Other cell: line 8

            function f(pp)
                buffer .= 0
                buffer .-= gauss.(x, y, pp[1], pp[2], pp[3])
                #buffer .+= measurement
                return sum(abs2, buffer)

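For reference, the two versions above differ only in the type of x. A minimal sketch in plain Julia (no Enzyme needed) of that difference; the names x_range and x_vec are illustrative and do not appear in the report:

```julia
x_range = range(-10, 10, 100)   # as in the failing version
x_vec = collect(x_range)        # as in the working version

# The range is a StepRangeLen, the type named in the stack trace above;
# the collected version is a plain Vector{Float64}.
println(typeof(x_range))
println(typeof(x_vec))

# A StepRangeLen is an immutable struct that stores no elements,
# only the fields needed to compute them on demand.
println(fieldnames(typeof(x_range)))
println(isbitstype(typeof(x_range)))

# Both hold the same values, so the primal computation is unchanged.
println(x_range == x_vec)
```

The struct in the error message, { [2 x double], [2 x double], i64, i64 }, looks like the LLVM layout of this StepRangeLen (two TwicePrecision fields plus two integers). Collecting the range, as the working version does, avoids the failure on v0.12.25; the thread does not say why the shadow lookup fails for the range.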

wsmoses commented 3 months ago

Working on this, but also FYI for your code, the use of the buffer makes your code both slower [in both primal and reverse], and harder to differentiate.

If you can point me to the code that uses this pattern, I can offer some comments to improve perf/compatibility.

roflmaostc commented 3 months ago

Thanks for the hint!

I thought I needed the buffer to avoid the allocation in the forward pass?

How would I write these lines in an Enzyme style?

        function f(pp)
            buffer .= 0
            buffer .-= gauss.(x, y, pp[1], pp[2], pp[3])
            #buffer .+= measurement
            return sum(abs2, buffer)
        end

Thanks!

wsmoses commented 2 months ago

I think a generator would probably avoid both entirely [no temporary buffer, no allocation].
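The thread does not show what the generator version looks like. One possible reading, as a sketch: the helper names make_generator_loss and make_buffer_loss are hypothetical, and x and y are passed in explicitly rather than captured from globals. It has not been checked against Enzyme here; it only shows that the generator form computes the same value as the buffer form.

```julia
# Same two-argument Gaussian as in the report above.
gauss(x::T, y::T, σ, μ1, μ2) where T = 1 / (σ*√(2 * T(π))) * exp(- ((x - μ1)^2 + (y-μ2)^2)/ (2 * σ))

# Hypothetical helper: the loss as a sum over a 2-D generator.
# No temporary array is materialized; each term is computed and accumulated.
# abs2(-g) == abs2(g), so the sign from `buffer .-= ...` is dropped.
function make_generator_loss(x, y)
    return pp -> sum(abs2(gauss(xi, yi, pp[1], pp[2], pp[3])) for xi in x, yi in y)
end

# Buffer-based version mirroring the original f, kept only for comparison.
function make_buffer_loss(x, y)
    buffer = zeros(length(x), length(y))
    return function (pp)
        buffer .= 0
        buffer .-= gauss.(x, y', pp[1], pp[2], pp[3])
        return sum(abs2, buffer)
    end
end

x = range(-10, 10, 100)
y = range(-10, 10, 100)
params0 = [1.0, 1, -3]

f_gen = make_generator_loss(x, y)
f_buf = make_buffer_loss(x, y)

# The two agree up to floating-point summation order.
println(f_gen(params0) ≈ f_buf(params0))
```

The closure returned by make_generator_loss captures only x and y, so there is no mutable state for Enzyme to shadow. Whether that is enough to avoid the error on v0.12.25 is not established in this thread.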

wsmoses commented 1 month ago

@roflmaostc does this still err on current main?

wsmoses commented 1 month ago

okay locally this seems to run, closing

EnzymeAD / Enzyme.jl

Enzyme could not find shadow for value | with range #1673