cscherrer / Soss.jl

Probabilistic programming via source rewriting
https://cscherrer.github.io/Soss.jl/stable/
MIT License

README example fails #337

Closed: cscherrer closed this issue 2 years ago

cscherrer commented 2 years ago

@sethaxen pointed out that the README example is now failing. Here's the code:

using MeasureTheory, Soss, SampleChainsDynamicHMC

m = @model x begin
    α ~ Lebesgue(ℝ)
    β ~ Normal()
    σ ~ Exponential()
    y ~ For(x) do xj
        Normal(α + β * xj, σ)
    end
    return y
end
x = randn(20)
predα = predictive(m, :α)
y = rand(predα(x=x,α=10.0))
post = sample(m(x=x) | (y=y,), dynamichmc())

This currently fails, with the error seeming to originate from calls like this:

julia> logdensityof(m(x=x) | (y=y,), (σ=2,α=3,β=5))
ERROR: UndefRefError: access to undefined reference
Stacktrace:
 [1] getproperty
   @ ./Base.jl:38 [inlined]
 [2] macro expansion
   @ ~/.julia/packages/GeneralizedGenerated/72VHh/src/closure_conv.jl:132 [inlined]
 [3] _logdensityof(M::Type{GeneralizedGenerated.NGG.TypeLevel{Module, "Buf{17}()"}}, _m::Model{NamedTuple{(:x,)}, GeneralizedGenerated.NGG.TypeLevel{Expr, "Buf{23}()"}, GeneralizedGenerated.NGG.TypeLevel{Module, "Buf{17}()"}}, _args::NamedTuple{(:x,), Tuple{Vector{Float64}}}, _data::NamedTuple{(:y,), Tuple{Vector{Float64}}}, _pars::NamedTuple{(:σ, :α, :β), Tuple{Int64, Int64, Int64}})
   @ Soss ~/git/Soss.jl/src/primitives/logdensity.jl:39
 [4] logdensityof(c::Soss.ConditionalModel{NamedTuple{(:x,)}, GeneralizedGenerated.NGG.TypeLevel{Expr, "Buf{23}()"}, GeneralizedGenerated.NGG.TypeLevel{Module, "Buf{17}()"}, NamedTuple{(:x,), Tuple{Vector{Float64}}}, NamedTuple{(:y,), Tuple{Vector{Float64}}}}, x::NamedTuple{(:σ, :α, :β), Tuple{Int64, Int64, Int64}})
   @ Soss ~/git/Soss.jl/src/primitives/logdensity.jl:8
 [5] top-level scope
   @ REPL[21]:1

sourceLogdensityOf()(m) produces this, which looks ok to me:

julia> sourceLogdensityOf()(m)
quote
    _ℓ = 0.0
    _ℓ += Soss.logdensityof(Exponential(), σ)
    σ = Soss.predict(Exponential(), σ)
    _ℓ += Soss.logdensityof(Normal(), β)
    β = Soss.predict(Normal(), β)
    _ℓ += Soss.logdensityof(Lebesgue(ℝ), α)
    α = Soss.predict(Lebesgue(ℝ), α)
    _ℓ += Soss.logdensityof(For(x) do xj
                #= REPL[6]:6 =#
                Normal(α + β * xj, σ)
            end, y)
    y = Soss.predict(For(x) do xj
                #= REPL[6]:6 =#
                Normal(α + β * xj, σ)
            end, y)
    nothing
    return _ℓ
end

Originally posted by @cscherrer in https://github.com/cscherrer/Soss.jl/issues/336#issuecomment-1137975734

cscherrer commented 2 years ago

Note that this works just fine:

using MeasureTheory, Soss, SampleChainsDynamicHMC

m = @model x begin
    α ~ Lebesgue(ℝ)
    β ~ Normal()
    σ ~ Exponential()
    y ~ For(x) do xj
        Normal(α + β * xj, σ)
    end
    return y
end
x = randn(20)
predα = predictive(m, :α)
y = rand(predα(x=x,α=10.0))

_args = (x=x,)
_data = (y=y,)
_pars = (σ=2,α=3,β=5)

body = quote
    let
        $(sourceLogdensityOf()(m) |> Soss.loadvals(typeof.((_args, _data, _pars))...))
    end
end

eval(body)

with the last bit giving

julia> body = quote
           let
               $(sourceLogdensityOf()(m) |> Soss.loadvals(typeof.((_args, _data, _pars))...))
           end
       end
quote
    #= REPL[9]:2 =#
    let
        #= REPL[9]:3 =#
        begin
            x::Vector{Float64} = _args.x
            y::Vector{Float64} = _data.y
            σ::Int64 = _pars.σ
            α::Int64 = _pars.α
            β::Int64 = _pars.β
            _ℓ = 0.0
            _ℓ += Soss.logdensityof(Exponential(), σ)
            σ = Soss.predict(Exponential(), σ)
            _ℓ += Soss.logdensityof(Normal(), β)
            β = Soss.predict(Normal(), β)
            _ℓ += Soss.logdensityof(Lebesgue(ℝ), α)
            α = Soss.predict(Lebesgue(ℝ), α)
            _ℓ += Soss.logdensityof(For(x) do xj
                        #= REPL[2]:6 =#
                        Normal(α + β * xj, σ)
                    end, y)
            y = Soss.predict(For(x) do xj
                        #= REPL[2]:6 =#
                        Normal(α + β * xj, σ)
                    end, y)
            nothing
            return _ℓ
        end
    end
end

julia> eval(body)
-272.341

This makes me think it's either a bug in GG, or in how we're calling GG.

cscherrer commented 2 years ago

@thautwarm if I modify _logdensityof like this:

function _logdensityof(::Type{M}, _m::Mdl, _args::A, _data::D, _pars::P) where {M<:TypeLevel, Mdl<:Model, A, D, P}
    body = type2model(Mdl) |> sourceLogdensityOf() |> loadvals(A,D,P)
    @under_global M @q let 
        $body
    end
end

(taking out the @gg and referring to the types more explicitly), I get

julia> logdensityof(m(x=x) | (y=y,), (σ=2,α=3,β=5))
UnderGlobal(TypeLevel{Module, "Buf{17}()"}, :(let
      begin
          x::Vector{Float64} = _args.x
          y::Vector{Float64} = _data.y
          σ::Int64 = _pars.σ
          α::Int64 = _pars.α
          β::Int64 = _pars.β
          _ℓ = 0.0
          _ℓ += Soss.logdensityof(Exponential(), σ)
          σ = Soss.predict(Exponential(), σ)
          _ℓ += Soss.logdensityof(Normal(), β)
          β = Soss.predict(Normal(), β)
          _ℓ += Soss.logdensityof(Lebesgue(ℝ), α)
          α = Soss.predict(Lebesgue(ℝ), α)
          _ℓ += Soss.logdensityof(For(x) do xj
                      Normal
                      α + β * xj
                      σ
                  end, y)
          y = Soss.predict(For(x) do xj
                      Normal
                      α + β * xj
                      σ
                  end, y)
          nothing
          return _ℓ
      end
  end))

It's splitting Normal(α + β * xj, σ) up into separate statements, and I'm not sure why. Didn't we see something like this before? The "access to undefined reference" error is vague and not very helpful; could this splitting be the cause of it? I'm not seeing any issues in Soss itself, but maybe I'm missing something.
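
For what it's worth, a block of statements evaluates to its last expression, so the mangled closure body would return σ rather than Normal(α + β * xj, σ). A quick illustration (hand-written, not GG output):

using MeasureTheory   # for Normal

# A begin block evaluates to its last expression, so the mangled body
# `Normal; α + β * xj; σ` yields σ, not Normal(α + β * xj, σ):
α, β, σ, xj = 3.0, 5.0, 2.0, 0.1
val = begin
    Normal        # evaluated and discarded
    α + β * xj    # evaluated and discarded
    σ             # the value of the whole block
end
val   # == 2.0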

thautwarm commented 2 years ago

I still haven't addressed the concrete issue. I think JuliaVariables is going wrong, and GG seems to fail in its code transformation. I tried manually working through the Soss codegen process and found the failure point there (at the @gg macro).

cscherrer commented 2 years ago

Ok thanks, I wasn't sure this was the same issue. If it is, hopefully it helps to have a second angle to look at it from. I wish I had a deeper understanding of GG so this wouldn't be all on you; let me know if there's anything I can do to help :)

thautwarm commented 2 years ago

I could help with addressing this issue on weekends, or try to fully migrate the codebase to RuntimeGeneratedFunctions.

cscherrer commented 2 years ago

I like GG a lot, especially if there could be a route to breaking up bigger functions. RGF is only an option if there's a workaround for the closures issue, since things like For are often defined in terms of a closure. Even if that were fixed, I'd still need GG for some utility functions.

Anyway, weekends are fine, especially since these GG issues only come up occasionally. And if we reach a point where the future of GG is in question, we can revisit this and plan from there.

thautwarm commented 2 years ago

working on this.

cscherrer commented 2 years ago

Great! I'm around, let me know if there's anything I can do to help

thautwarm commented 2 years ago

#331 finally produces code like this from GG:

quote
    let M.contents
        M = Core.Box()
        begin
            begin
                #= ...\Git\GeneralizedGenerated.jl\src\closure_conv.jl:56 =#
                let freevars = (M,)
                    #= ...\Git\GeneralizedGenerated.jl\src\closure_conv.jl:57 =#
                    (GeneralizedGenerated.Closure){function = (M, _rng;) -> begin
    begin
        M.contents = (Main).rand(_rng, (Main).Normal())
        (M = M.contents,)
    end
end, Base.typeof(freevars)}(freevars)
                end
            end
        end
    end
end

The closure conversion is incorrect when a let-bound variable is shared with sub-contexts: the declaration gets changed from let M; ... end to let M.contents; ... end.
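
For contrast, here's the boxing pattern written by hand (illustrative, not GG output): the binding keeps the plain name M, and only the reads and writes go through .contents:

# Hand-written illustration of the correct boxing pattern (not GG
# output): the binding M stays a plain name; only accesses use .contents.
function make_counter()
    M = Core.Box()                   # box M because the closures reassign it
    M.contents = 0
    inc = () -> (M.contents += 1)    # write goes through .contents
    cur = () -> M.contents           # read goes through .contents
    return inc, cur
end

inc, cur = make_counter()
inc(); inc()
cur()   # == 2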

thautwarm commented 2 years ago

I need to add a field to JuliaVariables.Var to indicate LHS/RHS info, and use this to fix the issue.

cscherrer commented 2 years ago

Do you think we really need Core.Box for this? I see this coming up in some other cases too. Doesn't this hurt performance?

thautwarm commented 2 years ago

Yes, it does hurt performance, but to allow re-assigning variables (of any type) in closures, we have to do this. GG cannot use type inference results to optimize Core.Box into a Ref.

However, I think I can make a special variant for users who want this... Ah, I remember I said this 2 years ago too...
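
To make the cost concrete, a small sketch (not GG code): a Core.Box capture is untyped, so every access is dynamic and allocates, while a typed Ref keeps inference intact; that's exactly the optimization GG can't do without type information:

# Illustrative only: boxed capture (Any-typed) vs. typed Ref capture.
function sum_boxed(xs)
    acc = Core.Box()
    acc.contents = 0.0
    foreach(x -> acc.contents = acc.contents + x, xs)   # dynamic, allocates
    return acc.contents
end

function sum_reffed(xs)
    acc = Ref(0.0)
    foreach(x -> acc[] = acc[] + x, xs)                 # stays Float64
    return acc[]
end

xs = randn(10^6)
# @time sum_boxed(xs)    # many allocations from Any-typed contents
# @time sum_reffed(xs)   # close to zero allocations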

cscherrer commented 2 years ago

I think it's fine to restrict the type at codegen time, if that's what you mean. For a different type we can just regenerate the code

cscherrer commented 2 years ago

Also I think I may have all of the types available statically

thautwarm commented 2 years ago

I think it's fine to restrict the type at codegen time, if that's what you mean. For a different type we can just regenerate the code

I'm not sure if GG can achieve this. See the following code:

(f1, f2) =  let M = 1
    function f1(x)
        M = x
    end
    function f2()
        M
    end
    f1, f2
end

julia> f1(2)
2

julia> f2()
2

julia> f1("asda")
"asda"

julia> f2()
"asda"

It's hard for GG to generate a typed closure variable M.
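
A hand-written sketch of what the "restrict the type at codegen time" idea would mean for this same example:

# Sketch of the proposed restriction: M is fixed to Int at codegen time,
# so assigning a String errors instead of widening, and the caller would
# regenerate f1/f2 for the new type.
(f1, f2) = let M = Ref{Int}(1)
    f1(x) = (M[] = x)
    f2() = M[]
    f1, f2
end

f1(2)          # ok
f2()           # 2
# f1("asda")   # MethodError: no conversion from String to Int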

cscherrer commented 2 years ago

Hmm, yes that is tricky. Does the Soss example really need so much flexibility in order to work?

cscherrer commented 2 years ago

I had thought that even an opaque closure could be fine, but maybe I'm missing something.
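
For reference, Base.Experimental.@opaque (available since Julia 1.6) captures variables by value at creation time, which avoids boxing but also rules out the mutable f1/f2 pattern above; a minimal sketch:

# Minimal sketch (assumes Julia >= 1.6): an opaque closure captures r by
# value when it's created, so no Core.Box is involved.
f = let r = 3
    Base.Experimental.@opaque x -> x * r
end
f(2)   # 6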

thautwarm commented 2 years ago

#331 is fixed with my patch, but this issue still fails.

cscherrer commented 2 years ago

If we forced everything to be type-stable -- no boxes -- do you see anything that would break?

cscherrer commented 2 years ago

I'm seeing other cases where this problem comes up. In Tilde, say we start with

julia> m = @model begin
           β ~ Normal()
           y ~ For(20) do j
               Normal(μ = β * j)
           end
       end;

julia> post = m(x=x) | (y=y,)
ModelPosterior given
    arguments    (:x,)
    observations (:y,)
@model begin
        β ~ Normal()
        y ~ For(20) do j
                Normal(μ = β * j)
            end
    end

Then logdensityof generates this code:

julia> logdensityof(post, (β=2,))
quote
    $(Expr(:meta, :inline))
    β = Core.Box()
    $(Expr(:meta, :((Main).inline)))
    local _retn
    _args = (argvals)(_mc)
    _obs = (observations)(_mc)
    _cfg = (Main).merge(_cfg, (args = _args, obs = _obs, pars = _pars))
    local x
    local β.contents
    local y
    x::Vector{Float64} = _args.x
    y::Vector{Float64} = _obs.y
    β.contents::Int64 = _pars.β
    (β.contents, _ctx, _retn) = (tilde)(logdensityof, (opticcompose)(), static(:β), (Unobserved)(β.contents), (Main).Normal(), _cfg, _ctx)
    (Main).isa(_retn, (Main).Tilde.ReturnNow) && return _retn.value
    (y, _ctx, _retn) = (tilde)(logdensityof, (opticcompose)(), static(:y), (Observed)(y), 
        (Main).For(begin
                let freevars = (β,)
                    (Closure){function = (β, j;) -> begin
                        begin
                            (Main).Normal(μ = (Main).:*(β.contents, j))
                        end
                    end, Base.typeof(freevars)}(freevars)
                end
            end, 20), 
        _cfg, _ctx)
    (Main).isa(_retn, (Main).Tilde.ReturnNow) && return _retn.value
    return (#61)(_ctx, _ctx)
end

That local β.contents throws an error. But I don't see why β needs to be boxed at all.

thautwarm commented 2 years ago

If we forced everything to be type-stable -- no boxes -- do you see anything that would break?

I can try this tomorrow. I can make a special @gg, maybe @typed_gg this week.

cscherrer commented 2 years ago

Sounds good, thank you!

thautwarm commented 2 years ago

I'm seeing other cases where this problem comes up. In Tilde, say we start with

I dove into the generated code, and saw this output (generated by GG) for this issue:

quote
    let M
        α = Core.Box()
        σ = Core.Box()
        β = Core.Box()
        begin
            begin
                x::Vector{Float64} = _args.x
                y::Vector{Float64} = _data.y
                σ.contents::ForwardDiff.Dual{ #= VERY LONG =# } = _pars.σ

It's very interesting that such code is not allowed:

julia> x :: Int = 2
ERROR: syntax: type declarations on global variables are not yet supported
Stacktrace:
 [1] top-level scope
   @ REPL[3]:1

julia> let
           x :: Int = 2
       end
2

julia> let x = Core.Box()
           x.contents::Int = 2
       end
ERROR: UndefRefError: access to undefined reference
Stacktrace:
 [1] getproperty(x::Core.Box, f::Symbol)
   @ Base .\Base.jl:42
 [2] top-level scope
   @ REPL[5]:2

thautwarm commented 2 years ago

I'm sorry about this issue; it's a bug I was totally unaware of. But if you can generate

a = b :: t

instead of

a :: t = b

then we can avoid this issue.
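
Concretely, the two forms differ in when the field is read: with a boxed variable, a :: t = b ends up reading the still-undefined contents (as the stack trace above shows), while a = b :: t only asserts the right-hand side before assigning. A sketch:

# Sketch of the fix: type-assert the RHS instead of annotating the LHS,
# so nothing reads the still-undefined box contents.
let x = Core.Box()
    x.contents = (2)::Int   # assert 2 isa Int, then assign; no read
    x.contents              # 2
end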

cscherrer commented 2 years ago

Sure, that's no problem. But I had added the local declarations earlier to avoid some type instability, and now I get local β.contents, which throws an error.

Maybe I can get away without the local declarations now? I'll try that.

thautwarm commented 2 years ago

Just wait a while; I'm submitting my code.

thautwarm commented 2 years ago

When you have time, could you please try GG's master branch with Soss?

#331 should be fixed.

Changing a :: t = b to a = b :: t should fix this issue, I think.

thautwarm commented 2 years ago

Plus: we really do need debuggers for generated functions... viewing the generated code is soooooo difficult!

cscherrer commented 2 years ago

💯 There's a nice trick of just removing the @generated, but for that to work, everything outside the quotes must refer to the types, not the values.

In Soss there's sourceLogdensityOf etc., but that doesn't quite cover all of the code. I think with some updates I can make it better.
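
As a sketch, a hypothetical non-@generated twin of _logdensityof (reusing only functions shown above) would make the emitted expression directly inspectable:

# Hypothetical debugging helper, not part of Soss: same body as the @gg
# version, but an ordinary function of the types, so the expression it
# builds can be printed or dump()ed instead of being compiled.
function _logdensityof_source(::Type{Mdl}, ::Type{A}, ::Type{D}, ::Type{P}) where {Mdl<:Model, A, D, P}
    return type2model(Mdl) |> sourceLogdensityOf() |> loadvals(A, D, P)
end

# Usage: _logdensityof_source(typeof(m), typeof.((_args, _data, _pars))...) |> dump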

cscherrer commented 2 years ago

Ok, I had been trying

@gg function _logdensityof(::Type{M}, _m::Mdl, _args::A, _data::D, _pars::P) where {M<:TypeLevel, Mdl<:Model, A, D, P}
    body = type2model(Mdl) |> sourceLogdensityOf() |> loadvals(A,D,P)
    @under_global M @q let 
        $body
    end
end

This fails with the same error, but now the original works:

@gg function _logdensityof(M::Type{<:TypeLevel}, _m::Model, _args, _data, _pars)
    body = type2model(_m) |> sourceLogdensityOf() |> loadvals(_args, _data, _pars)
    @under_global from_type(_unwrap_type(M)) @q let M
        $body
    end
end

cscherrer commented 2 years ago

BTW the type annotation approach I was using comes from https://docs.julialang.org/en/v1/manual/performance-tips/#man-performance-annotations

(emphasis added)

If captured variables are used in a performance-critical section of the code, then the following tips help ensure that their use is performant. First, if it is known that a captured variable does not change its type, then this can be declared explicitly with a type annotation (on the variable, not the right-hand side):

function abmult2(r0::Int)
    r::Int = r0
    if r < 0
        r = -r
    end
    f = x -> x * r
    return f
end
cscherrer commented 2 years ago

@sethaxen I've just tagged a new release that includes the README example in the tests. This temporarily uses a modified GeneralizedGenerated that's hard-coded in the repo, so GG is currently not a dependency. We'll add it back once there's a more general solution to the current GG issue that's blocking us.

Closing this now, please reopen if you still see the same problem. And thanks again :)