luiarthur / TuringBnpBenchmarks

Benchmarks of Bayesian Nonparametric models in Turing and other PPLs
https://luiarthur.github.io/TuringBnpBenchmarks/
MIT License
29 stars 1 forks source link

Can't replicate Turing CRP DPMM example #9

Closed luiarthur closed 4 years ago

luiarthur commented 4 years ago

I am unsuccessful in replicating the infinite GMM example here: https://turing.ml/dev/tutorials/6-infinitemixturemodel/

I'm using the Project.toml and Manifest.toml files from: https://github.com/luiarthur/TuringBnpBenchmarks/

This is what I'm trying to run. It's almost verbatim what's in the tutorial (just replaced some Unicode):

using Turing
import Turing.RandomMeasures.DirichletProcess
import Turing.RandomMeasures.ChineseRestaurantProcess
using Distributions
using PyPlot
import Random
using BenchmarkTools
import StatsBase.countmap

# FIXME: Not working???

# Define model
@model infiniteGMM(x) = begin
    nobs = length(x)

    # Hyper-parameters, i.e. concentration parameter and parameters of H.
    alpha = 1.0
    mu0 = 0.0
    sig0 = 1.0

    # Define random measure, e.g. Dirichlet process.
    rpm = DirichletProcess(alpha)

    # Define the base distribution, i.e. expected value of the Dirichlet process.
    H = Normal(mu0, sig0)

    # Latent assignment.
    z = tzeros(Int, nobs)

    # Locations of the infinitely many clusters.
    mu = tzeros(Float64, 0)

    for i in 1:nobs
        # Number of clusters.
        K = maximum(z)
        nk = Vector{Int}(map(k -> sum(z .== k), 1:K))

        # Draw the latent assignment.
        z[i] ~ ChineseRestaurantProcess(rpm, nk)

        # Create a new cluster?
        if z[i] > K
            push!(mu, 0.0)

            # Draw location of new cluster.
            mu[z[i]] ~ H
        end

        # Draw observation.
        x[i] ~ Normal(mu[z[i]], 1.0)
    end
end

# Generate data
Random.seed!(1)
data = vcat(randn(10), randn(10) .- 5, randn(10) .+ 10)
data .-= mean(data)
data /= std(data);

# Fit model
Random.seed!(2)
iterations = 1000
model_fun = infiniteGMM(data)
chain = sample(model_fun, SMC(), iterations)

This is the error I'm getting:

BoundsError: attempt to access 0-element Array{Any,1} at index [1]

Stacktrace:
 [1] getindex(::Array{Any,1}, ::Int64) at ./array.jl:788
 [2] ess(::Chains{Union{Missing, Float64},Float64,NamedTuple{(:internals, :parameters),Tuple{Array{String,1},Array{String,1}}},NamedTuple{(),Tuple{}}}; showall::Bool, sections::Array{Symbol,1}, maxlag::Int64, digits::Int64, sorted::Bool) at /home/ubuntu/.julia/packages/MCMCChains/pVyO1/src/stats.jl:376
 [3] summarystats(::Chains{Union{Missing, Float64},Float64,NamedTuple{(:internals, :parameters),Tuple{Array{String,1},Array{String,1}}},NamedTuple{(),Tuple{}}}; append_chains::Bool, showall::Bool, sections::Array{Symbol,1}, etype::Symbol, digits::Int64, sorted::Bool, args::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/ubuntu/.julia/packages/MCMCChains/pVyO1/src/stats.jl:441
 [4] describe(::IJulia.IJuliaStdio{Base.PipeEndpoint}, ::Chains{Union{Missing, Float64},Float64,NamedTuple{(:internals, :parameters),Tuple{Array{String,1},Array{String,1}}},NamedTuple{(),Tuple{}}}; q::Array{Float64,1}, etype::Symbol, showall::Bool, sections::Array{Symbol,1}, digits::Int64, sorted::Bool, args::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/ubuntu/.julia/packages/MCMCChains/pVyO1/src/stats.jl:203
 [5] describe(::IJulia.IJuliaStdio{Base.PipeEndpoint}, ::Chains{Union{Missing, Float64},Float64,NamedTuple{(:internals, :parameters),Tuple{Array{String,1},Array{String,1}}},NamedTuple{(),Tuple{}}}) at /home/ubuntu/.julia/packages/MCMCChains/pVyO1/src/stats.jl:203
 [6] describe(::Chains{Union{Missing, Float64},Float64,NamedTuple{(:internals, :parameters),Tuple{Array{String,1},Array{String,1}}},NamedTuple{(),Tuple{}}}; args::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at /home/ubuntu/.julia/packages/MCMCChains/pVyO1/src/stats.jl:174
 [7] describe at /home/ubuntu/.julia/packages/MCMCChains/pVyO1/src/stats.jl:174 [inlined]
 [8] show(::IOContext{Base.GenericIOBuffer{Array{UInt8,1}}}, ::Chains{Union{Missing, Float64},Float64,NamedTuple{(:internals, :parameters),Tuple{Array{String,1},Array{String,1}}},NamedTuple{(),Tuple{}}}) at /home/ubuntu/.julia/packages/MCMCChains/pVyO1/src/chains.jl:283
 [9] show at ./multimedia.jl:47 [inlined]
 [10] limitstringmime(::MIME{Symbol("text/plain")}, ::Chains{Union{Missing, Float64},Float64,NamedTuple{(:internals, :parameters),Tuple{Array{String,1},Array{String,1}}},NamedTuple{(),Tuple{}}}) at /home/ubuntu/.julia/packages/IJulia/DrVMH/src/inline.jl:43
 [11] display_mimestring(::MIME{Symbol("text/plain")}, ::Chains{Union{Missing, Float64},Float64,NamedTuple{(:internals, :parameters),Tuple{Array{String,1},Array{String,1}}},NamedTuple{(),Tuple{}}}) at /home/ubuntu/.julia/packages/IJulia/DrVMH/src/display.jl:67
 [12] display_dict(::Chains{Union{Missing, Float64},Float64,NamedTuple{(:internals, :parameters),Tuple{Array{String,1},Array{String,1}}},NamedTuple{(),Tuple{}}}) at /home/ubuntu/.julia/packages/IJulia/DrVMH/src/display.jl:96
 [13] #invokelatest#1 at ./essentials.jl:712 [inlined]
 [14] invokelatest at ./essentials.jl:711 [inlined]
 [15] execute_request(::ZMQ.Socket, ::IJulia.Msg) at /home/ubuntu/.julia/packages/IJulia/DrVMH/src/execute_request.jl:112
 [16] #invokelatest#1 at ./essentials.jl:712 [inlined]
 [17] invokelatest at ./essentials.jl:711 [inlined]
 [18] eventloop(::ZMQ.Socket) at /home/ubuntu/.julia/packages/IJulia/DrVMH/src/eventloop.jl:8
 [19] (::IJulia.var"#15#18")() at ./task.jl:358
trappmartin commented 4 years ago

Looks like the computation of the effective sample size has been changed and doesn't work with the resulting chain. I can have a look. For now, you can run

chain = sample(model_fun, SMC(), iterations);

To ensure that the describe function is not called. (We call it if Julia requests the show method, i.e. the REPL shows the resulting chain.)

luiarthur commented 4 years ago

Thanks. So, I tried adding the semi-colon, and it ran because no output was printed, so no calls to describe were made.

I'm able to call chain.value, but chain[:mu], etc. throws essentially the same error as before.

luiarthur commented 4 years ago

I think I'm able to continue from here. Thanks!

Should I close this issue and create a separate issue on the Turing repo?

trappmartin commented 4 years ago

Yes, that is a good idea.