API to handle nested dimensional stochastic arguments

binarybana commented 10 years ago

I'm trying to implement a model (all code and errors here) where each lambda is two dimensional (across i and j), and is distributed according to a multivariate normal. (see \lambda in plate model below but ignore the y variable).

plate_model

When defining the stochastic variable lambda in the model section, I don't see how to do this other than as a one dimensional array of Multivariate Normal distributions (as seen on line 32), but then when I try to initialize the chain, it gives me the error:

julia> include("mpm.jl")
MCMC Simulation of 10000 Iterations x 4 Chains...

exception on 1: ERROR: `convert` has no method matching convert(::Type{Float64}, ::Array{Float64,1})
 in copy! at abstractarray.jl:149
 in setinits! at /home/jason/.julia/Mamba/src/model/dependent.jl:120
 in setinits! at /home/jason/.julia/Mamba/src/model/initialization.jl:26
 in mcmc_sub! at /home/jason/.julia/Mamba/src/model/mcmc.jl:28
 in anonymous at multi.jl:652
 in run_work_thunk at multi.jl:613
 in remotecall_fetch at multi.jl:686
 in remotecall_fetch at multi.jl:701
 in anonymous at task.jl:1348
ERROR: `convert` has no method matching convert(::Type{Chains}, ::MethodError)
 in getindex at array.jl:121
 in mcmc at /home/jason/.julia/Mamba/src/model/mcmc.jl:20
 in include at ./boot.jl:245
 in include_from_node1 at ./loading.jl:128
while loading /home/jason/GSP/research/samc/mamba/mpm.jl, in expression starting on line 47

So it looks like it is trying to convert the array of arrays into an array of floats. I've also tried considering it as a two dimensional stochastic variable (with matrix initialization), but then I don't see how to represent the stochastic variable (you can see the error message at the gist above).

Hopefully that wasn't completely unintelligible. Thanks!

brian-j-smith commented 10 years ago

@binarybana: Thanks for explaining your model and providing the code. What you're trying to do makes perfect sense... although, there are two things in your code that are on my todo list for the package, but not yet implemented. (I was wondering how long it would be before someone wanted to do these - turns out, not very long.)

The first one is returning a Distributions array with multivariate distributions (MultivariateNormal) in the elements. Right now, the package only supports return types that are either a single univariate/multivariate/matrix distribution or an array of univariate distributions. In your case, a work-around would be to define lambda (lam) to be a 1-dimensional vector for which a single multivariate normal distribution is specified, with the covariance matrix being a Kronecker product of sig and an identity matrix. I can see how being able to handle arrays of multivariate distributions would be a nice feature, but it may not move up on my todo list for a while.

The second one is sampling a covariance matrix. A challenge in doing this is ensuring that draws produce positive-definite matrices. The currently implemented samplers (AMM, AMWG, NUTS, Slice, SliceWG) are not designed to satisfy that condition. To do this, the package will need a new sampler designed specifically for covariance matrices. This is higher up on my todo list. I'll see what I can do on this front this weekend.

Best.

binarybana commented 10 years ago

Thanks for the response. I'm curious how you plan on writing your sampler for PD matrices? I have been tackling this problem myself and I have thought of a few (untested!) approaches for proposal distributions for Metropolis Hastings:

InverseWishart draws using the current covariance matrix as the (properly scaled) scale matrix
Using an underlying matrix A which is perturbed element-wise by normal distributions, A'*A is then the covariance matrix
Using Cholesky or eigen decomposition and somehow modifying them at each step.

I'm testing approach 2 here (just ignore my simple, non-adaptive MH!), and it seems to be doing quite well (and significantly faster than InverseWishart sampling).

brian-j-smith commented 10 years ago

Step one of my plan is to come up with a plan. Probably a good place to start is with approaches that are working well for other people. So, your input here is greatly appreciated. I'll definitely take a look at your code and let you know if I have any questions or if it looks like something that would work in Mamba.

Otherwise, I need to do some literature review and review of other software to get up to speed on the topic of general samplers for PD matrices. Starting with simpler approaches also makes sense. I'll keep you posted.

brian-j-smith commented 10 years ago

@binarybana: I'm going to take a stab at implementing method 3; i.e., sampling parameters for the covariance matrix factorization Sigma = L * D * L', where L is lower-unit triangular and D is diagonal with positive elements. This should be the most straightforward approach right now in Mamba, be able to utilize the existing samplers, and require no special action on the part of the user. It will take a couple weeks to get it implemented if all goes as expected (knock on wood).

brian-j-smith commented 10 years ago

@binarybana: Sampling of covariance matrices will be working in the next version of Mamba, to be released shortly after the next Distributions package version. Changes were needed to both packages to get this working. An efficient method for specifying multivariate normals, like the lambda parameters in your model, will also be included.

brian-j-smith commented 10 years ago

@binarybana - You've probably graduated, raised a family, and become a full professor by now... However, in case you are are still interested, below is a working version of your model.

## Requires:
##   Mamba 0.3.3-
## Recommended:
##   Distributions 0.5.5-
##   (ealier versions may work but be prone to SingularException errors)

using Mamba

data = (Symbol=>Any)[
    :x1 => [10, 12, 14, 9],
    :x2 => [10, 11, 12, 8],
    :d => 10.0]

data[:y] = vec([data[:x1] data[:x2]]')
N = data[:N] = length(data[:y])
data[:X] = kron(ones(div(N, 2)), eye(2))

model = Model(
  y = Stochastic(1,
    @modelexpr(N, lam, d,
      Distribution[
      begin
        rate = d * exp(lam[i])
        Poisson(rate)
      end
      for i = 1:N]
    ),
    false),

  lam = Stochastic(1,
    @modelexpr(X, mu, sig,
      BDiagNormal(X * mu, sig)
    ),
    true),

  mu = Stochastic(1, :(IsoNormal(2, 10)), true),
  sig = Stochastic(2, :(InverseWishart(10, eye(2))), true)
)

inits = [[:y => data[:y], :lam => randn(N),
          :mu => zeros(2), :sig => eye(2)] for i = 1:4]

scheme = [AMWG([:lam], ones(N)),
          AMWG([:mu], ones(2) / 10),
          AMWG([:sig], ones(3))]
setsamplers!(model, scheme)

sim = mcmc(model, data, inits, 5000, burnin=250, thin=2, chains=4)
describe(sim)

Some things to note:

There are different ways the model can be implemented. I used a regression style implementation, with all outputs stored in a 1-dimensional vector y, and with a design matrix X to map the mu parameters to the outputs.
BDiagNormal is a new distribution implemented in the Mamba package as a multivariate normal distribution with a block-diagonal covariance matrix. It's first and second arguments are a vector of mean values and a sub-matrix to be replicated as the block diagonal elements of the covariance matrix (off-diagonals are zero). It seems to work fine, but is not officially documented yet.
Since covariance matrices are symmetric, they have n(n+1)/2 unique parameters, where n is the matrix dimension. Consequently, that is the number of parameters to account for when defining samplers for sig; e.g., AMWG([:sig], ones(3)).
The covariance matrix is mapped to an unconstrained space, where standard samplers can be used, and then mapped back to its original space. If you are interested in the mapping, there is some detail about that in the link() function documentation and the corresponding code.

brian-j-smith commented 10 years ago

Officially implemented and documented in version 0.3.4.

binarybana commented 9 years ago

@brian-j-smith, close! I'm about to graduate, have a third kid on the way, and am still in the job application process. :)

Thanks for the detailed writeup and feature adds in Mamba.jl, it's exciting seeing this package grow so quickly.

brian-j-smith commented 9 years ago

Haha. Congratulations on all the excitement going on in your life. That’s great! Best of luck in the job search.

From: Jason Knight [mailto:notifications@github.com] Sent: Tuesday, January 06, 2015 1:58 PM To: brian-j-smith/Mamba.jl Cc: Smith, Brian J Subject: Re: [Mamba.jl] API to handle nested dimensional stochastic arguments (#2)

@brian-j-smithhttps://github.com/brian-j-smith, close! I'm about to graduate, have a third kid on the way, and am still in the job application process. :)

Thanks for the detailed writeup and feature adds in Mamba.jl, it's exciting seeing this package grow so quickly.

— Reply to this email directly or view it on GitHubhttps://github.com/brian-j-smith/Mamba.jl/issues/2#issuecomment-68923693.

brian-j-smith / Mamba.jl

API to handle nested dimensional stochastic arguments #2