TuringLang / Turing.jl

Bayesian inference with probabilistic programming.
https://turinglang.org
MIT License

Original and collapsed models specification for MFM model #427

mlomeli1 closed this issue 5 years ago

mlomeli1 commented 6 years ago

I have the following mixture of finite mixtures model:

@model MFM(gam, alp, p, μ, σ) = begin
  m ~ Poisson(gam)
  probs ~ Dirichlet(m, alp)
  z ~ Categorical(probs)
  x ~ Normal(μ[z], σ)
end

This model is similar to the GMM example, but the difference is that the total number of components is random. Also, I would like to add a prior for the means (say Normal, so it is conjugate): should I do this after I write my model, or how can I include it in the model specification?
I would also like to do inference, but perhaps it would be much better to use a collapsed representation of the model. Do I need to write a new method with the joint distribution and then a new collapsed model, like the cGMM2 one used for NUTS? Please help @yebai @emilemathieu @xukai92.
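For concreteness, here is a sketch of what I mean by including the prior in the model specification (mu_0 and sigma_0 are hypothetical hyperparameters, and I am not sure how to handle the m = 0 case):

@model MFM(gam, alp, sigma) = begin
  m ~ Poisson(gam)                 # random number of components
  probs ~ Dirichlet(m, alp)        # symmetric Dirichlet over the mixture weights
  mu = tzeros(Float64, m)
  for j in 1:m
    mu[j] ~ Normal(mu_0, sigma_0)  # conjugate Normal prior on each component mean
  end
  z ~ Categorical(probs)
  x ~ Normal(mu[z], sigma)
end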

mlomeli1 commented 6 years ago

I think the collapsed model would look something like this:

@model mixtureoffinitemixtures(y) = begin
  N = length(y)
  H = Normal(mu_0, sigma_0)
  x = tzeros(Float64, N); z = tzeros(Int, N)
  k = 1
  for i in 1:N
    # block_sizes (the current cluster sizes) and n (the number of observations
    # seen so far) are assumed to be maintained alongside the loop
    ps = vcat(old_ratio(block_sizes, n, k), new_ratio(n, k))
    z[i] ~ Categorical(ps)
    if z[i] > k
      k = k + 1
      x[k] ~ H
    end
    y[i] ~ Normal(x[z[i]], sigma_1)
  end
end

The "old_ratio.jl" function outputs a vector of size k with the probabilities of belonging to an existing cluster and "new_ratio.jl" is just the probability of belonging to a new cluster. Note that this is super similar to @emilemathieu's code using the hybrid sampler representation. I tried running Turing.jl/test/rpm.jl/imm.jl but it gives the following error: UndefVarError: SBS_DP not defined. Could you help me to sort this out so I can run Emile's code and contribute with my example?

emilemathieu commented 6 years ago

Hi Maria (@mlomeli1)! Sorry for the late reply.

With the changes made at the end of 2017, Turing is indeed able to deal with random vectors of random size. SMC should run without any issue, but you can't run HMC-like samplers (such as NUTS) on a space with a random number of dimensions.

The error you have is because the imm example is on a different branch than master, so it cannot find SBS_DP (the size-biased distribution of the DP).

The following should work:

@model mixtureoffinitemixtures(y) = begin
  N = length(y)
  H = Normal(mu_0, sigma_0)
  x = tzeros(Float64, N); z = tzeros(Int, N)
  k = 1
  for i in 1:N
    ps = vcat(old_ratio(block_sizes,n,k), new_ratio(n,k))
    z[i] ~ Categorical(ps)
    if z[i] > k
      k = k + 1
      x[k] ~ H
    end
    y[i] ~ Normal(x[z[i]], sigma_1)
  end
end

sampler = SMC(50)
results = sample(mixtureoffinitemixtures(data), sampler)

Best, Emile

mlomeli1 commented 6 years ago

That's great, thank you @emilemathieu. I will try it out in the collapsed version so I won't have to deal with the parameters. I guess we still need to think about how to do HMC for this case? Also, if I wanted to run the imm example, would it work to save the pyp file locally and add include("SBS_DP.jl") (and analogously include("SBS_NIGP.jl")), or do I need to clone the whole branch?

emilemathieu commented 6 years ago

The easiest may be to include the folder Turing.jl/src/samplers/support/rpm/ and then make sure to include("support/rpm/rpm.jl") in sampler.jl.
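Concretely, something like this is what I mean (a sketch; adjust the path to wherever you copy the folder):

# in Turing.jl/src/samplers/sampler.jl
include("support/rpm/rpm.jl")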

You should be able to perform inference with a Gibbs sampler, using CSMC (or another valid particle method) to target the random variables with varying dimension and HMC to target the other random variables (such as hyperparameters).
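Roughly, that composition could look like the following (just a sketch; the exact sampler constructors differ between Turing releases, some_model is a placeholder, and :z / :theta stand for the varying-dimension variables and a hypothetical continuous hyperparameter):

using Turing

# Particle Gibbs (conditional SMC) for the allocation/varying-dimension variables,
# HMC for the continuous hyperparameter.
sampler = Gibbs(PG(20, :z), HMC(0.05, 10, :theta))
chain = sample(some_model(data), sampler, 1000)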

mlomeli1 commented 6 years ago

Hi @emilemathieu, I stopped working on this issue but have now resumed it. I still cannot make the imm example work. How can I get the folder Turing.jl/src/samplers/support/rpm/ onto my machine so that I can include("support/rpm/rpm.jl") in sampler.jl? Maybe @trappmartin has made this example run?

yebai commented 6 years ago

@trappmartin Maria is a collaborator on the Turing BNP project. Could you help Maria out with this issue if possible, please?

trappmartin commented 6 years ago

Hi, @mlomeli1. I haven't looked at it so far, but I'm happy to take a look asap. I'll come back to you soon.

mlomeli1 commented 6 years ago

Hi @yebai and @trappmartin , thank you very much, that will be very useful :)

trappmartin commented 6 years ago

Hi @mlomeli1, I just checked the example script of the iMM using the HEAD of the project-bnp branch, and it seems to work fine on a Mac (with minor adjustments to the Makefile). Are you using this branch? Also, could you explain in more detail what problems you are having?

mlomeli1 commented 6 years ago

Hi @trappmartin , thank you very much for having a look! What I did was run the following (from Emile's imm example):

using Turing
#using Turing: SBS_DP, SBS_PYP, T_NIGP, SBS_NIGP
using Turing: realpart, CHUNKSIZE
srand(100)
include("SBS_NIGP.jl")
data = [9172.0;9350.0;9483.0;9558.0;9775.0;10227.0;10406.0;16084.0;16170.0;18419.0;18552.0;18600.0;18927.0;19052.0;19070.0;19330.0;19343.0;19349.0;19440.0;19473.0;19529.0;19541.0;19547.0;19663.0;19846.0;19856.0;19863.0;19914.0;19918.0;19973.0;19989.0;20166.0;20175.0;20179.0;20196.0;20215.0;20221.0;20415.0;20629.0;20795.0;20821.0;20846.0;20875.0;20986.0;21137.0;21492.0;21701.0;21814.0;21921.0;21960.0;22185.0;22209.0;22242.0;22249.0;22314.0;22374.0;22495.0;22746.0;22747.0;22888.0;22914.0;23206.0;23241.0;23263.0;23484.0;23538.0;23542.0;23666.0;23706.0;23711.0;24129.0;24285.0;24289.0;24366.0;24717.0;24990.0;25633.0;26960.0;26995.0;32065.0;32789.0;34279.0]
data /= 1e4
mu_0 = mean(data); sigma_0 = 1/sqrt(0.635); sigma_1 = sigma_0/15

@model infiniteMixture(y) = begin
  N = length(y)
  H = Normal(mu_0, sigma_0)

  x = tzeros(Float64, N); J = tzeros(Float64, N); z = tzeros(Int, N)
  k = 0
  T = 10
  T_surplus = T

  for i in 1:N
    ps = vcat(J[1:k]/T, T_surplus/T)
    z[i] ~ Categorical(ps)
    if z[i] > k
      k = k + 1
    #   J[k] ~ SBS_DP(10, T_surplus)
    #   J[k] ~ SBS_PYP(0.5, 1, k, T_surplus)
      J[k] ~ SBS_NIGP(T_surplus)
      x[k] ~ H
      T_surplus -= J[k]
    end
    y[i] ~ Normal(x[z[i]], sigma_1)
  end
end

sampler = SMC(50)
permutation = randperm(length(data))
results = sample(infiniteMixture(data[permutation]), sampler)

and I get the error UndefVarError: SBS_NIGP not defined. Note that I installed Turing some time ago, so I may not have the latest version. What I will do now is install Julia and Turing from scratch (I am on my work laptop), then clone the project-bnp branch and try to run it. I will report back shortly.

mlomeli1 commented 6 years ago

I am having trouble making Julia work in Atom :(. @trappmartin, I just invited you to a repo that has some code for the mixture of finite mixtures model that I mentioned when I opened this issue. At the moment I have very limited time, but I can tell you more about my project if you have time/are interested. I will have another go at running the imm example this evening from my home laptop, where everything is installed.

trappmartin commented 6 years ago

Hi Maria, I'm happy to hear more about the project. Let's talk while I'm in Cambridge.

Regarding the code you posted, it seems you are using your own implementation of the SBS_NIGP function. Is there a reason for doing that rather than using the using Turing: SBS_NIGP statement? Depending on the implementation in SBS_NIGP.jl you might need further dependencies, but that's hard to tell without seeing it.

mlomeli1 commented 6 years ago

Hi @trappmartin, thank you, I wrote you an email yesterday. Is your email still mt752@cam.ac.uk? I was suggesting to talk today via skype, does that work? Thank you.

mlomeli1 commented 6 years ago

@trappmartin helped a lot yesterday. I am still a newbie, so I am having some trouble integrating my code for the MFM into Turing. Is there a Jupyter notebook somewhere that uses SMC for the Dirichlet process mixture model with the Chinese restaurant representation? @xukai92, @yebai, @trappmartin? Cheers!

mlomeli1 commented 6 years ago

Ok, I have created an example for the DPM:

@model DPmixture(y) = begin
  N = length(y)
  H = Normal(mu_0, sigma_0)

  # α, mu_0, sigma_0 and sigma_1 are assumed to be defined as in the earlier script
  x = tzeros(Float64, N); J = tzeros(Float64, N); z = tzeros(Int, N)
  k = 1

  for i in 1:N
    # CRP-style predictive weights: existing cluster counts J[1:k] plus α for a new cluster
    ps = vcat(J[1:k], α)
    ps = ps / sum(ps)
    z[i] ~ Categorical(ps)
    J[z[i]] = J[z[i]] + 1
    print(J)
    if z[i] > k
      k = k + 1
      x[k] ~ H
    end
    y[i] ~ Normal(x[z[i]], sigma_1)
  end
end

sampler = SMC(50)
permutation = randperm(length(data))
results = sample(DPmixture(data[permutation]), sampler)

After the ProbProg submission I can contribute it in the form of a notebook :) This is very simple and it is based on the imm example from @emilemathieu. I will work on the MFM now.

emilemathieu commented 6 years ago

@mlomeli1 How did this eventually go?

mlomeli1 commented 6 years ago

Hi @emilemathieu , it went well, thank you! I managed to submit and I will be presenting a poster about it at ProbProg :) Are you going?

trappmartin commented 6 years ago

Congrats! Would you be willing to contribute a DPM example in the form of a notebook?

As I'll be preparing some new tutorials for Turing, it would be great if I could ask you (via email) about possible pitfalls you encountered while using Turing.

mlomeli1 commented 6 years ago

Yes, definitely @trappmartin! I will be working on my poster, and one of the experiments I want to do is compare the MFM (mixture of finite mixtures) with the DPM. I will spend some time on this over the next few weeks, so I will keep you posted and can create a notebook for both.