Closed krz closed 8 years ago
Try not using NUTS (use Metropolis instead).
Also, use Normal(..., sd=10) explicitly: by default, pymc3 interprets the second positional parameter as a precision, not a standard deviation.
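The sd-versus-precision point can be checked numerically. A short scipy sketch (the pm.Normal lines in the comments reflect the pymc3 API as discussed in this thread; tau = 1/sd**2):

```python
import numpy as np
from scipy import stats

# In pymc3, Normal(..., sd=10) and Normal(..., tau=0.01) describe the same
# distribution, because tau = 1 / sd**2. Passing 10 positionally would instead
# be read as a precision of 10, i.e. a much narrower prior (sd ≈ 0.316).
sd = 10.0
tau = 1.0 / sd**2  # precision implied by sd=10

x = 3.7  # arbitrary test point
lp_sd = stats.norm.logpdf(x, loc=0.0, scale=sd)
lp_tau = stats.norm.logpdf(x, loc=0.0, scale=1.0 / np.sqrt(tau))
print(np.isclose(lp_sd, lp_tau))  # True: same density under both parametrizations
```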
thanks, I always forget this when switching between Stan and pymc3. I changed the sd and switched to the Metropolis sampler, provided the true starting values, but it got worse:
https://gist.github.com/krz/c47da356a7352d9f2218ffd5c096af69
What's up with the scaling kwarg?
the idea with the scaling comes from this post, where it was found to be the key to convergence for mixtures: http://blog.booleanbiotech.com/static/Mixture%20of%20Gaussians%20in%20PyMC3.html
I removed the scaling and also tried the Slice sampler, no difference: https://gist.github.com/krz/2549d51650649b785b420bdecc73c1dd
I am not sure whether the Stan and pymc3 parametrizations of the Negative Binomial are the same (I think they are). Can you verify that? http://i.imgur.com/EhEy5io.png
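For what it's worth, both Stan's neg_binomial_2(mu, phi) and pymc3's NegativeBinomial(mu, alpha) use the mean/dispersion parametrization, with variance mu + mu**2/alpha. A quick numerical check via scipy's (n, p) parametrization (the mu/alpha values below are illustrative, not from the issue):

```python
import numpy as np
from scipy import stats

# Mean/dispersion NB maps onto scipy's nbinom(n, p) via n = alpha,
# p = alpha / (alpha + mu). If the mapping is right, the moments come out
# as mean = mu and variance = mu + mu**2 / alpha.
mu, alpha = 8.0, 2.5
n, p = alpha, alpha / (alpha + mu)

dist = stats.nbinom(n, p)
print(dist.mean())  # ≈ mu
print(dist.var())   # ≈ mu + mu**2 / alpha
```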
A couple of issues with the implementation of the model. First, a mean-zero prior is probably not a good idea, since it is the prior on the mean of a negative binomial, which models positive counts. In fact, using a normal is not very efficient, since half of the prior probability mass falls outside the valid support of the likelihood. Second, it's usually best to let PyMC select the step methods for you, unless you have a good reason to do otherwise.
I managed to get a decent result with this revision of the model.
I used a Lognormal distribution for mu to constrain it to be positive. Stan allows you to use a normal because you can apply a lower bound when you declare the variable, which essentially makes it a half-normal. You could use a HalfNormal explicitly here too.
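The half-normal equivalence is easy to verify: a HalfNormal is just a normal truncated at zero with its density renormalized (doubled), which is what Stan's lower bound on the declaration gives you. A scipy sketch:

```python
import numpy as np
from scipy import stats

# halfnorm.pdf(x) should equal 2 * norm.pdf(x) for x >= 0: the positive half
# of the normal, renormalized to integrate to 1.
sd = 10.0
x = np.array([0.5, 2.0, 15.0])

half = stats.halfnorm.pdf(x, scale=sd)
trunc = 2.0 * stats.norm.pdf(x, loc=0.0, scale=sd)
print(np.allclose(half, trunc))  # True
```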
A better approach still would be to marginalize out the Bernoulli indicators and use a mixture likelihood. That would let you use gradient-based samplers throughout the model.
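The marginalization amounts to scoring each observation against log(w * NB(y | mu1, alpha1) + (1 - w) * NB(y | mu2, alpha2)) instead of sampling discrete assignments. A numpy/scipy sketch of that likelihood (parameter values are illustrative only):

```python
import numpy as np
from scipy import stats
from scipy.special import logsumexp

def nb_logpmf(y, mu, alpha):
    # mean/dispersion negative binomial via scipy's (n, p) parametrization
    return stats.nbinom.logpmf(y, alpha, alpha / (alpha + mu))

def mixture_loglike(y, w, mu1, alpha1, mu2, alpha2):
    # log-sum-exp over the two components replaces the discrete indicators,
    # keeping the likelihood differentiable in all continuous parameters
    comps = np.stack([np.log(w) + nb_logpmf(y, mu1, alpha1),
                      np.log1p(-w) + nb_logpmf(y, mu2, alpha2)])
    return logsumexp(comps, axis=0).sum()

y = np.array([0, 1, 1, 7, 9, 12])
print(mixture_loglike(y, w=0.5, mu1=1.0, alpha1=2.0, mu2=8.0, alpha2=2.0))
```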
Thanks, this is a result I can live with. I did not expect the prior to matter that much, because I have lots of data and I even provided the true starting values. In the Stan model I did not restrict the Normal prior for mu either.
From your output it seems like pymc selected the same step methods as I specified manually.
Actually I am very impressed by how well Stan estimates this model, maybe because it has a dedicated mixture function.
Yeah, would be great to implement that in pymc3.
Just a quick note: I have problems reproducing this result: https://gist.github.com/krz/56394b2c5a7dc89c1833b8af848f1798
If I use only 1 job I get no result. With 4 jobs it gets better. With more iterations it gets worse; the chains show a downward trend.
Another thing I noticed in your notebook, and in mine as well: the grid lines in the trace density plots are gone. Is this an intended change, and how can I get them back?
Hi,
I built a model in Stan for estimating the parameters of a mixture distribution of two Negative Binomials. I generated data with mu1=1 and mu2=8, and the Stan estimates are mu1=1 and mu2=9, which is fine. However, when I build the model in pymc3 using the same data, I do not get even close to the true parameters, even if I specify the true values as starting values. Any idea what is going on here? https://gist.github.com/krz/0bc58592a5b088076c4ecb8141ee34bd