probcomp / Venturecxx

Primary implementation of the Venture probabilistic programming system

http://probcomp.csail.mit.edu/venture/

GNU General Public License v3.0

Potential Venstan example? "double" mixture model #444

Open lenaqr opened 8 years ago

lenaqr commented 8 years ago

See: https://webfiles.uci.edu/mdlee/LeeWagenmakers2013_Free.pdf ("6.4 The two country quiz") and https://groups.google.com/forum/#!topic/stan-users/p2zWntwTbG0

Some context: all of the models in the Lee & Wagenmakers book were made into Stan examples that are linked from their website, except for three (due to inability to implement them in Stan); this is one of them.
It’s a double mixture model: discrete latent x_i, discrete latent z_j, and observed k_ij, where the distribution of the k_ij depends on a function of the x_i and z_j.
Summing out the latents entirely (as you would have to do in Stan) is intractable, because they are coupled in the posterior, so you would have to sum over all combinations of all the x_i and z_j.
However, I think it is the case that the x_i are independent of each other conditioned on all the z_j (and k_ij), and the z_j are independent of each other conditioned on the x_i (and k_ij). So one could imagine a Venture-driven/Stan-powered Gibbs sampling scheme where Venture samples the x_i and hands them to Stan to sample the parameters while integrating out the z_j, then Venture samples the z_j and Stan samples the parameters while integrating out the x_i, and so on.
It’s not that interesting because this example (and all the others in the book) was already implemented in BUGS to begin with, so a Venstan implementation would not be clearly novel, but maybe we can come up with some extension that meaningfully uses the expressive power of Venture.
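
The alternating scheme above can be sketched in plain Python. This is only an illustration, not Venture or Stan code. It assumes a match/mismatch likelihood (k_ij ~ Bernoulli(alpha) when x_i == z_j, Bernoulli(beta) otherwise), uniform priors on the x_i and z_j, and fixed alpha and beta, so the parameter-sampling step that would be handed to Stan is left out. The function names are hypothetical.

```python
import math
import random


def bernoulli_loglik(k, p):
    """Log-probability of a binary outcome k under success probability p."""
    return math.log(p) if k == 1 else math.log(1.0 - p)


def sample_rows(k, other, alpha, beta, rng):
    """Resample one latent per row of k, given the latents of the other axis.

    Conditioned on `other`, the row latents are independent of each other,
    so each one is drawn from its own two-point conditional.
    """
    latents = []
    for row in k:
        # Log-likelihood of this row under latent value 0 and latent value 1.
        # Assumed likelihood: success probability alpha on a match, beta otherwise.
        logp = [
            sum(bernoulli_loglik(kij, alpha if v == o else beta)
                for kij, o in zip(row, other))
            for v in (0, 1)
        ]
        # Uniform prior (an assumption), so posterior odds equal the likelihood ratio.
        p1 = 1.0 / (1.0 + math.exp(logp[0] - logp[1]))
        latents.append(1 if rng.random() < p1 else 0)
    return latents


def gibbs(k, alpha, beta, sweeps, seed=0):
    """Alternate between the x_i block and the z_j block for a number of sweeps."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(len(k))]
    z = [rng.randint(0, 1) for _ in range(len(k[0]))]
    k_t = [list(col) for col in zip(*k)]  # transpose: one row per z_j
    for _ in range(sweeps):
        x = sample_rows(k, z, alpha, beta, rng)
        # The assumed likelihood is symmetric in x and z, so the same
        # update applies to the transposed data.
        z = sample_rows(k_t, x, alpha, beta, rng)
    return x, z
```

Each sweep costs one pass over the k_ij per block, rather than a sum over every joint assignment of the x_i and z_j. Note that the labels are only identified up to flipping all x_i and z_j together.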

fsaad commented 8 years ago

When @Axch and I discussed these examples in the L&W book before the PI meeting, we concluded that we could not really tell a compelling story for Venstan with toy examples. One recurring pattern I have seen in the literature is that researchers take some probabilistic model that uses mixtures and make it an infinite mixture, which turns into an 80-page journal paper (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3754453/). Perhaps we can find something compelling using a similar approach (as @axch says, tuning \alpha is easier than tuning K).

vkmvkmvkmvkm commented 8 years ago

Strongly disagree. A much broader audience cares about L&W than, e.g., the Dunson paper. Plus the example Anthony found highlights the limitations of Stan's claim to "handle discrete variables".

Remember the purpose of these is not to show something is possible representationally that was previously impossible, but instead to show that with Venture, one probprog lang can be used to optimize parts of a probprog written in another, using a model of acknowledged interest. The fact that a BUGS impl exists is just validation of its interest. This is technically about illustrating SP interface capabilities/consequences, and highlighting advantages of a polyglot platform.

A separate type of point could be that with Venture we can extend models (and the associated inference schemes) written in other languages while retaining some (presumably tested) parts of the original implementation. That's not an interesting pitch to people happy with the expressiveness of the other languages or dubious about the practical utility of the fancier stuff.

A good MML paper template might be to find, compress, and then extend papers like Dunson's, but that's for a different audience still.

On Fri, Feb 26, 2016, 6:36 AM F Saad notifications@github.com wrote:

When @Axch https://github.com/Axch and I discussed these examples in the L&W book before the PI meeting, the conclusion was we could not really tell a compelling story for Venstan with toy examples -- one running pattern I have seen in the literature is researchers take some probabilistic model which uses mixtures, and make it an infinite mixture which turns into an 80 page journal paper http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3754453/. Perhaps we can find something compelling using a similar approach (as @axch https://github.com/axch says tuning \alpha is easier than tuning K).

vkmvkmvkmvkm commented 8 years ago

Also, a red flag is the word "interesting": the key question is to whom, whether we want to reach them, and why.

fsaad commented 8 years ago

RE:

highlights the limitations of Stan's claim to "handle discrete variables".

In the thread, Carpenter clearly says that Stan can in principle handle the discretes, but it will not be tractable.

There are 2^8 possible values for x and 2^8 possible values for z in this tiny tiny example, but already that's 2^16, or about 64000, summands. I don't think there's any way to code this model efficiently in Stan. You can just sum over the 2^16 values of x, z, but that's going to be very painful to code, slow to run, and not scalable.

- Carpenter

I mention this because there appears to be an insinuation that the Stan folk are making inaccurate claims about their system.

but instead to show that with Venture, one probprog lang can be used to optimize parts of a probprog written in another

The original conclusion reached by @axch and me about the 3 examples in the L&W book (and why we did not implement them for the PI meeting, instead shooting for VenStanKep) was that Venture can handle these cognitive models tractably anyway. Bolting on Stan does not achieve the stated target of "optimizing parts of a probprog", because sampling the discretes in Venture and then using HMC in Stan offers no obvious advantage (computational or otherwise) over using pure Venturescript. The cognitive example felt too artificial to tell a compelling story that we are "overcoming shortcomings of both systems" by being polyglot.
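
To make the summand count in Carpenter's quote above concrete, here is a minimal brute-force marginalization in Python. It is a sketch under assumptions that are not spelled out in this thread: a match/mismatch likelihood (success probability alpha when x_i == z_j, beta otherwise), uniform priors on the latents, and fixed alpha and beta. The function name is hypothetical.

```python
import itertools


def marginal_likelihood(k, alpha, beta):
    """Sum the joint likelihood over every assignment of x and z.

    Returns the marginal likelihood of k and the number of terms summed.
    """
    n_people, n_questions = len(k), len(k[0])
    total, terms = 0.0, 0
    for x in itertools.product((0, 1), repeat=n_people):
        for z in itertools.product((0, 1), repeat=n_questions):
            lik = 1.0
            for i in range(n_people):
                for j in range(n_questions):
                    # Assumed likelihood: alpha on a match, beta on a mismatch.
                    p = alpha if x[i] == z[j] else beta
                    lik *= p if k[i][j] == 1 else 1.0 - p
            total += lik
            terms += 1
    # Uniform prior over all joint assignments (an assumption).
    prior = 0.5 ** (n_people + n_questions)
    return prior * total, terms
```

The number of terms is 2 ** (n_people + n_questions), so it doubles with every added person or question; for the 8-by-8 case it is 2 ** 16 = 65536.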

vkmvkmvkmvkm commented 8 years ago

Re Carpenter: it's useful for us to have that quote, and to remember the difference between "the language supporting something in principle" and "it actually being supported in practice, without severe and/or unpredictable restrictions".

Have you seen the Venture Goals document? There's a proposal for a VenStan interface that I think will lead to different conclusions.

axch commented 8 years ago

Project management decision: What are we doing with this? What are the trigger events that cause us to reconsider this?