probcomp / Venturecxx

Primary implementation of the Venture probabilistic programming system
http://probcomp.csail.mit.edu/venture/
GNU General Public License v3.0

Proposal: Syntax suggesting "simulate from" notation #569

Open axch opened 8 years ago

axch commented 8 years ago

e.g.

assume x ~ normal(0, 1);
assume my_normal = (foo, bar) ~> { normal(foo, bar) };

Ideas for what to do with the tilde characters?

To elaborate on Option 3 a bit:

assume x = normal(0, 1);

would make x be the distribution itself (as some fraction of our beginner users seem to expect). It would then be meaningful to write

assume y1 ~ x;
assume y2 ~ x;

and expect y1 and y2 to be different. In contrast,

assume x ~ normal(0, 1);

would make x be a sample from the standard normal distribution.

This style is traditionally (in programming languages) accompanied by making expression composition not mean "bind", so something like normal(0, 1) + 2 is (a priori) a type error: it tries to add the constant 2 to the standard normal distribution. We could choose to give such expressions meaning, for example by silently promoting 2 to the distribution "2 with probability 1" and defining + on distributions to distribute over sampling (i.e., "the distribution defined by drawing independent samples from the two arguments and adding them"). That would be analogous to what we have now, but may confuse some initiates who would expect + to be pointwise summation of density functions.
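A rough sketch of those semantics, written in Python rather than Venture syntax, might look like the following. Everything here (the `Dist` class, `Dist.point`, the `normal` helper) is a hypothetical stand-in for illustration, not Venture's API: binding with `=` holds on to the distribution, sampling is explicit, and `+` promotes constants to point masses and adds independent draws.

```python
import random


class Dist:
    """Hypothetical distribution object: wraps a sampler taking an RNG."""

    def __init__(self, sampler):
        self._sampler = sampler

    def sample(self, rng):
        # Draw one fresh value from this distribution.
        return self._sampler(rng)

    @staticmethod
    def point(value):
        # The distribution "value with probability 1".
        return Dist(lambda rng: value)

    def __add__(self, other):
        # Silently promote constants to point-mass distributions.
        if not isinstance(other, Dist):
            other = Dist.point(other)
        # "+" distributes over sampling: draw independently from both
        # arguments and add the draws. This is NOT a pointwise sum of
        # density functions.
        return Dist(lambda rng: self.sample(rng) + other.sample(rng))

    # Addition is symmetric, so `2 + dist` can reuse the same logic.
    __radd__ = __add__


def normal(mu, sigma):
    # Hypothetical constructor for a normal distribution object.
    return Dist(lambda rng: rng.gauss(mu, sigma))


rng = random.Random(0)
x = normal(0, 1)       # like `assume x = normal(0, 1)`: x is the distribution
y1 = x.sample(rng)     # like `assume y1 ~ x`
y2 = x.sample(rng)     # like `assume y2 ~ x`: an independent second draw
shifted = x + 2        # 2 is promoted to a point mass; result is a Dist
z = shifted.sample(rng)
```

Under this reading `shifted` is a standard normal shifted by 2, which is what composition-as-bind gives today; the pointwise-density reading of + would produce something that is not even a normalized distribution.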

Thoughts? @vkmvkmvkmvkm @luac ?

riastradh-probcomp commented 8 years ago

Random thoughts on the colour of this bike shed:

observe normal(0, 1) ~> 42?

Does

assume theta ~ beta(alpha, alpha);
assume x = flip(theta);
observe x ~> 1;
observe x ~> 0;

look sensible?

I'm under the impression that it was an intentional design decision to represent distributions only by stochastic procedures, not by another kind of object with any sort of explicit sampling or observation operation. This is justified in an otherwise purely functional language because it doesn't really break referential transparency, whereas the inference language does have explicitly destructive operations on the model traces and hence warrants a more explicit monad with a more explicit distinction between bind and let.

Some typefaces put ~ way above where it should be. Some typefaces make it hard to distinguish ~ from -.

Deterministic functions, as in standard math notation:

(x, y) |-> { x + y }

Stochastic procedures:

(alpha) ~> { theta ~ beta(alpha, alpha); bernoulli(theta) }
(x, y) ~> { z ~ normal(x, 0); w ~ normal(y, 0); return (z + w)/(z*w) }
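For readers less familiar with the notation, here is a rough Python analogue of the |-> versus ~> distinction. It is only a sketch: the function names and the explicit `rng` parameter are assumptions made for illustration and are not part of Venture.

```python
import random


def add(x, y):
    # Analogue of (x, y) |-> { x + y }: same inputs, same output, every time.
    return x + y


def beta_bernoulli(alpha, rng):
    # Analogue of (alpha) ~> { theta ~ beta(alpha, alpha); bernoulli(theta) }.
    # Each call draws a fresh theta, then flips a coin with that weight.
    theta = rng.betavariate(alpha, alpha)
    return rng.random() < theta


rng = random.Random(0)
total = add(1, 2)                                   # always 3
flips = [beta_bernoulli(2.0, rng) for _ in range(5)]  # varies from call to call
```

The deterministic function can be called twice and substituted freely; the stochastic procedure stands for a distribution, and each application is a new draw from it.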

axch commented 8 years ago

~, <~, and ~> are now aliases for =, <-, and ->, respectively, corresponding to Option 1. Does being done with this ticket consist of documenting that, or do we want to think about this more? One answer is "live with the new world order for a while and see."