piantado / LOTlib3

Language of thought library for python 3

One question about primitives #2

Open gureckis opened 4 years ago

gureckis commented 4 years ago

hi, great library. asking a question here in case it is useful to someone later. I'm not entirely clear on the advantages of using a string versus a LOTLib-provided primitive. For instance, both of these should work the same from the user/modeler perspective:

grammar.add_rule('EXPR', '(%s + %s)', ['EXPR', 'EXPR'], 1.0)

and

grammar.add_rule('EXPR', 'plus_', ['EXPR', 'EXPR'], 1.0)

with the latter using the built-in 'plus_' primitive. I understand that registering new primitives is critical for adding custom functionality, but I'm unclear why LOTlib3 goes so far as to provide primitives for things like addition if they would be interpreted correctly by the python interpreter just as a string.

thanks!

piantado commented 4 years ago

Oh thanks, yes these are the same. Note that you might generally prefer the function call version even for most arithmetic operations because you might want to do something special for e.g. division by zero, etc. For instance, divide_ in Primitives.Arithmetic returns (signed) infinity for divide by zero instead of throwing an exception. (LOTlib used to do some tracking of number of primitive calls, which would work for @primitive but not the "%s + %s" version, but it looks like that's not up to date -- I will put it on the list of things to fix!)
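To make the division-by-zero point concrete, here is a minimal sketch in plain Python. The name `safe_divide` is a hypothetical stand-in written for this illustration, not LOTlib3's own `divide_`, and the `eval` calls only imitate how a rule string might end up being evaluated.

```python
import math

def safe_divide(x, y):
    """Hypothetical division primitive (illustration only, not LOTlib3's code)."""
    if y != 0:
        return x / y
    # Signed infinity for nonzero x; 0/0 comes out as NaN because inf * 0 is NaN.
    return math.inf * x

# String-template version: plain Python division, so a zero denominator raises.
template = "(%s / %s)"
try:
    eval(template % ("1.0", "0.0"))
    template_raised = False
except ZeroDivisionError:
    template_raised = True

# Function-call version: the primitive decides what division by zero means.
call_result = eval("safe_divide(%s, %s)" % ("1.0", "0.0"),
                   {"safe_divide": safe_divide})

print(template_raised, call_result)
```

The two forms agree whenever the denominator is nonzero; they only differ in the edge case the primitive chooses to handle.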

gureckis commented 4 years ago

makes sense, thanks! what happens if a primitive has a random element to its return value (like it calls a random number generator and executes some conditional logic)? does LOTlib support that kind of code? (I noticed you specifically mention that Fleet does, so I'm wondering if that is a hard limitation of LOTlib or simply more convenient in Fleet.)

piantado commented 4 years ago

Yes that's exactly the problem that led to Fleet: https://github.com/piantado/Fleet . You can do it in python (we have some old code if you're interested) but it's pretty slow. One way to do it is to make a likelihood that runs a stochastic function many times; this is incredibly inefficient and leads to a poor approximation to the likelihood in any interesting case. Another way to do it is to try to pass a stochastic state to the hypothesis: for instance, you pass in the vectors [0], [1], [0,0], [0,1], [1,0], ... and set up your flips() so that they read these vectors deterministically (there are even lazy ways of enumerating these). In this way, you can enumerate the entire set of execution paths (some Church implementations do things like this). The fundamental problem with passing around internal stochastic states like that is that, at least as far as I can figure out in python, it's very hard to store the partial evaluation of a function. So when you switch to a new stochastic vector, you re-evaluate from the start. If it would be helpful, I can dig up some old LOTlib2 code that does some of that but I really don't recommend using it :)
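A rough sketch of the stochastic-vector idea, in plain Python. Everything here (`FlipTape`, `OutOfBits`, `geometric`, `enumerate_outcomes`) is a hypothetical name made up for this illustration and is not the old LOTlib2 code; it assumes fair coin flips. Note how every vector forces the program to run again from the start, which is the inefficiency described above.

```python
from itertools import product

class OutOfBits(Exception):
    """Raised when a program asks for more flips than the vector supplies."""

class FlipTape:
    """Feeds a fixed bit vector to a program, one flip at a time."""
    def __init__(self, bits):
        self.bits = bits
        self.pos = 0

    def flip(self):
        if self.pos >= len(self.bits):
            raise OutOfBits
        bit = self.bits[self.pos]
        self.pos += 1
        return bit == 1

def bit_vectors(max_len):
    # (), (0,), (1,), (0,0), (0,1), ...; the empty vector covers programs that never flip.
    for n in range(max_len + 1):
        yield from product((0, 1), repeat=n)

def geometric(tape):
    """Hypothetical stochastic program: number of tails before the first heads."""
    n = 0
    while not tape.flip():
        n += 1
    return n

def enumerate_outcomes(program, max_len):
    outcomes = {}
    for bits in bit_vectors(max_len):
        tape = FlipTape(bits)
        try:
            value = program(tape)  # re-evaluated from the start for every vector
        except OutOfBits:
            continue  # this path needs a longer vector
        if tape.pos != len(bits):
            continue  # leftover bits: a shorter vector already covered this path
        outcomes[value] = outcomes.get(value, 0.0) + 0.5 ** len(bits)
    return outcomes

print(enumerate_outcomes(geometric, 3))
```

With `max_len=3` only the paths (1,), (0,1) and (0,0,1) finish exactly, so the remaining probability mass is simply not enumerated yet.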

The alternative to this is Fleet, where each hypothesis gets mapped to an actual program that runs on a little virtual machine (this machine is programmed automatically for you using C++ templates, based on the grammar you specify). The entire state of this virtual machine can be copied and even stored. Fleet stores a priority queue of partial stochastic hypothesis evaluations, and runs them in order to find the high-probability evaluation traces for programs. Fleet follows LOTlib kinda closely in terms of structuring a grammar and hypotheses, so the hope is that if people need stochastic primitives or speed, it will be easy to switch. This, for instance, is the rational rules implementation: https://github.com/piantado/Fleet/blob/master/Models/RationalRules/Main.cpp
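The priority-queue idea can be illustrated with a toy sketch in Python. This is not Fleet's implementation (Fleet is C++ and runs real virtual machine states); `step_geometric` and `top_traces` are hypothetical names, and the "machine state" here is just an integer, assuming fair flips. The point is only that a saved partial evaluation is resumed at each flip instead of being restarted.

```python
import heapq
from itertools import count

def step_geometric(state, heads):
    """Hypothetical one-flip step: returns (done, value_or_next_state)."""
    if heads:
        return True, state       # finished: the count of tails so far is the value
    return False, state + 1      # keep going with an updated state

def top_traces(step, initial_state, max_pops):
    tie = count()  # tie-breaker so the heap never has to compare states
    # Entries are (negated probability, tie-breaker, saved state).
    queue = [(-1.0, next(tie), initial_state)]
    finished = []
    for _ in range(max_pops):
        if not queue:
            break
        neg_p, _tie, state = heapq.heappop(queue)  # most probable partial trace
        for heads in (True, False):
            done, result = step(state, heads)      # resume from the saved state
            p = -neg_p * 0.5
            if done:
                finished.append((result, p))
            else:
                heapq.heappush(queue, (-p, next(tie), result))
    return finished

print(top_traces(step_geometric, 0, 3))
```

Because each flip branches from a stored state, the work done before the flip is shared by both branches, unlike the re-evaluate-from-the-start approach with stochastic vectors.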