probcomp / GenExperimental.jl

Featherweight embedded probabilistic programming language and compositional inference programming library
MIT License
17 stars 2 forks source link

Module networks #49

Open marcoct opened 7 years ago

marcoct commented 7 years ago
marcoct commented 7 years ago

When @generate(trace, program(args)) operates, the current interface requires that all visited tagged expressions be recorded. This is used for trace rendering, and for extracting domain-relevant latent state. The current interface does not have a means of expressing a subset of expressions to be recorded. However, the current interface does have a means of expressing which subset of expressions should be scored (the subset of tagged expressions that are marked with constrain! or propose!).

marcoct commented 7 years ago

The semantics of @generate(trace, network(args)) should be the same as that of generating a probabilistic program. That is, any variables that are not intervened or constrained should be generated. The typical use case is that most of the trace is filled, and only one or a few variables require generation. The variables requiring generation (the complement of variables that are either intervened or constrained) should be tracked inside the trace data structure (the "regeneration set"), to avoid having to do a linear scan to determine which variables require generation. Each call to delete!that removes a constraint or intervention will then add the element to the regeneration set in constant time. Each call to constrain! or intervene! will remove a variable from the regeneration set in constant time. During a call to @generate(trace, network(args)) only the variables in the regeneration set will be visited.

marcoct commented 7 years ago

It seems that an ideal version of this would simply be an extension to probabilistic program definition syntax that adds the ability to annotate dependencies between tagged variables. One lightweight way of doing this would be to allow expressions to be retrieved from the trace during probabilistic program execution.

For example, the dependencies in the following block:

a = foo(x)
b = bar(a)
c = baz(x, a, b)

could be defined in the probabilistic program using the syntax:

a = @tag(foo(x), "a")
b = @tag(bar(@get("a")), "b")
c = @tag(baz(x, @get("a"), @get("b")), "c")

Then, it should be possible to construct the dependency graph on the first call to @generate. However, implementing the ability to only execute certain expressions seems to require a lot more machinery that we don't have in the "lightweight probabilistic programming" design. Concretely, the question is:

"Is there an implementation of partial regeneration of a Julia program with a user-defined dependency graph that doesn't involve writing an interpreter for all of Julia?"

marcoct commented 7 years ago

Alternatively, we could use a special syntax just for defining module networks, e.g.:

Syntax:

my_network = @network (alpha::Float64) begin  
    (a, b, c) = foo(alpha)
    d = bar(a, alpha)
    f = baz(a, d)
end

where the first line can alternatively be @network my_network(alpha::Float64) begin

The network is a basic block program where each line is of one of the following forms:

  1. x = foo(alpha, b) where x is a symbol not previously defined in the block, and where foo is the name of a registered module, and where alpha and b are symbols that are either parameters of the network or previously defined symbols.

  2. x, y, z = bar(alpha, x) where the left-hand-side is a tuple of symbols that have not been previously defined.

  3. Maybe fixed-length for loops could be included as well.

The network should look like a Julia function. The network can take arbitrary parameters just like a regular Julia function. The network knows the complete set of named variables. Running @generate(trace, network) should behave just like generating a probabilistic program.

Unless a surprisingly easy and clean way implementing of the first suggestion using @get appears, I am leaning towards this version.

There also might be benefits to implementing this simplified specialty version---it may be easier to distribute or parallelize the computation if we use a very restricted computing model. That is, there are some uses for "data-flow" programming as a thing.

marcoct commented 7 years ago

Arbitrary Julia code (Function) can be nodes in the network. That node simply throws an error if it is constrained or proposed.

marcoct commented 6 years ago

Latest thinking:

A 'module network' is just a probabilistic program, but where generate! has been overloaded manually by the author of the network to be more efficient than the default 'lightweight' implementation of generate! for the probabilistic program. In particular, a 'module network' will use the same underlying trace data structure, and will be defined using the same language (i.e. the Gen embedded probabilistic programming language), as a regular probabilistic program. In fact, initially, there need be no separate 'module network' type---the author of a probabilistic program will be able to overload generate! for that particular program.

As we write more models, we will gradually introduce abstraction that make overloading generate! for probabilistic programs require less new code. An example of the type of abstraction that could plausibly be useful: Allow definition of 'rules' (conditions on the trace) that trigger a specific custom generate! implementations, or fall back to the default generate! implementation if none of the rules are relevant.

Starting with this open design avoids attempting to solve a general incremental computation problem, but allows room for introducing more generally-applicable incremental computation mechanisms in the future if or when they become apparent for particular recurring patterns that appear in models.

Actually, it seems wise to permit the custom implementations of generate! to retain their own state. That state would not be accessible through the trace interface. Perhaps the network author can define a custom trace type that delegates to a standard probabilistic program trace for all trace queries, but has arbitrary side-state that the custom generate! implementation can make use of and mutate in order to be efficient.

marcoct commented 6 years ago

Of relevance:

In 20170713-marcoct-generators, and in preparation for AIDE, I implemented generator combinators for several of the original module network concepts:

marcoct commented 6 years ago

Started implementation of module networks (now called 'Generator networks') feature in: https://github.com/probcomp/Gen.jl/blob/20170804-marcoct-generators/src/dag.jl https://github.com/probcomp/Gen.jl/blob/20170804-marcoct-generators/src/network.jl

A Generator network is a Generator type.

The network represents a static dependency graph between generator invocations as a DAG. To respond to a given query, the network walks back from the output nodes to identify the set of sub-generators that need to be recursively queried, stopping when it reaches condition addresses.

The auxiliary structure that will be sampled for a given query is defined as the minimum auxiliary structure needed to produce inputs for all of the outputs. The auxiliary sampling family is forward simulation. All of the addressable auxiliary structure is recorded in the trace (i.e. everything except that which sub-generators don't expose).

For example, in an HMM generator network, the if we condition on some early hidden state Z10, and request observable state X20 as output (either with regenerate! in which case we provide a value for X20 or in simulate! in which case the value for X20 will be sampled for us), the HMM will generate auxiliary structure consisting of hidden states Z11 through Z20, and populate these into the trace.

One remaining problem is that the semantics of regenerate! and simulate! require that forward sampling over the auxiliary and/or output variables, corresponds to conditional sampling given the condition. This is not true, if for example, we condition on X11 and request Z5 as output. The network should somehow check and report that it cannot respond to such a query. Note that setting Z5 and X11 as two output variables is acceptable---both will be scored in a call to regenerate!. The problem is specifically with the 'condition' addresses.