Closed marcoct closed 7 years ago
This applies to probabilistic programs as well as generator networks.
Potential solutions should be evaluated on at least a DPMM example with CRP and NIGN, and using both collapsed and uncollapsed versions of the sub-generators.
A CRP joint generator type (`CRPJointGenerator`) with a custom trace type (`CRPJointTrace`) that exposes the cluster assignments as addressable values, and internally maintains sufficient statistics, was added in this commit.
Also relevant: https://github.com/probcomp/Gen.jl/issues/59
Current plan, hopefully to be implemented this weekend:
`Generator`s that are called within a program have a sub-trace and a return value. For `AtomicGenerator`s (generators that expose only one addressable value, which is the output) these are basically one and the same, but in general they are simply two separate types. More generally, each `Generator` type is responsible for specifying the relationship (i.e. invariants) between these two data structures, and for maintaining those invariants.
The sub-traces are stored in the parent trace alongside the values, when they are recorded during a call to `generate!` on the parent program. Instead of being overwritten (as is currently done for unconstrained values), the sub-trace remains in the trace and is mutated over successive calls to `generate!` on the parent program. The parent program should guarantee that the same name always corresponds to a generator of the same type across all possible program executions. Only constraints on the sub-trace (not merely recorded values) are manifested in persistent state in the sub-trace. Without any constraints on the sub-trace, the sub-trace's state is independent across successive calls to `generate!` on the parent program. Constraints on elements of the sub-trace are expressed as constraints on aliases in the parent program's trace (see below).
Programs do not support automatic propagation of hierarchical addresses. A program trace is a flat map. A program may include address aliases using the syntax `tag(alias => (name, subname))`, which appears inline in the program and causes constraints (or other directives) on `alias` to be forwarded as constraints (or other directives) on the name `subname` in the sub-trace of the generator tagged at `name`. The alias must occur before the generator tagged with `name` is encountered, during all possible program executions. Aliasing allows addressing into sub-traces, but only to the degree that the various programs in the hierarchy allow it. Aliases are computed dynamically during normal control flow, which is useful for e.g. mapping data values to specific output elements of a joint generator (e.g. in a DPMM implementation, to follow shortly). Aliases are not necessary for `AtomicGenerator`s (which most built-in primitive generators are), for which the name given to the generator is the same as the name of the value.
Names, aliases, and sub-names are arbitrary Julia values of any type that can be the key of a `Dict`.
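The alias mechanism described above can be sketched roughly as follows. This is only a toy illustration, not the Gen.jl implementation: the `aliases` and `constraints` dictionaries and the `forwarded_constraints` helper are hypothetical names invented here, and the sketch assumes that forwarding happens at the moment the generator tagged at `name` is encountered.

```julia
# Constraints placed on the parent's flat trace, keyed by alias (hypothetical layout).
constraints = Dict{Any,Any}(:y1 => 2.3, :unrelated => 0.0)

# Aliases recorded so far during execution, as if by tag(alias => (name, subname)).
aliases = Dict{Any,Tuple{Any,Any}}()
aliases[:y1] = (:mixture, 1)

# When the generator tagged at `name` is encountered, collect the constraints
# that were placed on aliases pointing into its sub-trace.
function forwarded_constraints(constraints, aliases, name)
    sub = Dict{Any,Any}()
    for (alias, (target, subname)) in aliases
        if target == name && haskey(constraints, alias)
            sub[subname] = constraints[alias]
        end
    end
    return sub
end

sub_constraints = forwarded_constraints(constraints, aliases, :mixture)
```

Here the constraint on `:y1` ends up at sub-name `1` in the sub-trace of the generator tagged `:mixture`, while `:unrelated` stays in the parent trace.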
This is still the plan. The only change since that post was written is that now the CRP and NIGN joint generators accept arbitrary keys as addresses (instead of the strict address space of increasing integer addresses that I was using before). This greatly simplifies the address aliases for the DPMM, without violating the semantics of `generate!`.
There is one minor or subtle change in semantics: I was thinking that the arguments to a generator should somehow demarcate the address space of its trace. Now, with the new CRP and NIGN joint generators, the address space is all valid keys, and the argument defines which keys should be simulated during that call to `generate!`. For example, I can constrain address `foo` in sub-trace `st` and then later call `generate!(CRPJointGenerator(Set(["bar", 123]), alpha), st)`, and `st` will then contain values for addresses `bar` and `123`, and the value for `foo` will be retained. A subsequent call to `generate!(CRPJointGenerator(Set(), alpha), st)` would then cause the values for `bar` and `123` to be removed from the sub-trace, but the value for `foo` would persist. The score is always based on the constrained addresses, not the newly generated ones (I haven't implemented `propose!` for these generators yet).
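The persistence behavior in the `foo` / `bar` / `123` example can be mimicked with a small toy trace. This is a sketch under stated assumptions, not the actual `CRPJointTrace`: `ToyJointTrace`, `toy_constrain!` and `toy_generate!` are hypothetical names, and the "simulation" is a placeholder random draw rather than a real CRP sample.

```julia
# Toy stand-in for a joint generator's sub-trace (hypothetical, not a Gen.jl type).
struct ToyJointTrace
    constrained::Dict{Any,Int}   # values fixed by the user; persist across calls
    generated::Dict{Any,Int}     # values simulated by the most recent call only
end
ToyJointTrace() = ToyJointTrace(Dict{Any,Int}(), Dict{Any,Int}())

toy_constrain!(t::ToyJointTrace, addr, value::Int) = (t.constrained[addr] = value; t)

# Simulate a value for every requested address that is not constrained.
# Values generated by earlier calls are discarded first.
function toy_generate!(t::ToyJointTrace, addrs)
    empty!(t.generated)
    for addr in addrs
        haskey(t.constrained, addr) && continue
        t.generated[addr] = rand(1:3)   # placeholder for a real CRP draw
    end
    return t
end

st = ToyJointTrace()
toy_constrain!(st, "foo", 1)

toy_generate!(st, Set(Any["bar", 123]))
after_first = Set(keys(st.generated))    # addresses simulated by this call

toy_generate!(st, Set())
after_second = Set(keys(st.generated))   # nothing requested, so nothing remains
```

After the second call only the constrained value at `"foo"` is left, which mirrors the retention rule described in the text.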
Wrapping up a re-rewrite to use sub-traces, and not use aliases.
Address aliasing can be introduced later as a separate nice-to-have feature. Directly applying directives like `constrain!` to sub-traces, instead of simply recording them for replay later, results in a simpler mental model of what state a trace is in: a trace is a record of the program's execution, and it doesn't require a separate model for thinking about the pre-`generate!` state of a trace (before directives have been forwarded to subtraces) and the post-`generate!` state (after directives have been forwarded).
Propagating directives immediately to subtraces also more closely matches the approach taken in Metaprob. However, it differs from Metaprob in that sub-traces that are not `AtomicTrace`s need to be manually added to the trace with `set_subtrace!` before addresses under that sub-trace can be touched using trace directives like `constrain!`. We require manually adding sub-traces because we allow custom trace types and we don't currently automatically infer what type a sub-trace will be. If there is a hierarchy of compound procedure applications, the user needs to add all of them before they can constrain the lowest one in the call stack.
Note that during a call to `generate!` all necessary subtraces will be added to the trace, and do not need to be added again later in order to be constrained.
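As a rough illustration of why sub-traces must be registered before anything beneath them can be constrained, here is a toy hierarchical trace. It is not the `ProgramTrace` implementation: `ToyTrace`, `toy_set_subtrace!` and `constrain_at!` are hypothetical names, and addresses are assumed to be tuples of names purely for this sketch.

```julia
# Hypothetical toy trace: a flat map of values plus explicitly registered sub-traces.
struct ToyTrace
    values::Dict{Any,Any}
    subtraces::Dict{Any,ToyTrace}
end
ToyTrace() = ToyTrace(Dict{Any,Any}(), Dict{Any,ToyTrace}())

# Register a sub-trace under `name` (plays the role of set_subtrace! in the text).
toy_set_subtrace!(t::ToyTrace, name, sub::ToyTrace) = (t.subtraces[name] = sub; t)

# Constrain the value at a hierarchical address given as a tuple of names.
# The directive is forwarded immediately; a missing sub-trace is an error.
function constrain_at!(t::ToyTrace, addr::Tuple, value)
    if length(addr) == 1
        t.values[addr[1]] = value
    else
        haskey(t.subtraces, addr[1]) || error("no sub-trace registered at $(addr[1])")
        constrain_at!(t.subtraces[addr[1]], addr[2:end], value)
    end
    return t
end

top = ToyTrace()
# Every level of the hierarchy has to be added before the lowest one can be constrained.
toy_set_subtrace!(top, :outer, ToyTrace())
toy_set_subtrace!(top.subtraces[:outer], :inner, ToyTrace())
constrain_at!(top, (:outer, :inner, :x), 1.5)
```

Calling `constrain_at!` on an address whose intermediate sub-trace was never registered raises an error in this sketch, which corresponds to the manual `set_subtrace!` requirement described above.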
I toyed with the idea of automatically generating the needed hierarchy of default trace types (`ProgramTrace`) during directive calls, and then replacing them with the actual trace types (and copying over constraints, etc.) at run-time, but this seems too complex for this early stage, and against the working philosophy of Gen, which is, I think:
"The user should have the ability to customize the system at a low-level, and should expect to do a lot of low-level programming at first. Gradually, abstractions will be built into the system, but only after lower-level APIs are used, and the need for certain abstraction is clear"
The new `ProgramTrace` implementation uses subtraces.
Currently, the only compositional generator is a `ProbabilisticProgram`, and for these the only sub-generators are currently `AtomicGenerator`s (i.e. "probabilistic modules"), and the value itself is stored in the program's trace instead of the (`AtomicTrace`) sub-trace. Note that a `ProbabilisticProgram` can easily be wrapped into an `AtomicGenerator`.

Although "probabilistic modules" are appealing because they have transient sub-traces, which are completely hidden from the outside world, it may be useful to have persistent sub-traces. This seems useful for collapsed generators that can maintain sufficient statistics (e.g. `CRPState` and `NIGNState`). Currently this state is manually added to the trace using a separate `tag` instruction. It would be better to have a new generator type whose trace type stores the sufficient statistics alongside the actual values. We would then need to actually store the sub-traces of sub-generators, and not just the value when it exists, as is done currently.

Persistent sub-traces also allow for more flexible inference programming that breaks through the procedure abstraction boundary, although I'm not sure whether this is actually a good thing. Perhaps there is a form of 'disciplined probabilistic programming' in which model programs are written so that the model's procedural abstraction can be respected by the inference program. This seems possible in some cases, and boils down to the model program exposing all latent variables of domain interest or inference interest at the top level.
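To make the idea of a trace type that stores sufficient statistics alongside the values more concrete, here is a minimal sketch for a collapsed CRP. It is not `CRPState` or `CRPJointTrace`: `ToyCRPTrace`, `assign!`, `unassign!` and `log_predictive` are hypothetical names, and the only statistic tracked is the per-cluster count.

```julia
# Hypothetical collapsed-CRP trace: assignments plus per-cluster counts kept in sync.
struct ToyCRPTrace
    assignments::Dict{Any,Int}   # address => cluster id
    counts::Dict{Int,Int}        # cluster id => number of members (sufficient statistic)
end
ToyCRPTrace() = ToyCRPTrace(Dict{Any,Int}(), Dict{Int,Int}())

# Remove the assignment at `addr`, dropping the cluster once it is empty.
function unassign!(t::ToyCRPTrace, addr)
    cluster = pop!(t.assignments, addr)
    t.counts[cluster] -= 1
    t.counts[cluster] == 0 && delete!(t.counts, cluster)
    return t
end

# Assign `addr` to `cluster`, maintaining the invariant that
# counts[c] equals the number of addresses assigned to c.
function assign!(t::ToyCRPTrace, addr, cluster::Int)
    haskey(t.assignments, addr) && unassign!(t, addr)
    t.assignments[addr] = cluster
    t.counts[cluster] = get(t.counts, cluster, 0) + 1
    return t
end

# Log probability that the next customer joins `cluster` (a new cluster if it has
# no members) under a CRP with concentration `alpha`, using only the counts.
function log_predictive(t::ToyCRPTrace, cluster::Int, alpha::Float64)
    n = length(t.assignments)
    weight = get(t.counts, cluster, 0)
    return log((weight == 0 ? alpha : weight) / (n + alpha))
end

t = ToyCRPTrace()
assign!(t, "a", 1)
assign!(t, "b", 1)
assign!(t, "c", 2)
```

Because the counts live in the trace itself, the predictive probability never needs to rescan the assignments, which is the kind of bookkeeping a persistent sub-trace would make possible.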