probprog / anglican-infcomp

Clojure library for Inference Compilation
https://probprog.github.io/inference-compilation
GNU General Public License v3.0
6 stars 1 forks source link

"Lightweight implementation" style addressing in Anglican #1

Open dtolpin opened 7 years ago

dtolpin commented 7 years ago

As we discussed, a good way to implement an alternative addressing is Anglican is to write preprocessing before CPS transformation and explicitly compute sample and observe labels. It does not make much sense to process addressing after the CPS transformation because most of the source code structure is lost.

To do this, alternative implementations of fm, defm, query, defquery must be provided, where the source code of Anglican is first transformed so that sample and observe addresses are computed to scopes, and then the transformed program is passed to CPS. I believe that the best way to structure this would be to just provide versions of these macros in a separate namespace; for example anglican.infcomp, infcomp.anglican.emit, or anglican.infcomp.emit.

Address-adding transform is described in the 'Ligtweight implementation ...' paper and in the DIPPL book . You may choose to add as much structural information to the address as you like and do experiments to see whether structural information improves learning.

@fwood, @gbaydin

dtolpin commented 7 years ago

I looked at the code of dippl and the original paper. This reminded me that there is an issue with the addressing scheme there: different checkpoints may have the same address in the same stack location:

(fn [x]
  (let [foo (fn [x] (sample (normal 0 x))
         bar (fn [x] (sample (normal x 1))
         fun (if (> x 1) foo bar)]
     (fun x))

both sample statements will get the same address when invoked. However they should not be reused. Same about observe.

Of course, the current addressing scheme can also be fooled by passing different distributions as parameters to sample:

(fn [x]
   (sample (if (> x 1) (normal 0 x) (normal x 1))))

(which will not work well with LI scheme anyway). However, there is a different level of ambiguity here, much more obvious to the model writer than in the above example. A solution for that would be to combine structural information AND a unique identifier (allocated deterministically) for each sample. Of course, these are edge cases, but software is judged by edge cases --- rare failures bear a high regret.

lezcano commented 7 years ago

I don't know the details of the particular implementation, but you might want to implement something similar to the behaviour of the addresses generated with backtracehttp://man7.org/linux/man-pages/man3/backtrace.3.html. Backtrace uses (not quite, but almost) the offsets of the function calls with respect to the function entry point in the compiled code. I think that this would solve this problem.

dtolpin commented 7 years ago

That's perfect for a language like C++ but not for a functional language. In a functional language functions (actually, closures) are computed on the fly rather than compiled'. So the samefunction' may close on different environments, and this is exactly the reason why lightweight transformation addressing is easy to fool in a functional language.