tweag / monad-bayes

A library for probabilistic programming in Haskell.
MIT License
407 stars 62 forks source link

Free list transformer to replace deprecated ListT #253

Open turion opened 1 year ago

turion commented 1 year ago

samples_free_listt.pdf samples_master.pdf

Removes the dependency on the deprecated ListT in favour of a free monad construction. In fact improves benchmarks significantly!

turion commented 1 year ago

I observed runtimes to go down by 30%, and never going up in comparison to master. This is probably because the free monad over an applicative only adds free layers whenever it really needs, but not for fmap and <*>. I didn't test whether a free monad over the list functor gives a similar speedup, but it has no advantage I'm aware of, so I didn't

Adding ApplicativeDo globally also had a slight positive effect, but that's a separate issue.

turion commented 1 year ago

My one request would be an update to the documentation here: https://monad-bayes.netlify.app/usage/#population

Yes, on it.

but would be nice to check if it's straightforward for you to update the docs.

Very good point.

I have trouble understanding the current documentation:

Population m a ~ m [Log Double -> (a, Log Double)]

Shouldn't it be:

Population m a ~ m [(a, Log Double)]
reubenharry commented 1 year ago
newtype Population m a = Population (Weighted (ListT m) a)

Weighted is the State monad. (By the way, it's possible that one of the optimized Writer or accum monads could be used here, but that's for a separate issue).

ohad commented 1 year ago

Nice! Since this implementation is computing things differently, not just faster, we should at least try to check that it's producing the correct results, not just producing results faster.

turion commented 1 year ago

Weighted is the State monad. (By the way, it's possible that one of the optimized Writer or accum monads could be used here, but that's for a separate issue).

Ah, true. I guess it's implemented as State for efficiency reasons, but its interface only allows Writer operations. Ironically in the ListT situation that's an important distinction because State is not commutative, but Writer for a commutative monoid like Log Double is.

I wonder what we should do documentation-wise. I'd find it simpler to explain that Weighted is a Writer, so we can't arbitrarily change the weight of a path, but only adding to its log. But that's a separate issue.

ohad commented 1 year ago

We should understand what the proposed change is doing a bit more before merging. I wrote it in issue #245, but it also belongs here.

The paper and its follow-up used the monad-structure transformer ListT because the computation it produces makes the correct samples and in the correct order. @adscib (and later the Tweag team) have tested that the resulting inference algorithms produces the expected distributions when run on models, and the paper proves the algorithms, when working on these data structures, are correct.

In general, free monads may do something different, for example, suspend random samples that some of the monad-bayes combinators do in crucial places. This might mean, for example, that using the same combinations of monad-bayes's building blocks to implement algorithms such as smc, smc^2 etc. are not longer correct.

I suggest we first understand better how this proposed change affects the existing inference algorithms, not just in runtime, but in the actual samples they compute, both by testing and using an accompanying formalisation.

reubenharry commented 1 year ago

Ah, true. I guess it's implemented as State for efficiency reasons, but its interface only allows Writer operations.

That's right. I'm still surprised that there isn't a Writer implementation that is suitably efficient (or even just uses State under the hood), but I don't know enough about performance to solve that problem. (I tried the continuized Writer, and I recall seeing mild speedups or at least, no slowdowns, although I might be misremembering).

I wonder what we should do documentation-wise. I'd find it simpler to explain that Weighted is a Writer, so we can't arbitrarily change the weight of a path, but only adding to its log. But that's a separate issue.

I agree in theory, but I think this would just cause extra confusion when anyone looked at the code and saw State, not Writer.

turion commented 8 months ago

I suggest we first understand better how this proposed change affects the existing inference algorithms, not just in runtime, but in the actual samples they compute, both by testing and using an accompanying formalisation.

I tried to go for a conservative approach now that keeps an Applicative PopulationT transformer which has the same semantics as before, and is used in all places where needed. Unfortunately it isn't clear to me where to appropriately flatten the free population trees to make the SMC2 algorithm work. I tried all places I could think of, but ultimately didn't succeed in fixing the corresponding test. (On the other hand, I don't know whether SMC2 was correct to begin with.)

One learning is definitely that using an unlawful monad is a maintenance nightmare. Replacing it with a lawful one and then trying to find all the places where the particular flattening behaviour was intended and where it was accidental is like looking for a needle in a haystack. If someone else wants to give this a go, I'm happy to hand over and advise, but I've done enough work on this now for free.

Another thought I had while working on this issue: SequentialT is used pretty much like a free monad transformer. But the disadvantage is that it is defined using a library that is not very well known and popular. Possibly, we should do away with it completely and just use free monads instead. This might clarify the PopulationT story.

reubenharry commented 8 months ago

I also had that thought re. SequentialT, and attempted to replace with FreeT and a catamorphism, but discovered that my understanding was not good, because the "folding" in the coroutine operations was quite different to what I'd imagined. You might have more luck though.