Open turion opened 1 year ago
Another axis along which it would be very useful to extend the benchmarks: running them for different commits. This could also be done with nix by providing different sources for the monad-bayes package to the benchmark.
I'm sort of ok leaving out the Anglican and WebPPL baselines, for two reasons:
Running for different commits would be helpful. Something else I would have found helpful in the past is a very simple command I can run that just tells me: did the changes I committed slow down or speed up the benchmark? One can get this information indirectly via criterion, but I found it cumbersome.
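To make the "did it get slower or faster?" check concrete, here is a rough sketch of what such a tool could look like. It assumes the benchmark is run once per commit with criterion's CSV output enabled, and that the resulting files have `Name` and `Mean` columns. The function names and the 5% threshold are made up for illustration and are not part of monad-bayes or criterion.

```python
import csv
import io


def read_means(csv_text):
    """Map benchmark name -> mean time, read from CSV text.

    Assumption: the CSV has at least the columns `Name` and `Mean`.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["Name"]: float(row["Mean"]) for row in reader}


def compare(before_csv, after_csv, threshold=0.05):
    """Classify each benchmark present in both runs.

    Returns a dict: name -> (ratio, verdict), where ratio is after / before
    and verdict is "slower", "faster", or "unchanged". Changes smaller than
    `threshold` (a hypothetical 5% default) are treated as noise.
    """
    before = read_means(before_csv)
    after = read_means(after_csv)
    report = {}
    for name in sorted(before.keys() & after.keys()):
        if before[name] <= 0:
            continue  # skip degenerate entries rather than divide by zero
        ratio = after[name] / before[name]
        if ratio > 1 + threshold:
            verdict = "slower"
        elif ratio < 1 - threshold:
            verdict = "faster"
        else:
            verdict = "unchanged"
        report[name] = (ratio, verdict)
    return report


def format_report(report):
    """Render one line per benchmark, e.g. for printing in CI."""
    return "\n".join(
        f"{name}: {verdict} ({ratio:.2f}x)"
        for name, (ratio, verdict) in report.items()
    )
```

Usage would be something like `print(format_report(compare(open("before.csv").read(), open("after.csv").read())))`, with the two files produced from the two commits being compared. A fixed threshold ignores the confidence intervals criterion reports, so this is only a quick first signal, not a replacement for reading the full report.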
At some point in the past, the benchmarks included Anglican and WebPPL. They don't anymore because a typical developer can't be expected to install these. It would be nice if the benchmarks were packaged as nix derivations, and the Anglican and WebPPL versions were restored as part of that. If they are implemented in nix, they're less likely to break, and can be run by everyone. In case this proves too hard, we could try to make a Stan or PyMC benchmark for comparison instead.