aesara-devs / aemcmc

AeMCMC is a Python library that automates the construction of samplers for Aesara graphs representing statistical models.
https://aemcmc.readthedocs.io/en/latest/
MIT License

Design robust calibration tests for samplers #30

Open rlouf opened 2 years ago

rlouf commented 2 years ago

aemcmc currently does not implement calibration tests for its samplers (i.e., it does not check that the samplers generate samples from the correct distribution), and aehmc's tests are very flaky: they can be made to pass or fail simply by changing the RNG seed. While this is still a very open area of research, making sure that the samplers aemcmc builds generate correct samples is critical.

This blog post summarizes the situation fairly well and links to the relevant literature.

My current understanding from the literature is that existing solutions are hypothesis tests, and that they test hypotheses we don't expect to be true. There seems to be (unsurprisingly) little hope of devising tests that require absolutely no supervision. However, we could automate some of Gelman's suggestions in the post above. The general idea would be to have some rough tests in CI to catch obvious miscalibration (check that the chain has moved on a simple problem, check the Monte Carlo error, etc.) and to give users some tools to evaluate the calibration of the sampler on their use case.
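To make the "rough tests in CI" idea concrete, here is a minimal sketch of the two checks mentioned above, written with NumPy only. Everything in it is an assumption for illustration: `random_walk_metropolis` is a toy stand-in for a sampler under test (it is not an aemcmc or aehmc API), the helper names and thresholds are hypothetical, and the Monte Carlo standard error is estimated with plain non-overlapping batch means rather than whatever estimator the project might eventually choose.

```python
import numpy as np


def random_walk_metropolis(n_samples, step=1.0, seed=0):
    """Toy sampler targeting N(0, 1); a hypothetical stand-in for a sampler under test."""
    rng = np.random.default_rng(seed)
    samples = np.empty(n_samples)
    x = 0.0
    for i in range(n_samples):
        proposal = x + step * rng.normal()
        # Log acceptance ratio for a standard normal target.
        log_ratio = 0.5 * (x**2 - proposal**2)
        if np.log(rng.uniform()) < log_ratio:
            x = proposal
        samples[i] = x
    return samples


def chain_has_moved(samples, min_moved_fraction=0.1):
    """Check that at least `min_moved_fraction` of transitions changed the state."""
    samples = np.asarray(samples, dtype=float)
    moved = np.diff(samples) != 0
    return bool(moved.mean() >= min_moved_fraction)


def batch_means_mcse(samples, n_batches=20):
    """Estimate the Monte Carlo standard error of the mean with batch means."""
    samples = np.asarray(samples, dtype=float)
    batch_size = samples.size // n_batches
    trimmed = samples[: batch_size * n_batches]
    batch_means = trimmed.reshape(n_batches, batch_size).mean(axis=1)
    # Batches are treated as approximately independent draws of the mean.
    return float(batch_means.std(ddof=1) / np.sqrt(n_batches))


def mean_within_mcse(samples, true_mean, n_sigma=4.0):
    """Check that the sample mean is within `n_sigma` standard errors of `true_mean`."""
    error = abs(float(np.mean(samples)) - true_mean)
    return bool(error <= n_sigma * batch_means_mcse(samples))
```

A CI test could then draw, say, 20,000 samples from a simple target with a known mean and assert both `chain_has_moved(samples)` and `mean_within_mcse(samples, 0.0)`. A check like this only catches gross errors (a stuck chain, a clearly biased mean), and the tolerance `n_sigma` trades off false alarms against sensitivity, so it does not remove the seed-dependence problem described above; it only makes the failure probability explicit.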