stan-dev / posteriordb

Database with posteriors of interest for Bayesian inference

Examples of using posteriordb to check a posterior, extend the local pdb #179

Open rok-cesnovar opened 4 years ago

rok-cesnovar commented 4 years ago

Hi @MansMeg,

we are looking into using posteriordb for regression testing (both numerical accuracy and speed) for the math/stan/cmdstan/stanc3 repository pipeline. There is a bunch of stuff we need to do before that, so the actual implementation is probably months away, but I wanted to start the discussion early.

A few questions, if you can help me or just point me to any docs I may have missed:

  1. Is there a recommended way to extend a local posteriordb with our own models and data?

  2. Does posteriordb currently provide a way to check a posterior, i.e. compare draws from a new run against the reference draws?

MansMeg commented 4 years ago
  1. It can be done in multiple ways. I don't have any documentation for it, but can add it if you like. The easiest is just to clone the repo, add models, and use the local version (see the sketch after this list). A question: why do you want to keep them locally? I would be happy to add them as long as they have Stan code and data.

  2. Currently not. The reason is that it is not clear exactly how to do that in the best way. If you have a metric or idea on a method to compare two multivariate distributions, then that can be done, no problem.
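
A minimal sketch of the clone-and-use-locally workflow from point 1, assuming the posteriordb Python package's `PosteriorDatabase(path)`, `posterior_names()`, `posterior()`, `model.code()` and `data.values()` interface; the local path is hypothetical and the posterior name is only an illustration.

```python
from posteriordb import PosteriorDatabase

# Point the client at the posterior_database directory of a local clone
# (hypothetical path); locally added posteriors then sit next to the
# published ones.
pdb = PosteriorDatabase("/path/to/posteriordb/posterior_database")
print(pdb.posterior_names())

posterior = pdb.posterior("eight_schools-eight_schools_centered")
print(posterior.model.code("stan"))  # Stan program
print(posterior.data.values())       # data as a Python dict
```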

rok-cesnovar commented 4 years ago
> 1. A question, why do you want to keep them locally?

Will definitely add those that make sense.

For some, it would make sense to generate a dataset on the fly (generate it from a script and save it to JSON), run it with a "verified" version (the last official release, for example), add the result locally to posteriordb as the "gold standard", and then run the validation with the actual posteriordb models and the locally added ones in a unified setting.
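
A hypothetical sketch of that generate-and-save step: simulate logistic-regression data with numpy and write it in the flat name-to-value layout that Stan's JSON data format expects. The dimensions and file name are made up for illustration.

```python
import json
import numpy as np

# Simulate a logistic regression dataset (sizes are arbitrary).
rng = np.random.default_rng(seed=1)
N, K = 1000, 5
X = rng.normal(size=(N, K))
beta = rng.normal(size=K)
p = 1.0 / (1.0 + np.exp(-(X @ beta)))
y = rng.binomial(1, p)

# Write it as flat name -> value pairs, as in Stan's JSON data format.
data = {"N": N, "K": K, "X": X.tolist(), "y": y.tolist()}
with open("logistic_regression_sim.json", "w") as f:
    json.dump(data, f)
```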

If we want to test a fairly large logistic regression, the dataset would be too large a file for a GitHub repo. And this is probably outside the typical use case/scope of posteriordb.

> 2. If you have a metric or idea on a method to compare two multivariate distributions

Not at the moment. I am also not the best person to suggest a metric/method. That is something that will have to be hashed out once we start implementing this, with input from smarter people than me, of course. If you have thoughts or ideas on how best to do this, I am happy to take any suggestions.

MansMeg commented 4 years ago

Alright, that makes sense.

  1. There are a couple of alternatives. I think Maximum Mean Discrepancy should be a good start, but I'm not sure since I have not yet dug into the literature.
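
For concreteness, a rough numpy sketch of an unbiased RBF-kernel MMD^2 estimate between two sets of draws (e.g. posteriordb reference draws vs. draws from the release under test). The median-heuristic bandwidth is just one common default, not something settled in this thread.

```python
import numpy as np

def mmd2_unbiased(x, y, bandwidth=None):
    """Unbiased MMD^2 between draw sets x (n, d) and y (m, d) with an RBF kernel."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    n, m = x.shape[0], y.shape[0]
    z = np.vstack([x, y])
    # All pairwise squared Euclidean distances between rows of x and y combined.
    sq = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    if bandwidth is None:
        # Median heuristic for the kernel bandwidth (an assumption, not a rule).
        bandwidth = np.sqrt(np.median(sq[sq > 0]) / 2)
    k = np.exp(-sq / (2 * bandwidth ** 2))
    kxx, kyy, kxy = k[:n, :n], k[n:, n:], k[:n, n:]
    term_xx = (kxx.sum() - np.trace(kxx)) / (n * (n - 1))
    term_yy = (kyy.sum() - np.trace(kyy)) / (m * (m - 1))
    return term_xx + term_yy - 2.0 * kxy.mean()

# draws_ref and draws_new would be (num_draws, num_parameters) arrays for the
# same posterior; a larger MMD^2 indicates a larger discrepancy between them.
```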