probcomp / GenExperimental.jl

Featherweight embedded probabilistic programming language and compositional inference programming library
MIT License

Tests of samplers for built-in primitives #55

Open marcoct opened 6 years ago

marcoct commented 6 years ago

Currently, most of the built-in probabilistic primitives only have their density functions tested (at a single test point). The samplers are untested. It should be possible to develop a standard testing scheme that gives confidence that the sampler conforms to the corresponding density function, possibly by estimating KL divergences. At a high level, the design could continue sampling (gaining more confidence in the fidelity of the sampler) until either a sufficient level of confidence is reached or a timeout is reached, where exceeding the timeout causes a test failure.
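
A minimal sketch of what such a loop could look like, assuming a discrete primitive so that the KL divergence can be estimated directly from counts; `adaptive_kl_test`, `sampler`, and the tolerance/timeout parameters are illustrative names, not part of the library:

```julia
# Sketch (not GenExperimental's API) of the proposed adaptive test for a
# discrete primitive: draw batches, estimate KL(empirical || density), and
# pass once the estimate drops below a tolerance, or fail on timeout.
function adaptive_kl_test(sampler, probs::Vector{Float64};
                          batch=10_000, tol=1e-3, timeout=10.0)
    counts = zeros(Int, length(probs))
    start = time()
    while time() - start < timeout
        for _ in 1:batch
            counts[sampler()] += 1
        end
        empirical = counts ./ sum(counts)
        # KL(empirical || probs); concentrates near 0 as the sample grows,
        # provided the sampler actually matches the density.
        kl = sum(p > 0 ? p * log(p / q) : 0.0 for (p, q) in zip(empirical, probs))
        kl < tol && return true
    end
    return false  # exceeding the timeout is treated as a test failure
end

# Example: a categorical sampler implemented by inverse-CDF sampling.
probs = [0.2, 0.5, 0.3]
sampler() = searchsortedfirst(cumsum(probs), rand())
@assert adaptive_kl_test(sampler, probs)
```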

fsaad commented 6 years ago

Related: https://github.com/probcomp/cgpm/issues/14

Related: https://github.com/posterior/goftests

Thoughts:

The Research label is indeed appropriate, and very interesting! The statistical problem this ticket touches on is "goodness-of-fit" (GOF) testing: are samples X drawn from distribution F? The majority of frequentist goodness-of-fit tests in the continuous setting (Kolmogorov-Smirnov, Cramér-von Mises, Anderson-Darling) are (i) univariate, and (ii) based on the CDF representation of the distribution. While the literature on non-parametric techniques for GOF testing is vast (I can provide references if interested, including recent NPB techniques), I do not know of other methods that are routinely used as statistical tests in related fields.
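
For the univariate, CDF-based tests mentioned above, HypothesisTests.jl already has off-the-shelf implementations; a rough sketch of pointing one at a sampler, using Normal(0, 1) purely as a stand-in for a built-in primitive:

```julia
# Sketch: frequentist GOF check of a univariate sampler against its claimed
# distribution, via a one-sample Kolmogorov-Smirnov test from HypothesisTests.jl.
using Distributions, HypothesisTests

xs = rand(Normal(0, 1), 10_000)                  # samples from the sampler under test
t = ApproximateOneSampleKSTest(xs, Normal(0, 1)) # compare against the claimed CDF
pvalue(t) > 0.01 || error("sampler disagrees with the claimed distribution")
```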

Here is one idea for the univariate setting, using the differential entropy as a test statistic:

  1. Generate i.i.d samples X1, X2, ... Xn ~ \simulate();
  2. Estimate the differential entropy H(X) \approx \frac{1}{n} \sum_{i=1}^{n} -\logpdf(x_i) using Monte Carlo;
  3. Compute the entropy "exactly" using quadrature;
  4. If \simulate agrees with \logpdf, the MC estimate should approach the entropy computed by quadrature.

It should be possible to compute large-sample error bounds for the MC estimator and obtain confidence intervals.
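
A sketch of steps 1-4 under these assumptions, using Normal(0, 1) as a stand-in for \simulate/\logpdf and QuadGK.jl for the quadrature:

```julia
# Sketch of the entropy-based check above, for a univariate primitive.
using Distributions, QuadGK, Statistics

d = Normal(0, 1)
n = 100_000

# 1-2. Monte Carlo estimate of the differential entropy H = E[-logpdf(X)].
xs = rand(d, n)
vals = [-logpdf(d, x) for x in xs]
H_mc = mean(vals)
se = std(vals) / sqrt(n)            # large-sample standard error of the estimate

# 3. "Exact" entropy by quadrature: -∫ pdf(x) logpdf(x) dx.
H_quad, _ = quadgk(x -> -pdf(d, x) * logpdf(d, x), -Inf, Inf)

# 4. If the sampler matches the density, the MC estimate should lie within a
#    few standard errors of the quadrature value.
@assert abs(H_mc - H_quad) < 4 * se
```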