@ThomasColthurst Thanks!
The `logp` method should give the density of the posterior predictive distribution over the next datapoint. That is, in general, the setup is:

- `logp` should compute $\log p(x_{n+1}=x \mid x_1, \dots, x_n) = \log \int p(x \mid \theta) \, p(\theta \mid x_1, \dots, x_n; \alpha) \, d\theta$, the posterior predictive density.
- `logp_score` should compute $\log p(x_1, \dots, x_n) = \log \int p(x_1, \dots, x_n \mid \theta) \, p(\theta; \alpha) \, d\theta$, the marginal likelihood.

And so both `logp` and `logp_score` do some integration over the parameter $\theta$.
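One step worth spelling out (this is standard conjugacy bookkeeping, not something specific to this repo): the posterior predictive is a ratio of marginal likelihoods,

$$\log p(x_{n+1}=x \mid x_1, \dots, x_n; \alpha) = \log p(x_1, \dots, x_n, x; \alpha) - \log p(x_1, \dots, x_n; \alpha),$$

so `logp(x)` can be computed as the `logp_score` of the data with $x$ incorporated, minus the `logp_score` of the data without it. The reference implementation linked below appears to follow this pattern.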
I believe this is a reference implementation of the logic that `logp` should use in the NIGNormal case: https://github.com/probcomp/cgpm/blob/56a481829448bddc9cdfebd42f65023287d5b7c7/src/primitives/normal.py#L166-L173
And for `logp_score` (your code may already do this, I haven't checked carefully): https://github.com/probcomp/cgpm/blob/56a481829448bddc9cdfebd42f65023287d5b7c7/src/primitives/normal.py#L176-L181
(Note that these implementations are based on a strategy that maintains three pieces of state: `N`, the sum of the incorporated $x$s, and the sum of their squares.)
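To make the sufficient-statistics strategy concrete, here is a minimal self-contained sketch in that style. The hyperparameter names (`m`, `r`, `s`, `nu`) and the helper functions are assumptions modeled on the linked cgpm code, not a verbatim copy; the key point is that `N`, `sum_x`, and `sum_sq_x` are enough to compute both quantities in closed form:

```python
import math

from scipy.special import gammaln

LOG2PI = math.log(2 * math.pi)


class NIGNormalSketch:
    """Hypothetical Normal model with a normal-inverse-gamma (NIG) prior.

    The hyperparameters (m, r, s, nu) follow the parameterization used
    in the linked cgpm reference; only N, sum_x, and sum_sq_x are stored.
    """

    def __init__(self, m=0.0, r=1.0, s=1.0, nu=1.0):
        self.m, self.r, self.s, self.nu = m, r, s, nu
        self.N = 0
        self.sum_x = 0.0
        self.sum_sq_x = 0.0

    def incorporate(self, x):
        # O(1) update of the three sufficient statistics.
        self.N += 1
        self.sum_x += x
        self.sum_sq_x += x * x

    def _posterior_hypers(self, N, sum_x, sum_sq_x):
        # Standard conjugate updates for the NIG prior given the statistics.
        rn = self.r + N
        nun = self.nu + N
        mn = (self.r * self.m + sum_x) / rn
        sn = self.s + sum_sq_x + self.r * self.m ** 2 - rn * mn ** 2
        return rn, nun, mn, sn

    @staticmethod
    def _log_Z(r, s, nu):
        # Log normalizer of the NIG distribution in this parameterization.
        return (
            ((nu + 1.0) / 2.0) * math.log(2.0)
            + 0.5 * math.log(math.pi)
            - 0.5 * math.log(r)
            - (nu / 2.0) * math.log(s)
            + gammaln(nu / 2.0)
        )

    def logp_score(self):
        # Marginal likelihood: ratio of posterior to prior normalizers.
        rn, nun, _, sn = self._posterior_hypers(self.N, self.sum_x, self.sum_sq_x)
        Z0 = self._log_Z(self.r, self.s, self.nu)
        ZN = self._log_Z(rn, sn, nun)
        return -(self.N / 2.0) * LOG2PI + ZN - Z0

    def logp(self, x):
        # Posterior predictive: marginal with x included minus marginal without,
        # i.e., the identity from the previous comment.
        rn, nun, _, sn = self._posterior_hypers(self.N, self.sum_x, self.sum_sq_x)
        rm, num, _, sm = self._posterior_hypers(
            self.N + 1, self.sum_x + x, self.sum_sq_x + x * x)
        ZN = self._log_Z(rn, sn, nun)
        ZM = self._log_Z(rm, sm, num)
        return -0.5 * LOG2PI + ZM - ZN


model = NIGNormalSketch()
for x in [1.0, 2.0, 0.5]:
    model.incorporate(x)
print(model.logp(1.5))     # posterior predictive log density at x = 1.5
print(model.logp_score())  # log marginal likelihood of the three points
```

Computing `logp` as a difference of two marginal-likelihood normalizers keeps it consistent with `logp_score` by construction, and the three-statistic state also makes `unincorporate` O(1): subtract $x$ and $x^2$ and decrement `N`.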
Minor addendum: the posterior predictive density and marginal likelihood depend on the hyperparameters $\alpha$, i.e., they are really $\log p(x_{n+1}=x \mid x_1, \dots, x_n; \alpha)$ and $\log p(x_1, \dots, x_n; \alpha)$.
Thanks for the comments. I've addressed them in https://github.com/probcomp/hierarchical-irm/pull/12, which supersedes this.
(This is @ThomasColthurst's PR -- just moving it to this repo rather than probsys/hierarchical-irm)