stan-dev / docs

Documentation for the Stan language and CmdStan
https://mc-stan.org/docs/
Other
37 stars 107 forks source link

document normal sufficient stats in User's Guide efficiency chapter #770

Closed bob-carpenter closed 3 months ago

bob-carpenter commented 3 months ago

Summary:

We should talk about how to replace a normal over a data set with sufficient statistics.

Description:

These models induce the same posteriors over mu and sigma. You can swap out the prior for something else.

normal.stan

data {
  int N;
  vector[N] y;
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  y ~ normal(mu, sigma);
}

normal-suff.stan

data {
  int N;
  vector[N] y;
}
transformed data {
  real mean_y = mean(y);
  real<lower=0> var_y = variance(y);
  real nm1_over2 = (N - 1) / 2.0;
  real sqrt_N = sqrt(N);
}
parameters {
  real mu;
  real<lower=0> sigma;
}
model {
  mean_y ~ normal(mu, sigma / sqrt_N);
  var_y ~ gamma(nm1_over2, nm1_over2 / sigma^2);
}

And you can add priors for mu and sigma and still get the same results. Here's some simulation code in Python using cmdstanpy:

import numpy as np
import cmdstanpy as csp

N = 1000
mu = -1.3
sigma = 3.2
y = np.random.normal(size=1000, loc=mu, scale=sigma)
data = {'N': N, 'y': y}

for path in ['normal.stan', 'normal-suff.stan']:
    print(f"{path=}")
    model = csp.CmdStanModel(stan_file = path)
    fit = model.sample(data = data)
    print(fit.summary())

Additional Information:

This originated from a forum discussion:

https://discourse.mc-stan.org/t/fitting-a-simple-normal-model-in-stan-conditional-on-sufficient-statistics-jeffreys-prior/34810/7

Current Version:

v2.18.0

WardBrian commented 3 months ago

772