hyunjimoon opened this issue 3 years ago
This is a summary of proposed tweaks.
- BS1: BS mean (#: N_theta) & prior samples rank comparison
- BS2: BS mean (#: N_theta) & prior distribution comparison
- BS3: BS samples (#: N_theta * N_y) & prior distribution comparison
- IJ1: assume theta ~ N_data cov(loglik_draws, param_draws) [guess not]
Check Needed.
This NIPS paper might be relevant, @Dashadower.
- Draw fewer samples from the prior and many draws of $\tilde{y}$ for each, probing sensitivity around a particular $\tilde{\theta}$ while keeping $\tilde{y}$ close to its base value.
- Instead of drawing a new $\tilde{y}$ from each $\tilde{\theta}$, draw one, draw an MCMC sample of $\tilde{\theta}$, and then bootstrap or apply the infinitesimal jackknife (IJ).
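The two bullets above can be sketched end to end. Everything below is illustrative: the conjugate Normal-Normal toy model stands in for a real Stan fit so "fitting" is closed-form, and the bootstrap resamples $\tilde{y}$ rather than refitting on fresh simulations:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical conjugate model so "fitting" is closed-form:
# theta ~ Normal(0, 1), y_i | theta ~ Normal(theta, 1).
# A real run would replace this with an MCMC fit (e.g., Stan).
def fit_posterior(y, n_draws, rng):
    n = len(y)
    post_mean = y.sum() / (n + 1)
    post_sd = (1.0 / (n + 1)) ** 0.5
    return rng.normal(post_mean, post_sd, size=n_draws)

n_prior, n_data, n_post, n_boot = 20, 50, 100, 200
ranks = []
for _ in range(n_prior):                              # few prior draws
    theta_tilde = rng.normal()                        # \tilde{theta}
    y_tilde = rng.normal(theta_tilde, 1.0, n_data)    # one \tilde{y} per \tilde{theta}
    for _ in range(n_boot):                           # bootstrap y* of the same size as y
        y_star = rng.choice(y_tilde, size=n_data, replace=True)
        post = fit_posterior(y_star, n_post, rng)
        ranks.append(int((post < theta_tilde).sum()))

# Under a well-calibrated model, ranks should be roughly uniform on {0, ..., n_post}.
```

In this toy the per-bootstrap refit is free; the intended speedup comes from replacing it with an IJ or importance-reweighting approximation instead of a full refit.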
The core question is how to cover the region of the data space originally mapped from hundreds of prior samples. A new rank measure would be needed, since the one-to-one correspondence between one prior draw and M posterior draws no longer holds; the measure should be a set-to-set comparison. For MCMC convergence diagnostics, Rhat (the potential scale reduction factor) measures the factor by which the posterior variance could be reduced if the chains were run infinitely long. By analogy, iterative calibration based on prior and posterior sets could work; for instance:
- [between prior and posterior] comparing Var(theta) with Var(theta')
- [between posterior and posterior] comparing posteriors recovered from subsets of priors
Another approach could be to compare the IJ variance estimate with the empirical var(theta').
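As a minimal sketch of that comparison, reusing the toy conjugate model: the IJ-style variance below uses the sum of squared covariances between parameter draws and per-observation log-likelihood draws (the cov(loglik_draws, param_draws) quantity hinted at in IJ1). The exact IJ formula and all names here are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical conjugate setup: theta ~ Normal(0, 1), y_i | theta ~ Normal(theta, 1).
n_data, n_post = 200, 4000
theta_true = rng.normal()
y = rng.normal(theta_true, 1.0, n_data)

post_mean = y.sum() / (n_data + 1)
post_sd = (1.0 / (n_data + 1)) ** 0.5
theta_draws = rng.normal(post_mean, post_sd, n_post)   # posterior draws theta'

# Per-observation log-likelihood draws: loglik[s, i] = log N(y_i | theta_s, 1)
loglik = -0.5 * (y[None, :] - theta_draws[:, None]) ** 2 - 0.5 * np.log(2 * np.pi)

# IJ-style variance estimate: sum_i cov(theta_draws, loglik[:, i])^2
centered_theta = theta_draws - theta_draws.mean()
centered_ll = loglik - loglik.mean(axis=0)
cov_i = centered_theta @ centered_ll / n_post          # cov with each observation
var_ij = float(np.sum(cov_i ** 2))

# Empirical posterior variance Var(theta') to compare against
var_post = float(theta_draws.var())
```

In this well-specified conjugate example the two quantities should be of the same order; a large discrepancy would flag a calibration problem.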
Bootstrapped synthetic likelihood could avoid fitting the model N times: resampling from one set of recovered parameters, with some estimated variance, could approximate vanilla SBC's computation. The main idea is to get better mixing over y. Instead of drawing N (at least 1,000) sets of $\tilde{y}$ from each $\tilde{\theta}$ as in vanilla SBC, draw one, draw an MCMC sample of $\tilde{\theta}$, and then bootstrap or apply the IJ. Our target is to show that the rank statistics from N sets of $\theta$ in vanilla SBC are similar to those from N_theta * N_y sets: few draws from the prior and many draws of $\tilde{y}$. N_theta (low) * N_y (low) may be equal to or smaller than N, but since (d) replaces the heavy fitting process, a considerable speedup is expected.
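To check that target, the two rank samples can be compared set-to-set. One possible check (a hypothetical helper, not something from the thread) is the largest gap between the two empirical CDFs, a two-sample KS-style statistic:

```python
import numpy as np

def rank_ecdf_distance(ranks_a, ranks_b, max_rank):
    """Max gap between the empirical CDFs of two rank samples on {0, ..., max_rank}."""
    grid = np.arange(max_rank + 1)
    def ecdf(r):
        r = np.sort(np.asarray(r))
        return np.searchsorted(r, grid, side="right") / len(r)
    return float(np.abs(ecdf(ranks_a) - ecdf(ranks_b)).max())

# Example: two uniform rank samples (as vanilla SBC and the bootstrap
# approximation should both produce under calibration) stay close ...
rng = np.random.default_rng(3)
a = rng.integers(0, 101, size=5000)
b = rng.integers(0, 101, size=5000)
d_close = rank_ecdf_distance(a, b, 100)

# ... while a degenerate rank sample is flagged with a distance near 1.
d_far = rank_ecdf_distance(a, np.zeros(5000, dtype=int), 100)
```

A permutation test on this distance would give the comparison a p-value, if a formal decision rule is wanted.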
The size of $\tilde{y}^*$ should be the same as the size of $\tilde{y}$. The statistic being approximated is the posterior summary statistic used to compare the final output to the original prior; in our case, the rank statistic.
This is cheap compared to producing M independent summaries from the model when the simulator is computationally intensive. The reference discusses the bootstrap in an ABC context, which I believe applies equally in our setting: the requirements for ABC to be feasible, for SBC to have high power, and for this approximation to have a chance are of the same vein.