jgabry opened 5 years ago
@avehtari do you have some good example models for this? I can help with writing.
Is there a reference I could read to understand what importance_resampling is doing? Thank you
The reference is https://arxiv.org/abs/1507.02646. Thanks for the reminder about this issue. In addition to the reference, we need to write a bit more documentation.
With algorithm='optimizing' Stan first finds the maximum a posteriori (MAP) solution using the L-BFGS algorithm. Approximate posterior draws are then obtained by sampling from a normal distribution centered at the mode, with covariance set from the second derivatives (Hessian) at the mode. Treating this normal approximation as the importance sampling proposal distribution, we can estimate the accuracy of the approximation using the Pareto-k (khat) and effective sample size (n_eff) diagnostics (Vehtari et al., 2019). Furthermore, if the Pareto-k diagnostic value is good (khat < 0.7), stratified importance resampling (Kitagawa, 1996) with Pareto smoothed importance sampling weights (Vehtari et al., 2019) is used to provide an improved set of draws, which can then be used as usual posterior draws with equal weights. The effective sample size (n_eff) estimate takes into account that some draws may be repeated in the sample.
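The mechanics can be sketched in a few lines. The following is an illustrative toy in Python, not rstanarm's actual implementation: it uses a hand-picked Gamma(3,1) "posterior" with a known mode and Hessian, raw (not Pareto smoothed) weights, and plain multinomial resampling where Stan uses stratified resampling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target: Gamma(3,1) log-density up to a constant, 2*log(x) - x for x > 0.
# The normal approximation at the mode is imperfect, so weights are non-uniform.
def log_p(x):
    return np.where(x > 0, 2.0 * np.log(np.maximum(x, 1e-300)) - x, -np.inf)

# Mode and Hessian of log p (computed by hand for this toy):
# d/dx log p = 2/x - 1 = 0  =>  mode = 2;  d2/dx2 log p = -2/x^2 = -0.5 at the mode
mode, hess = 2.0, -0.5
sd = np.sqrt(-1.0 / hess)

# 1) Draw from the normal (Laplace) approximation q
n = 4000
draws = rng.normal(mode, sd, size=n)

# 2) Importance weights w = p/q, computed on the log scale for stability
log_q = -0.5 * ((draws - mode) / sd) ** 2 - np.log(sd * np.sqrt(2 * np.pi))
log_w = log_p(draws) - log_q
log_w -= np.max(log_w)            # stabilise before exponentiating
w = np.exp(log_w)
w /= w.sum()

# 3) Resample with probabilities w to get equally weighted draws
#    (multinomial here; Stan uses stratified resampling with PSIS weights)
resampled = rng.choice(draws, size=n, replace=True, p=w)

# The resampled mean should be close to the Gamma(3,1) mean of 3,
# even though the proposal was centered at the mode, 2
print(resampled.mean())
```

Because the resampled draws are equally weighted, they can be fed into downstream summaries exactly like MCMC draws; the repetition of some draws is what the n_eff estimate then accounts for.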
Kitagawa, G. (1996). Monte Carlo filter and smoother for non-Gaussian nonlinear state space models. Journal of Computational and Graphical Statistics, 5(1):1-25.
Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2019). Pareto smoothed importance sampling. arXiv preprint arXiv:1507.02646.
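For intuition about the Pareto-k (khat) diagnostic mentioned above: it is the estimated shape of a generalized Pareto distribution fitted to the upper tail of the importance weights, and khat < 0.7 indicates the weight distribution is well enough behaved for the resampling to be reliable. Here is a crude Hill-type sketch in Python; the actual PSIS diagnostic fits the generalized Pareto with the Zhang and Stephens (2009) method, so treat this only as an illustration of what khat measures.

```python
import numpy as np

# Crude tail-shape diagnostic for importance weights: a Hill-type estimate of
# the Pareto shape k from the largest weights (illustrative, not PSIS itself).
def khat_hill(weights, tail_frac=0.2):
    w = np.sort(np.asarray(weights, dtype=float))
    m = max(5, int(tail_frac * len(w)))   # number of tail weights used
    tail = w[-m:]
    u = w[-m - 1]                         # threshold just below the tail
    return np.mean(np.log(tail / u))      # khat = 1/alpha (Pareto tail index)

rng = np.random.default_rng(1)

# Light-tailed weights (a well-behaved proposal): khat comes out small
good = rng.exponential(size=5000)

# Pareto weights with shape k = 1 (infinite variance): khat comes out near 1
bad = (1.0 - rng.uniform(size=5000)) ** -1.0

print(khat_hill(good), khat_hill(bad))
```

With heavy-tailed weights (khat above 0.7) a handful of draws dominate the resample, which is why rstanarm warns rather than silently returning draws in that case.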
Example models with simulated big data https://avehtari.github.io/ROS-Examples/Scalability/scalability.html
EDIT: fixed the changed link
Thank you very much for getting back on this so quickly.
Example models with simulated big data https://avehtari.github.io/RAOS-Examples/BigData/bigdata.html
The link is broken and I don't see the "BigData" application on https://github.com/avehtari/ROS-Examples. Is this still available somewhere?
The example has moved to https://avehtari.github.io/ROS-Examples/Scalability/scalability.html (I also edited my comment above to have the new link)
@avehtari @bgoodri Currently there is very little documentation explaining the new default behavior for optimizing (and soon VB too). It's good to have the new warning messages and the docs for the new arguments, but they don't really explain what actually happens when importance_resampling is TRUE or what the diagnostics mean. Anyone using optimization will now suddenly start seeing new warnings and different results, so it would be good to explain this somewhere. A vignette would be great, but even something more minimal would be OK for now.