jgabry opened 5 years ago
@avehtari do you have some good example models for this? I can help with writing.
Is there a reference I could read to understand what importance_resampling is doing? Thank you
The reference is https://arxiv.org/abs/1507.02646. Thanks for the reminder about this issue. In addition to the reference, we need to write a bit more documentation.
With algorithm='optimizing' Stan first finds the maximum a posteriori (MAP) solution using the L-BFGS algorithm. Approximate posterior draws are then obtained by sampling from a normal distribution centered at the mode, with covariance set from the second derivatives (Hessian) at the mode. Treating this normal approximation as the importance sampling proposal distribution, we can estimate the accuracy of the approximation using the Pareto-k (khat) and effective sample size (n_eff) diagnostics (Vehtari et al., 2019). Furthermore, if the Pareto-k diagnostic value is good (khat < 0.7), stratified importance resampling (Kitagawa, 1996) with Pareto smoothed importance sampling weights (Vehtari et al., 2019) is used to provide an improved set of draws, which can then be used as usual posterior draws with equal weights. The effective sample size (n_eff) estimate takes into account that some draws may be repeated in the sample.
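The mechanics can be sketched in a few lines. The following is an illustrative toy in Python, not rstanarm's actual implementation: it uses a hand-picked Gamma(3,1) "posterior" with a known mode and Hessian, raw (not Pareto smoothed) weights, and plain multinomial resampling where Stan uses stratified resampling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target: Gamma(3,1) log-density up to a constant, 2*log(x) - x for x > 0.
# The normal approximation at the mode is imperfect, so weights are non-uniform.
def log_p(x):
    return np.where(x > 0, 2.0 * np.log(np.maximum(x, 1e-300)) - x, -np.inf)

# Mode and Hessian of log p (computed by hand for this toy):
# d/dx log p = 2/x - 1 = 0  =>  mode = 2;  d2/dx2 log p = -2/x^2 = -0.5 at the mode
mode, hess = 2.0, -0.5
sd = np.sqrt(-1.0 / hess)

# 1) Draw from the normal (Laplace) approximation q
n = 4000
draws = rng.normal(mode, sd, size=n)

# 2) Importance weights w = p/q, computed on the log scale for stability
log_q = -0.5 * ((draws - mode) / sd) ** 2 - np.log(sd * np.sqrt(2 * np.pi))
log_w = log_p(draws) - log_q
log_w -= np.max(log_w)            # stabilise before exponentiating
w = np.exp(log_w)
w /= w.sum()

# 3) Resample with probabilities w to get equally weighted draws
#    (multinomial here; Stan uses stratified resampling with PSIS weights)
resampled = rng.choice(draws, size=n, replace=True, p=w)

# The resampled mean should be close to the Gamma(3,1) mean of 3,
# even though the proposal was centered at the mode, 2
print(resampled.mean())
```

Because the resampled draws are equally weighted, they can be fed into downstream summaries exactly like MCMC draws; the repetition of some draws is what the n_eff estimate then accounts for.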
Kitagawa, G. (1996). Monte Carlo filter and smoother for non-Gaussian nonlinear state space models. Journal of Computational and Graphical Statistics, 5(1):1-25.
Vehtari, A., Simpson, D., Gelman, A., Yao, Y., and Gabry, J. (2019). Pareto smoothed importance sampling. arXiv preprint arXiv:1507.02646.
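For intuition about the Pareto-k (khat) diagnostic mentioned above: it is the estimated shape of a generalized Pareto distribution fitted to the upper tail of the importance weights, and khat < 0.7 indicates the weight distribution is well enough behaved for the resampling to be reliable. Here is a crude Hill-type sketch in Python; the actual PSIS diagnostic fits the generalized Pareto with the Zhang and Stephens (2009) method, so treat this only as an illustration of what khat measures.

```python
import numpy as np

# Crude tail-shape diagnostic for importance weights: a Hill-type estimate of
# the Pareto shape k from the largest weights (illustrative, not PSIS itself).
def khat_hill(weights, tail_frac=0.2):
    w = np.sort(np.asarray(weights, dtype=float))
    m = max(5, int(tail_frac * len(w)))   # number of tail weights used
    tail = w[-m:]
    u = w[-m - 1]                         # threshold just below the tail
    return np.mean(np.log(tail / u))      # khat = 1/alpha (Pareto tail index)

rng = np.random.default_rng(1)

# Light-tailed weights (a well-behaved proposal): khat comes out small
good = rng.exponential(size=5000)

# Pareto weights with shape k = 1 (infinite variance): khat comes out near 1
bad = (1.0 - rng.uniform(size=5000)) ** -1.0

print(khat_hill(good), khat_hill(bad))
```

With heavy-tailed weights (khat above 0.7) a handful of draws dominate the resample, which is why rstanarm warns rather than silently returning draws in that case.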
Example models with simulated big data https://avehtari.github.io/ROS-Examples/Scalability/scalability.html
EDIT: fixed the changed link
Thank you very much for getting back on this so quickly.
Example models with simulated big data https://avehtari.github.io/RAOS-Examples/BigData/bigdata.html
The link is broken and I don't see the "BigData" application on https://github.com/avehtari/ROS-Examples. Is this still available somewhere?
The example has moved to https://avehtari.github.io/ROS-Examples/Scalability/scalability.html (I also edited my comment above to have the new link)
@avehtari @bgoodri Currently there is very little documentation explaining the new default behavior for optimizing (and soon VB too). It's good to have the new warning messages and the docs for the new arguments, but they don't really explain what actually happens when importance_resampling is TRUE or what the diagnostics mean. Anyone using optimization will now suddenly start seeing new warnings and different results, so it would be good to explain this somewhere. A vignette would be great, but even something more minimal would be OK for now.