Adding subsampling method for PSIS-LOO-CV?

hcp4715 commented 2 years ago

Tell us about it

Dear developers,

Thanks a lot for this great package!

Recently, I have been trying to apply az.loo() to a hierarchical drift diffusion model fitted with HDDM. However, I found that the k-hat values are very high for many observations. While looking for a solution, I came across Kelter (2021), which showed that "PSIS-LOO-CV using subsampling (and posterior approximation) yields reliable ELPD estimates for model comparison when traditional methods like LOO-CV, WAIC, IS-LOO-CV or standard PSIS-LOO-CV fail." I then found that this approach has already been implemented in R.

So I am wondering whether there is a plan to make this method available in the near future?

Thoughts on implementation

R code: https://github.com/stan-dev/loo/blob/master/R/loo_subsample.R

Reference: Kelter, R. (2021). Bayesian model selection in the M-open setting—Approximate posterior inference and subsampling for efficient large-scale leave-one-out cross-validation via the difference estimator. Journal of Mathematical Psychology, 100, 102474. https://doi.org/10.1016/j.jmp.2020.102474
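
To make the request more concrete, here is a minimal sketch of the difference estimator that subsampled LOO builds on. It is not the code from the R package or from ArviZ. The function name `difference_estimator` and the synthetic numbers are hypothetical, and the sketch assumes a cheap pointwise approximation is available for all n observations while accurate pointwise values are computed only for a simple random subsample of size m drawn without replacement.

```python
import random
import statistics


def difference_estimator(approx, exact_subsample, indices):
    """Estimate a total ELPD from a subsample (hypothetical helper).

    approx          -- cheap pointwise approximation for all n observations
    exact_subsample -- accurate pointwise values for the sampled observations
    indices         -- positions of the sampled observations (needs m >= 2)
    """
    n = len(approx)
    m = len(indices)
    # Approximation error on the subsample only.
    errors = [exact - approx[i] for exact, i in zip(exact_subsample, indices)]
    # Sum of the cheap approximation plus the scaled-up correction.
    estimate = sum(approx) + n / m * sum(errors)
    # Variance under simple random sampling without replacement.
    variance = n**2 * (1 - m / n) * statistics.variance(errors) / m
    return estimate, variance**0.5


# Synthetic stand-ins for pointwise values (assumption: not real model output).
rng = random.Random(0)
n, m = 1000, 100
exact_all = [rng.gauss(-1.0, 0.5) for _ in range(n)]
approx = [x + rng.gauss(0.0, 0.05) for x in exact_all]
indices = rng.sample(range(n), m)

estimate, se = difference_estimator(
    approx, [exact_all[i] for i in indices], indices
)
print(f"subsampled estimate: {estimate:.2f} (subsampling SE {se:.2f})")
print(f"full-data sum:       {sum(exact_all):.2f}")
```

The closer the cheap approximation tracks the accurate values, the smaller the subsampling error, which is why only a fraction of the observations needs the expensive computation.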

hcp4715 commented 2 years ago

Dear all,

I tried to reproduce the subsampling example from the R package loo: https://cran.r-project.org/web/packages/loo/vignettes/loo2-large-data.html#approximate-loo-cv-using-psis-loo-and-subsampling

It seems that I can get results similar to those in R. Here is my Jupyter notebook: https://gist.github.com/hcp4715/b899eb910a33040ee688000eeb9e0820

I will improve this further in the near future, because subsampling methods seem to be useful in cognitive modelling.

Any suggestions/comments are welcome!

OriolAbril commented 2 years ago

Hi, sorry for our lack of responsiveness. I really appreciate you sharing your work here. I am a bit swamped right now, but I am interested in this and I'll try to take a look.

arviz-devs / arviz

Adding subsampling method for PSIS-LOO-CV? #2024
