stan-dev / posterior

The posterior R package
https://mc-stan.org/posterior/

Add resample_draws #54

Closed avehtari closed 4 years ago

avehtari commented 4 years ago

Add a resample_draws function similar to thin_draws. thin_draws currently selects iterations like this:

sel_iterations <- seq(1, niterations, by = thin)
subset(x, iteration = sel_iterations)

resample_draws would take an optional weight vector and the name of a resampling algorithm, and sel_iterations would be determined by that algorithm.

Stratified resampling is currently used in rstan and rstanarm when importance sampling is used to improve the results of ADVI or optimizing.
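To make the idea concrete, here is a minimal base R sketch of how stratified resampling could turn a weight vector into selected indices. The helper name stratified_indices and its arguments are hypothetical; this is not the rstan/rstanarm code and not part of posterior.

```r
# Hypothetical helper (an assumption for illustration, not a posterior function):
# stratified resampling of indices according to non-negative weights.
stratified_indices <- function(weights, ndraws = length(weights)) {
  stopifnot(is.numeric(weights), all(weights >= 0), sum(weights) > 0)
  w <- weights / sum(weights)          # normalize weights to sum to 1
  cdf <- cumsum(w)                     # cumulative distribution over indices
  # one uniform point inside each of ndraws equal-width strata of (0, 1)
  u <- (seq_len(ndraws) - 1 + runif(ndraws)) / ndraws
  # invert the CDF: first index whose cumulative weight exceeds u
  idx <- findInterval(u, cdf) + 1L
  pmin(idx, length(w))                 # guard against rounding at the upper end
}

set.seed(123)
sel_iterations <- stratified_indices(c(0, 1, 1, 2), ndraws = 8)
```

Because each stratum contributes exactly one draw, the selected indices come out sorted, an index with zero weight is never selected, and indices with large weights are selected repeatedly. That repetition is why the result cannot be passed to a function that only supports true subsetting.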

The reference for different resampling algorithms: Kitagawa, G., Monte Carlo Filter and Smoother for Non-Gaussian Nonlinear State Space Models, Journal of Computational and Graphical Statistics, 5(1):1-25, 1996.

paul-buerkner commented 4 years ago

Good idea! The only technical problem we need to solve is that subset (or subset_draws) can only handle true subsetting so far; that is, it cannot select the same iteration twice, as some resampling methods require. We may need to extend subset_draws so that elements can be selected multiple times, at least when explicitly allowed by setting a flag. Or would it be better to add another method for this purpose?

MansMeg commented 4 years ago

I could probably help out with this if you would need some help @paul-buerkner .

paul-buerkner commented 4 years ago

Sure! Thank you! Let's discuss this on Monday.

Måns Magnusson notifications@github.com wrote on Sun, 8 Dec 2019, 13:41:

I could probably help out with this if you would need some help @paul-buerkner https://github.com/paul-buerkner .


avehtari commented 4 years ago

Stratified resampling can be copied from rstan/rstanarm, and I can provide reference code and some documentation text for different resampling methods.

MansMeg commented 4 years ago

Excellent! Can you give a pointer to the rstanarm code? It would be great to have it here in the issue.

avehtari commented 4 years ago

.sample_indices: https://github.com/stan-dev/rstanarm/blob/master/R/stan_glm.fit.R#L1058 and how it is used: https://github.com/stan-dev/rstanarm/blob/master/R/stan_glm.fit.R#L631

MansMeg commented 4 years ago

Thanks!

paul-buerkner commented 4 years ago

Non-unique subsetting is now possible by setting unique = FALSE in subset_draws. For example:

library(posterior)

x <- example_draws()
# extract the first chain twice
subset_draws(x, chain = c(1, 1), unique = FALSE)

This feature should make implementing resample_draws possible @MansMeg
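As a rough illustration of how non-unique subsetting could support resampling, the sketch below draws indices with replacement and passes them to subset_draws. It uses plain multinomial resampling via base R's sample.int for brevity, not the stratified scheme from rstanarm. The weights are made up, and merging the chains of a draws_df first is an assumption made to keep draw indexing simple. This is not the prototype from the resample_draws branch.

```r
library(posterior)

# merge chains so that draws are indexed 1..ndraws(x) in a single chain
x <- merge_chains(as_draws_df(example_draws()))

set.seed(1)
w <- runif(ndraws(x))   # made-up importance weights, for illustration only

# multinomial resampling: indices are drawn with replacement, so they may repeat
idx <- sample.int(ndraws(x), size = ndraws(x), replace = TRUE, prob = w)

# unique = FALSE lets the same draw be selected more than once
resampled <- subset_draws(x, draw = idx, unique = FALSE)
```

The resampled object keeps one row per selected index, so repeated indices show up as repeated draws.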

paul-buerkner commented 4 years ago

@MansMeg I am assigning you to this issue since, as I understand it, you wanted to work on it anyway. Please tell me if you need any help.

paul-buerkner commented 4 years ago

I have added a resample_draws prototype in the resample_draws branch to provide the basis for our discussion in the next few days.

paul-buerkner commented 4 years ago

Supported via PR #65.