Open mjskay opened 3 years ago
I agree. At least for brms, this is the plan: https://github.com/paul-buerkner/brms/issues/1204
Yeah I think this is a good plan.
Awesome! I'll close this and open issues on rstanarm and cmdstanr. as_draws_XXX()
appears to already work on rstan::stanfit
from my quick testing, as stanfit objects appear to mimic arrays so it gets routed appropriately through as_draws_array()
I think there may be some more general things to discuss so I am reopening this isuse. I hope that is ok.
1) Shall we add additional arguments to as_draws_XXX.somefit
to mimic existing applications people are used to from as.array
and friends. Specifically:
- Shall we add a `variable` argument to preselect variables to be extract instead of extracing all, effectively mimicing the `pars` argument of `as.array.stanfit`?
- How to handle warmup samples? Shall we add an `inc_warmup` (or similarily named) argument to `as_draws_XXX.somefit`?
- What should the default format via `as_draws` be for brms and rstanarm models be? The apparent behavior of `stanfit` suggest we should use `draws_array`. Is that a choice we like?
2) What shall we do with the existing as.array.somefit
(and friends) of brms or rstanarm for example? Shall we keep them as is? Or shall we deprecate them and say that they will eventually become aliases of the corresponding as_draws_xxx
methods?
3) What shall niterations
and ndraws
count? I assume the post-warmup iterations / draws, that is, ignoring warmup draws?
4) Shall we add some generics for nwarmup
and nthin
(or nthinning
?) to posterior?
- Shall we add additional arguments to
as_draws_XXX.somefit
to mimic existing applications people are used to fromas.array
and friends. Specifically:
- Shall we add a
variable
argument to preselect variables to be extract instead of extracing all, effectively mimicing thepars
argument ofas.array.stanfit
?- How to handle warmup samples? Shall we add an
inc_warmup
(or similarily named) argument toas_draws_XXX.somefit
?
These both sound like good ideas. Are you suggesting they should be in the generic or just the specific implementations? There's some benefit to standardization enforced by putting them in the generic but also they wouldn't apply in all cases.
- What should the default format via
as_draws
be for brms and rstanarm models be? The apparent behavior ofstanfit
suggest we should usedraws_array
. Is that a choice we like?
I can only say from a user perspective for those packages, but my suggestion would be whatever format is fastest to convert the internal format to. I think the stanfit
behavior might just be an accident of the fact that is.array()
on a stanfit returns TRUE
so as.array()
gets call inside posterior to do the conversion?
- What shall we do with the existing
as.array.somefit
(and friends) of brms or rstanarm for example? Shall we keep them as is? Or shall we deprecate them and say that they will eventually become aliases of the correspondingas_draws_xxx
methods?
Again speaking as a user, I think as.array()
is nice since it gives interoperability with packages that know how to do things with arrays via standard coercion functions. So I wouldn't remove it, but maybe change examples to use/advertise as_draws_XXX instead?
- What shall
niterations
andndraws
count? I assume the post-warmup iterations / draws, that is, ignoring warmup draws?
Do you mean niterations / ndraws applied to fit objects? Seems reasonable.
- Shall we add some generics for
nwarmup
andnthin
(ornthinning
?) to posterior?
Again I'm assuming you mean for fit objects... would this be metadata that gets included with a draws object that we keep around, or functions that are only applied to fit objects? I suppose we could keep track of this metadata if it were valuable to do so, though I'm not sure its worth it.
Thank you for your feedback.
These both sound like good ideas. Are you suggesting they should be in the generic or just the specific implementations? There's some benefit to standardization enforced by putting them in the generic but also they wouldn't apply in all cases.
I would not want to add these arguments to the generics themselves, but perhaps add to their docs that we would recommend arguments being named in a certain way, if a method wants to support them.
I can only say from a user perspective for those packages, but my suggestion would be whatever format is fastest to convert the internal format to. I think the stanfit behavior might just be an accident of the fact that is.array() on a stanfit returns TRUE so as.array() gets call inside posterior to do the conversion?
Then draws_list
it is I think since rstan stores draws in this list like format. In fact, the draws_list
format was added keeping specifically rstan in mind.
Again speaking as a user, I think as.array() is nice since it gives interoperability with packages that know how to do things with arrays via standard coercion functions. So I wouldn't remove it, but maybe change examples to use/advertise as_draws_XXX instead?
I agree. I will leave as.array.brmsfit
as is from the user side but let posterior handle the backend. The returned object will still be an array
(not a draws_array
) as to work consistently with methods expecting the default dropping behavior of arrays.
Do you mean niterations / ndraws applied to fit objects? Seems reasonable.
Yes, exactly.
Again I'm assuming you mean for fit objects... would this be metadata that gets included with a draws object that we keep around, or functions that are only applied to fit objects? I suppose we could keep track of this metadata if it were valuable to do so, though I'm not sure its worth it.
I simply mean generics to be usable on fit objects, otherwise returning 0 warmup and 1 thinning. The goal here would only be to align the function names, not to support warmup draws in the posterior draws formats.
While working on supporting posterior in brms, I stumbled upon another question which warrants some discussion I think. In brms, I allow to extract subsets of draws (via a badely named argument called subset
). Of course, we can just chain one as_draws
and then an subset_draws
but I was wondering if it could be sensible to allow the subsetting arguments such as draw
, iteration
and chain
already in as_draws.brmsfit
similar to allowing argument variable
as mentioned above?
For brms, the advantage would be that as.data.frame
and friends can work with one single call. Since they remove the posterior classes for consistent behavior, doing a subset_draws
after extraction won't work. So we either have to do it in a single call, build some other workaround, or simply require staying in the posterior-verse of functions and objects. The latter may not be ideal since even brms itself requires (for the time being at least) usual data.frames etc. and cannot stay in the posterior-verse consistently yet.
Just commenting here to mention that we now (finally, sorry for the delay!) have as_draws()
support for CmdStanR models:
https://github.com/stan-dev/cmdstanr/pull/581
Because as_draws(...)
is just a wrapper around fit$draws(...)
I didn't explicitly add any other arguments to as_draws()
because ...
can be used to pass any supported arguments (e.g. inc_warmup
) to fit$draws()
. I think this makes sense for CmdStanR, but I could also add all the supported arguments to the signature of as_draws()
method if everyone else feels strongly about that.
Just a note that as_draws
and friends are supported in brms since some time now.
It seems like the other packages now have as_draws, so should this issue have a new title to describe the remaining tasks?
I guess so. From my side, we can also close this issue as I think we have made enough progress in the relevant downstream packages(?) @jgabry how far is rstanarm with the posterior incorporation?
I added as_draws() methods to rstanarm but that’s about it. There are other places in rstanarm where posterior could be used, but nothing urgent I think, it would just be internal stuff unless I’m forgetting something.
I agree. I would vote to close this issue. Any objections?
(Note: I'm opening this here for discussion first rather than opening a bunch of essentially duplicate issues in several packages)
Now that posterior is on CRAN, would it make sense to have
as_draws()
/as_draws_XXX()
implementations for objects in brms, rstanarm, rstan, cmdstanr, etc? I think it would help fulfill part of the promise of {posterior}; that is, users don't have to remember what particular function for a given model type gets them draws in what format, they can just chuck the object intoas_draws_XXX()
for whatever format they want.E.g. with cmdstanr, rather than doing this:
People could (optionally) do:
The advantage of the latter code is that as more packages also implement these generics it becomes easier for users to move between packages. @paul-buerkner @jgabry thoughts? I dunno if this was already the plan anyway.