Attribute or other way to distinguish MCMC vs MC

stan-dev / posterior

The posterior R package

https://mc-stan.org/posterior/

Attribute or other way to distinguish MCMC vs MC #239

Open avehtari opened 2 years ago

avehtari commented 2 years ago

The posterior package started with a focus on multi-chain MCMC and stores chain and iteration ids. These are useful when computing multi-chain Rhat, ESS, and MCSE. It is also possible to set weights for the draws, which is useful for importance sampling. It would be useful to think about the default behavior of some functions and whether the current draws objects contain sufficient information to do the right thing. For example, if we want to compute MCSE, we have 4 different cases:

MCMC draws (e.g. usual Stan posterior draws, use MCMC-ESS to compute MCSE)
MCMC draws with weights (e.g. Stan posterior draws + IS, e.g. in loo, use MCMC-ESS and IS-ESS)
MC draws (e.g. draws from a Gaussian posterior approximation, use MC-ESS)
MC draws with weights (e.g. draws from a Gaussian posterior approximation + IS, use IS-ESS)
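
As a rough illustration of how the four cases differ, here is a minimal base R sketch of an MCSE for the mean. The function `mcse_mean_sketch()` and its arguments are hypothetical and not part of posterior. It assumes the MCMC-ESS has been computed elsewhere, and it assumes the MCMC and IS parts are combined as (MCMC-ESS / S) * IS-ESS with IS-ESS = 1 / sum(w^2) for normalized weights, which is one possible convention rather than something decided here.

```r
# Hypothetical helper (not part of posterior): MCSE of the mean for the
# four cases, selected by which optional arguments are supplied.
#   weights  = NULL -> unweighted draws (all weights equal)
#   mcmc_ess = NULL -> independent MC draws (relative efficiency 1)
mcse_mean_sketch <- function(x, weights = NULL, mcmc_ess = NULL) {
  S <- length(x)
  if (is.null(weights)) weights <- rep(1, S)
  w <- weights / sum(weights)                # normalized weights
  r_eff <- if (is.null(mcmc_ess)) 1 else mcmc_ess / S
  ess <- r_eff / sum(w^2)                    # assumed combination rule
  m <- sum(w * x)                            # (weighted) mean
  v <- sum(w * (x - m)^2)                    # (weighted) variance
  sqrt(v / ess)
}

x <- c(1, 2, 3, 4)
mcse_mean_sketch(x)                                          # MC draws
mcse_mean_sketch(x, mcmc_ess = 2)                            # MCMC draws
mcse_mean_sketch(x, weights = c(1, 1, 1, 5))                 # MC draws + IS
mcse_mean_sketch(x, weights = c(1, 1, 1, 5), mcmc_ess = 2)   # MCMC draws + IS
```

With equal weights, 1 / sum(w^2) equals S, so the unweighted cases reduce to using S (MC) or the MCMC-ESS (MCMC) directly.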

I guess we could assume that if iteration information is available, then the draws are from MCMC. But at the moment, we don't have support for independent (weighted) MC draws. Would it make sense to set the iteration to 1 for all independent draws? Other ideas for making the difference?

This issue is related to the psis() function in the loo package complaining if the r_eff argument is not set. r_eff is used to pass the previously computed (MCMC-ESS)/S. If we could determine whether the draws are from MCMC or MC, we would not need to complain in the latter case (and could compute r_eff internally in the former case).

paul-buerkner commented 2 years ago

Not all formats have iteration information. In fact, in a way, only draws_df has. The other ones just store iteration implicitly through the ordering of the draws.

Adding an attribute would not be difficult I think. The only question is how to proceed with it when the draws objects are transformed somehow. Attributes in R are timid things that tend to vanish into the dark as soon as one lightly touches the object they belong to.
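
To make the point about vanishing attributes concrete, here is a small base R demonstration on a plain numeric vector. The attribute name `mc_subtype` is only a placeholder for illustration; actual draws objects have their own classes and methods, so their behavior may differ.

```r
# A plain numeric vector tagged with a made-up attribute.
x <- structure(c(0.1, 0.5, 0.9), mc_subtype = "MC")

kept_by_arith  <- attr(x + 1, "mc_subtype")    # "MC": arithmetic copies attributes
lost_by_subset <- attr(x[1:2], "mc_subtype")   # NULL: `[` drops custom attributes
lost_by_concat <- attr(c(x, x), "mc_subtype")  # NULL: c() drops them as well
```

So some base operations carry the tag along while others silently discard it, which is what makes a bare attribute unreliable without custom methods.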

For rvars it might be the easiest(?) to maintain an attribute as all transformations are fully custom there anyway. For the other objects, I am not sure. I didn't want to go through the effort of reimplementing every standard transformation such as +, * etc. for every format to make sure the attributes are kept/altered correctly.

Does anybody have other ideas how to differentiate these types of draws?

mjskay commented 2 years ago

> For rvars it might be the easiest(?) to maintain an attribute as all transformations are fully custom there anyway.

Yeah, the annoying stuff to make this work in rvars has already been figured out to track chain information, so it could certainly be done. It does seem like we wouldn't want this feature to be limited to rvars though.

> For the other objects, I am not sure. I didn't want to go through the effort of reimplementing every standard transformation such as +, * etc. for every format to make sure the attributes are kept/altered correctly.

Yeah I feel that. If this feature is desired that may end up being the only feasible way unfortunately. In the end it may not be that hard, since most of those operations can be implemented using group generics instead of one-by-one, because we aren't changing their fundamental functionality, just passing on to the superclass and then making sure the attribute is maintained on the result.
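
A minimal sketch of the group-generic idea, using a toy S3 class rather than any real draws format. The class name `tagged_draws`, the constructor `new_tagged()`, and the attribute `mc_subtype` are all hypothetical. A real implementation would also need to decide what to do when two operands carry different tags.

```r
# Toy class: a numeric vector carrying a made-up "mc_subtype" attribute.
new_tagged <- function(x, mc_subtype) {
  structure(x, mc_subtype = mc_subtype, class = "tagged_draws")
}

# One Ops method covers +, -, *, /, ==, <, ... : compute on the underlying
# data, then put the tag back on the result.
Ops.tagged_draws <- function(e1, e2) {
  tagged <- if (inherits(e1, "tagged_draws")) e1 else e2
  subtype <- attr(tagged, "mc_subtype")
  f <- get(.Generic, mode = "function")
  out <- if (missing(e2)) f(unclass(e1)) else f(unclass(e1), unclass(e2))
  structure(out, mc_subtype = subtype, class = "tagged_draws")
}

# Subsetting is not in a group generic, so it needs its own method.
`[.tagged_draws` <- function(x, ...) {
  out <- NextMethod()
  structure(out, mc_subtype = attr(x, "mc_subtype"), class = "tagged_draws")
}

x <- new_tagged(c(1, 2, 3), "MCMC")
y <- x * 2 + 1
z <- x[1:2]
```

Here both `y` and `z` still carry the tag, with one method handling all arithmetic and comparison operators.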

The only other mechanism I can think of is a "special" variable like the one used to store weights. Seems wasteful though since presumably it would always hold the same value for every draw.

avehtari commented 1 year ago

Now that cmdstanr is getting a laplace() method to get draws from the normal approximation, it would be great to tag those draws as not being from MCMC, and by default not to show Rhat, ESS-Bulk and ESS-Tail in summarize_draws().
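
A sketch of how such a default could be chosen once a tag is available. The helper `summary_measures_for()` and the tag values are hypothetical. The measure names are written out to mirror what posterior's default_summary_measures() and default_convergence_measures() return, which is an assumption worth checking against the installed version.

```r
# Hypothetical helper: choose default summary measures from a subtype tag.
summary_measures_for <- function(mc_subtype = c("MCMC", "MC")) {
  mc_subtype <- match.arg(mc_subtype)
  # Assumed to match posterior's current defaults.
  base_measures <- c("mean", "median", "sd", "mad", "quantile2")
  convergence_measures <- c("rhat", "ess_bulk", "ess_tail")
  if (mc_subtype == "MCMC") {
    c(base_measures, convergence_measures)
  } else {
    base_measures   # independent draws: skip the MCMC diagnostics
  }
}

mcmc_measures <- summary_measures_for("MCMC")
mc_measures   <- summary_measures_for("MC")
```

The resulting character vector could then be handed to summarise_draws() as the set of summary functions, assuming the tag can be read off the draws object.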

mjskay commented 1 year ago

In order to implement this, it might make sense to create some systematic infrastructure for resolving subtype conflicts amongst MC / MCMC / weighted MC / weighted MCMC. This would probably have to include a way for people to do coercion manually if needed, particularly if we decide some of the subtype combinations result in an error that has to be resolved by the user.

It could be helpful to fill out a table like this:

| x | y | op | result |
| --- | --- | --- | --- |
| any | same as `x` | any? | same as `x` |
| MCMC | MC | `+`, `-`, `*`, `/`, ... | MC? MCMC? error? |
| MCMC | MC | bind draws | MC? MCMC? error? |
| MCMC | weighted MCMC | `+`, `-`, `*`, `/`, ... | resample `y` to MCMC, then MC or MCMC? error? |
| ... | ... | ... | ... |

Something like a mc_subtype attribute on objects and a resolve_mc_subtype(x, y, op_type) internal method? Not sure what user-facing methods would be needed as well. Coercion from weighted to non-weighted is already handled by resample_draws(), so we might only need a new user-facing method if we decide combining MC and MCMC draws is an error so that people would have to do a coercion first (though I don't really think combining those two should be an error, since it would make the API pretty clunky in a lot of places).

avehtari commented 1 year ago

I'd be fine with explicit coercion, but if it's automatic, then a couple of examples are:

MCMC draws of the parameters are used to compute the parameters of a predictive distribution, and sampling from that could be written, in the normal case, as mu + sigma*r, where mu and sigma are MCMC type but r is independent draws. Thus the natural combination is MCMC.
Comparison of MCMC draws and MC draws for analysing how much difference there is in the inferences. Again, the auto-correlation from the MCMC draws will stay there.

Binding MCMC and MC seems less likely, with binding as independent chains maybe a bit more likely. I would coerce to MCMC, and the MC draws would lose the information that they were independent also over iterations.
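
The "coerce to MCMC" rule could be sketched as follows, reusing the mc_subtype / resolve_mc_subtype(x, y, op_type) names floated earlier in the thread. None of this exists in posterior. Treating untagged objects as MCMC and raising an error for mixed weighted combinations are assumptions made only for this sketch, since the weighted cases are still open.

```r
# Hypothetical resolver for the subtype of a combined result.
# Subtypes used here: "MC", "MCMC", "weighted MC", "weighted MCMC".
resolve_mc_subtype <- function(x, y, op_type = c("math", "bind")) {
  op_type <- match.arg(op_type)  # same rule for both op types in this sketch
  subtype_of <- function(obj) {
    s <- attr(obj, "mc_subtype")
    if (is.null(s)) "MCMC" else s        # assumption: untagged means MCMC
  }
  sx <- subtype_of(x)
  sy <- subtype_of(y)
  if (identical(sx, sy)) return(sx)      # same subtype: nothing to resolve
  if (all(c(sx, sy) %in% c("MC", "MCMC"))) {
    return("MCMC")                       # autocorrelation carries over
  }
  # Mixed weighted / unweighted: ask for explicit coercion instead of guessing.
  stop("cannot combine '", sx, "' and '", sy,
       "' automatically; coerce explicitly first (e.g. by resampling)")
}

mu <- structure(c(0.2, 0.4, 0.1), mc_subtype = "MCMC")
r  <- structure(c(-1.0, 0.3, 0.8), mc_subtype = "MC")
combined <- resolve_mc_subtype(mu, r, "math")
```

In the mu + sigma*r example, `combined` comes out as "MCMC", and the information that r was independent over iterations is dropped.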

Resampling weighted draws with some default is a non-trivial choice. I'm not sure what the use case would be for combining non-weighted and weighted draws. For non-weighted MCMC we could assume the weights are equal. We could also consider the case of two weighted MCMC or weighted MC variables with different weights, which would also make the generic math operations complicated. For variables with equal weights, it makes no difference whether we do the arithmetic first and then the resampling or vice versa, except that the diagnostics can be better if we do the arithmetic first and keep the weights.

n-kall commented 7 months ago

As @mjskay mentioned in #331, this attribute could be added to rvars. If rvars are then passed to summary functions in summarise_draws(), as discussed regarding weight support in #184, then summary functions could use this info too.

I think this would mostly affect mcse_* and ess_* functions as @avehtari mentions in the original post. I'm already adjusting them to handle weights, so it wouldn't be too much more to add MC vs MCMC.