Rethink point estimate interface

jeffreypullin commented 4 years ago

We need to decide on the optimal interface for extracting point estimates of the three Dawid-Skene parameters.

There are two things to consider here:

What names should we use for the parameters? Principally, should we favor mathematical names, i.e. pi, theta, z, or names based on the interpretation of the parameters, i.e. prevalence probabilities, error matrices, latent class?

What should the function we use to extract the point estimates be called? Some options:

extract_{parameter name}: This is currently implemented.
- Pros: {rstan} uses `extract(fit, pars = {parameter names})`, and consistency with this interface was the original reason for this choice. It should be noted that the {rstan} extract function extracts posterior draws, not point estimates.
- Cons: Not 'symmetric' with plot(fit, par = {parameter name}). Doesn't describe what we are extracting, i.e. no mention of posterior, point estimate etc. Inconsistent with {rstan} behavior re draws vs point estimates.
coef(): This is what is used in {brms} and {rstanarm}, the two major high level Stan interfaces.
- Pros: Consistency with {brms} and {rstanarm}
- Cons: Both {brms} and {rstanarm} fit Bayesian GLM extensions, multilevel models etc, so the models always contain coefficients in the traditional regression sense. None of the models implemented in {rater} can be easily interpreted as coefficients.
A new function: point_estimates() or similar.
- Pros: Descriptive name, no potential for confusion with other interfaces.
- Cons: How should the function handle fits using optimization vs MCMC? Both methods give point estimates, but of different types. Maybe add an argument, type, which would be "default" by default and return the MAP or the mean depending on which was sensible, but which could also be specified explicitly. The function name is also more verbose, but that may not matter if/when we switch to the {rstantools} generics more generally.
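To make the type idea concrete, here is a minimal base R sketch. Everything in it is an assumption for illustration: the function name toy_point_estimates(), the method/draws/map fields of the toy fit objects, and the "mean"/"mode" labels are hypothetical and are not the {rater} API.

```r
# Hypothetical sketch: the function name and the fit structure are
# illustrative assumptions, not the {rater} API.
toy_point_estimates <- function(fit, type = c("default", "mean", "mode")) {
  type <- match.arg(type)
  if (type == "default") {
    # Pick whichever estimate the fitting method naturally provides.
    type <- if (fit$method == "mcmc") "mean" else "mode"
  }
  if (type == "mean") {
    if (fit$method != "mcmc") {
      stop("A posterior mean requires MCMC draws.", call. = FALSE)
    }
    # Draws are stored as a (draws x K) matrix; average over the draws.
    return(colMeans(fit$draws))
  }
  if (fit$method != "optim") {
    stop("A posterior mode (MAP) requires an optimisation fit.", call. = FALSE)
  }
  fit$map
}

# Toy fits standing in for real model fits.
mcmc_fit <- list(method = "mcmc", draws = rbind(c(0.2, 0.8), c(0.4, 0.6)))
optim_fit <- list(method = "optim", map = c(0.25, 0.75))

toy_point_estimates(mcmc_fit)   # posterior mean of each component
toy_point_estimates(optim_fit)  # MAP estimate
```

In this sketch, asking for an estimate the fit cannot provide (e.g. type = "mean" on an optimisation fit) is an error rather than a silent fallback, which is one possible answer to the question above.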

cc @dvukcevic Thoughts?

dvukcevic commented 4 years ago

Nice summary, @jeffreypullin!

I am leaning towards the third option. Here are my thoughts on each:

Since we would like to eventually provide an interface for easy extraction of the posterior draws, I suggest we reserve extract() for that purpose, for consistency with RStan.
I agree that coef() is not the best choice here. I would only go with it if we couldn't think of a better alternative.
I like the idea of point_estimates(), especially the fact that it refers to a concept that works for both MCMC and optimisation mode.
- I would have thought there's no confusion about which point estimate it should return: for optimisation runs it can only return the MAP and for MCMC runs the only easy choice is the mean, right?
- Can we use a shorter name than point_estimates()? What about just estimates()? Is that too vague?
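If the estimate really is determined by the fitting method, one way to express that is S3 dispatch on the class of the fit, with no extra argument. The sketch below is only an illustration in base R: the generic toy_estimate(), the classes toy_mcmc_fit and toy_optim_fit, and their fields are hypothetical names, not anything defined by {rater}.

```r
# Hypothetical S3 sketch: generic, class names and fields are assumptions.
toy_estimate <- function(fit, ...) UseMethod("toy_estimate")

# MCMC fits: the posterior mean, computed column-wise over the draws.
toy_estimate.toy_mcmc_fit <- function(fit, ...) {
  colMeans(fit$draws)
}

# Optimisation fits: the MAP estimate is the only one available.
toy_estimate.toy_optim_fit <- function(fit, ...) {
  fit$map
}

mcmc_fit <- structure(
  list(draws = rbind(c(0.1, 0.9), c(0.3, 0.7))),
  class = "toy_mcmc_fit"
)
optim_fit <- structure(list(map = c(0.25, 0.75)), class = "toy_optim_fit")

toy_estimate(mcmc_fit)   # posterior mean
toy_estimate(optim_fit)  # MAP
```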

jeffreypullin commented 4 years ago

I'm also leaning towards point_estimates. Some more thoughts re the interface:

I (at least currently) think estimates() alone is too vague.

At this point I think it's worth taking a step back and considering how point_estimates() would fit into the rest of the interface. My current plan is to (eventually) implement most of the generics from {rstantools} which are listed here. For 0.2 I hope to implement:

log_lik() (maybe)
posterior_interval()
posterior_predict() (maybe)
prior_summary()

A noticeable omission from the generics on that page is a generic to return point estimates as we are discussing. I think this is because {brms} and {rstanarm} both use coef() for this purpose. In an ideal world it would be nice to tie into the posterior_* theme, but I'm not very happy with any of my ideas:

posterior_point_estimate(): too long!
posterior_mode() / posterior_mean(): descriptive but only apply to one of the fitting methods. (I guess this does at least force the user to be aware of the interpretation of the point estimate)
posterior_centre(): possible but doesn't sound quite right to me.

Actually I think point_estimates() might be the best idea after all!

dvukcevic commented 4 years ago

I agree!

One remaining question: Should it be plural (point_estimates()) or singular (point_estimate())?

Other similar functions seem to generally be singular ('interval', 'coef') so I'm inclined to follow suit. You can think of the returned value being a single point estimate of a multi-dimensional quantity rather than a large number of separate estimates, so it makes sense to use singular in that light.

jeffreypullin commented 4 years ago

Good point. point_estimate() it is! I'll implement it later today/tomorrow.

jeffreypullin commented 4 years ago

Or maybe today....

jeffreypullin commented 4 years ago

I think I should stop promising specific deadlines... Anyway, a draft PR is up now.

One question I have is what point_estimate(fit, pars = "pi") should return. Should it be:

1. A vector of length K
2. A list of length 1 holding a vector of length K

When we have something like point_estimate(fit, pars = c("pi", "theta")), we are forced to return a list because the parameter shapes are complex. Returning a list in the length-1 case (i.e. option 2) would be better for consistency but requires more work from the user.

dvukcevic commented 4 years ago

I think option 2 (a list of length 1) is best. I can foresee many potential bugs otherwise!

We could potentially offer an argument such as unlist or drop as a convenience feature, but it's probably not worth the effort. A user who cares can simply do unlist(point_estimate(fit, "pi")) on their own.
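For illustration, here is a small base R sketch of the "always return a list" behaviour. The function toy_point_estimate(), the estimates field, and the J x K x K shape chosen for theta are assumptions made for the example; they are not taken from the {rater} implementation.

```r
# Hypothetical sketch: function name, fit structure and shapes are assumptions.
toy_point_estimate <- function(fit, pars = c("pi", "theta")) {
  pars <- match.arg(pars, several.ok = TRUE)
  # Single-bracket subsetting always returns a list, even for one parameter.
  fit$estimates[pars]
}

K <- 3
fit <- list(estimates = list(
  pi = c(0.5, 0.3, 0.2),                  # vector of length K
  theta = array(1 / K, dim = c(2, K, K))  # assumed: 2 raters, each K x K
))

one <- toy_point_estimate(fit, pars = "pi")
str(one)                        # a list of length 1
one$pi                          # the vector of length K
unlist(one, use.names = FALSE)  # drop the list wrapper by hand

both <- toy_point_estimate(fit, pars = c("pi", "theta"))
names(both)
```

The return type is the same whether one or several parameters are requested, so code written against the multi-parameter case keeps working in the single-parameter case.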

jeffreypullin / rater

Rethink point estimate interface #50