Rather than changing the API (which at this point would be quite devastating), what if we just assume that our "CI" stands for (un)Certainty Interval 😬
I would consider CI as quite generic when it comes to model summaries (see also https://www.bmj.com/content/366/bmj.l5381 - you could call it a credible interval or a compatibility interval; the latter works well for both Bayesian and frequentist models). For `model_parameters()`, you get a message saying which type of interval is computed (HDI, ETI/CI). And in general, the `ci_method` is stored as an attribute when using `describe_posterior()` or `model_parameters()`.
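For example, a quick sketch with bayestestR (the exact attribute name is an assumption here, and may differ across versions):

``` r
library(bayestestR)

# toy posterior draws
post <- distribution_normal(1000)

# choose the interval type via `ci_method`
dp <- describe_posterior(post, ci = 0.95, ci_method = "hdi")

# the chosen method travels along with the result as an attribute
attributes(dp)$ci_method
```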
It's a bit different when we are talking about intervals for predictions. Here we could indeed use CI or PI (and @mattansb suggested this somewhere, calling the prediction interval columns `PI`). That would be possible and would not require rewriting or modifying lots of package code.
Bayes CIs = values the estimate probably is. Freq CIs = values the estimate is not probably not (😒)
But don't PIs have a similar interpretation for Freq and Bayes? I haven't dug much into this...
Anyway, I think distinguishing at least between estimation (CI) and population (PI) intervals is needed. (I thought it already was!) - shifting over to a unified term across both seems too opaque for my taste.
> Bayes CIs = values the estimate probably is. Freq CIs = values the estimate is not probably not (😒)

The interpretation in terms of probability is of course not the same (and not valid for frequentist intervals). But using "compatibility interval" is valid in both frameworks, I'd say.
(Un)Certainty/compatibility is a reasonable characterization across frameworks (the major distinction being the source of assurance in those statements).
For PI, the question I guess is then, which is better, consistent column names or precision in labeling? Given that test statistic columns are also variously named, labeling with PI might be better?
The problem I see is that, since CI columns are often used for further formatting or plotting, it really helps to have a consistent naming scheme. I'd probably prefer to have everything named `CI_low` and `CI_high` (as columns), with `ci` and `ci_method` as the arguments for the width / method, and then do what we do in parameters, i.e., add information in the table header / footer about what the table contains (like "The Certainty Intervals (CI) correspond here to Prediction Intervals"). I think that'd be the sleekest way of having a consistent API to work with and, at the same time, a transparent / accurate output.
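Something like this rough mock-up (hypothetical column values and footer text, just to illustrate the idea):

``` r
# predictions table keeps the generic CI_low / CI_high names ...
preds <- data.frame(
  Predicted = c(21.3, 18.7),
  CI_low    = c(15.2, 12.9),
  CI_high   = c(27.4, 24.5)
)

# ... and the interval type is spelled out in the printed footer instead
attr(preds, "table_footer") <- "CIs are prediction intervals (expected range of observations)."
preds
```

Downstream formatting and plotting code could then always rely on `CI_low` / `CI_high`, whatever the interval actually is.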
> The problem I see is that, since CI columns are often used for further formatting or plotting, it really helps to have a consistent naming scheme.

This sounds like a deviation from our API, where (unlike `broom`) column names are informative and verbose. I think that if we found a way to accommodate `CI_low_Eta2_partial`, we can find a way to work with `PI_low`.
But aren't prediction intervals "confidence"/certainty intervals around predictions? I.e., it's more the "of what" that changes rather than the "what".
CIs are uncertainty about some estimate - not always a "prediction". PIs represent the expected range of values in the population.
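In base R terms, just to illustrate the distinction (numbers depend on the data):

``` r
fit <- lm(mpg ~ wt, data = mtcars)
new <- data.frame(wt = 3)

# interval for the conditional mean (uncertainty about the estimate)
predict(fit, newdata = new, interval = "confidence", level = 0.95)

# interval for a new observation (expected range of individual values)
predict(fit, newdata = new, interval = "prediction", level = 0.95)
```

The prediction interval is always the wider one, because it adds the residual variation on top of the uncertainty about the mean.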
But even if this were the case, we do give different coefficients and effect sizes different names - but they all represent "some model parameter" or "some effect size". I think we should be consistent with our level of verbosity.
Also, you can have both CIs and PIs in the same results table... what then?
Better change this now, while it is still easy...
`PI_low` looks bizarre, and it feels like a strange exception in the easyverse for something that behaves and is used (practically) very similarly to other intervals (whether to format in text or to plot with ribbons). I don't see an obvious benefit for users in having PI instead of CI (given that we document the fact that, for prediction intervals, the CI stands for the expected range of values in the population). I mean, I don't see how using CI will confuse people any more than they already are ^^... and, as Daniel said, if we start making an exception for that, why not then be more specific and use HDI, ETI, etc.?
I think we need to prioritize the sweet spot between intuitiveness/consistency and statistical specificity/accuracy
ETI, HDI, etc are methods of estimating CIs - whereas PIs are something else.
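For example, with bayestestR (a small sketch), ETI and HDI are just two ways of turning the same posterior into an interval:

``` r
library(bayestestR)

post <- distribution_normal(1000, mean = 0, sd = 1)

eti(post, ci = 0.95)  # equal-tailed interval
hdi(post, ci = 0.95)  # highest density interval
```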
If they should be labelled the same because they are intervals, then wouldn't also the `range` columns need to be some sort of `CI_*` instead of min/max? No - because different intervals have different meanings.
``` r
(dist <- parameters::describe_distribution(mtcars))
#> Variable | Mean | SD | IQR | Range | Skewness | Kurtosis | n | n_Missing
#> --------------------------------------------------------------------------------------------
#> mpg | 20.09 | 6.03 | 7.53 | [10.40, 33.90] | 0.67 | -0.02 | 32 | 0
#> cyl | 6.19 | 1.79 | 4.00 | [4.00, 8.00] | -0.19 | -1.76 | 32 | 0
#> disp | 230.72 | 123.94 | 221.53 | [71.10, 472.00] | 0.42 | -1.07 | 32 | 0
#> hp | 146.69 | 68.56 | 84.50 | [52.00, 335.00] | 0.80 | 0.28 | 32 | 0
#> drat | 3.60 | 0.53 | 0.84 | [2.76, 4.93] | 0.29 | -0.45 | 32 | 0
#> wt | 3.22 | 0.98 | 1.19 | [1.51, 5.42] | 0.47 | 0.42 | 32 | 0
#> qsec | 17.85 | 1.79 | 2.02 | [14.50, 22.90] | 0.41 | 0.86 | 32 | 0
#> vs | 0.44 | 0.50 | 1.00 | [0.00, 1.00] | 0.26 | -2.06 | 32 | 0
#> am | 0.41 | 0.50 | 1.00 | [0.00, 1.00] | 0.40 | -1.97 | 32 | 0
#> gear | 3.69 | 0.74 | 1.00 | [3.00, 5.00] | 0.58 | -0.90 | 32 | 0
#> carb | 2.81 | 1.62 | 2.00 | [1.00, 8.00] | 1.16 | 2.02 | 32 | 0
colnames(dist)
#> [1] "Variable" "Mean" "SD" "IQR" "Min" "Max"
#> [7] "Skewness" "Kurtosis" "n" "n_Missing"
Created on 2021-04-08 by the reprex package (v1.0.0)
> and is used (practically) very similarly to other intervals

... Well, they shouldn't be, and labelling PIs as CIs will only exacerbate this further :/
At the very least, we should make sure that in printing they appear as `95% PI`.
I've read through the long previous thread on the `predict` argument API. I can see that the argument about Bayesian prediction intervals referring to a different posterior (the posterior on the prediction, not on the mean) is important. In fact, that point applies equally well to frequentist models, and more clearly emphasizing what the uncertainty is characterizing might resolve the CI/PI labeling issue.
What I'm thinking: when `predict = "relation"`, what is being summarized is the posterior or confidence distribution on the conditional mean. When `predict = "prediction"`, what is being summarized is the posterior or confidence distribution on individual predicted scores.
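In rstanarm terms, for example (a sketch; the same distinction exists in brms), those are two different posteriors from the same model:

``` r
library(rstanarm)

fit <- stan_glm(mpg ~ wt, data = mtcars, refresh = 0)
new <- data.frame(wt = 3)

# posterior of the conditional mean (the "expectation")
epred <- posterior_epred(fit, newdata = new)

# posterior of individual predicted scores (adds observation-level noise)
pred <- posterior_predict(fit, newdata = new)

# summarizing each (e.g. with an ETI) gives the two kinds of intervals
bayestestR::eti(as.numeric(epred))
bayestestR::eti(as.numeric(pred))
```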
The "CI" label might be appropriate for everything if the title more clearly specifies "Conditional mean" or "Predicted scores" or similar. Maybe a label other than "relation"
for the conditional mean+its uncertainty? I admit I find that label somewhat unintuitive.
(I still find `type` and `interval` quite useful, because intuitive arguments and options are more difficult to find for even more complicated models that involve random effects, zero-inflation, etc...)
The "CI" label might be appropriate for everything if the title more clearly specifies "Conditional mean" or "Predicted scores" or similar
That's feasible
> Maybe a label other than "relation" for the conditional mean + its uncertainty? I admit I find that label somewhat unintuitive.

It was chosen based on the assumed end goals of the process: to assess / visualize the "relationships" between predictors and outcome. After scratching my head a lot last time for an alternative, I think this one is not the worst, but it's true that it's fairly new (there aren't really instances of such term usage in base or extended R) and might sound odd at first... That said, I really don't know what a better alternative could be (I feel like "conditional_mean" or something more statistically accurate would be more confusing than a simple term with a good and accurate description in the docs? But then again, these are my general priors on software dev, so idk 🤷♂️)
(I feel like "conditional_mean" or something more statically accurate would be more confusing than a simple term with a good and accurate description in the docs? But these again are my general priors on software dev so idk 🤷♂️)
I guess my thinking is that there is a lot of confusion about what, say, OLS is actually modeling (the mean of a distribution), and there might be value in putting that more clearly in users' faces. Maybe even something like "expectation" or "center"? I think this becomes even more relevant if we think about something like a zero-inflation model, where "conditional", "zprob", etc. could also be added as options (cf. `?glmmTMB::predict.glmmTMB`).
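For reference, a sketch with glmmTMB's built-in Salamanders data (see `?glmmTMB::predict.glmmTMB` for the full set of `type` options):

``` r
library(glmmTMB)

# zero-inflated Poisson model
zim <- glmmTMB(count ~ mined + (1 | site), ziformula = ~ mined,
               family = poisson, data = Salamanders)

predict(zim, type = "response")     # overall expected count
predict(zim, type = "conditional")  # mean of the count component only
predict(zim, type = "zprob")        # probability of a structural zero
```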
Thinking of the `predict` argument that way might also point to a clearer way to resolve cases like `predict = "prediction"` for binomial models: that case should be the same as `predict = "relation"`/`"expectation"`, because they are the same for that family.
"expectation" is nice
Throughout the packages, the label "CI" is used for a bunch of kinds of intervals: confidence intervals, credible intervals, frequentist prediction intervals, posterior prediction intervals. It's not clearly indicated what type of interval is given in the output. I like that the column names are consistent, but I wonder if a generic interval name other than CI would be better? (cf. in `broom::augment()`, `.lower` and `.upper` are used).