stan-dev / posterior

The posterior R package
https://mc-stan.org/posterior/

Suggestion: Add alternative uncertainty notation for rvar #299

Open StaffanBetner opened 1 year ago

StaffanBetner commented 1 year ago

While the current rvar notation of mean ± sd is straightforward, I think it might be beneficial to offer alternative notations. I've found the mean(sd) style to be quite popular in some contexts, e.g. in physics and chemistry.

To give a bit of context, the R package errors offers similar functionality to rvar, but it is based only on a mean and a standard deviation:

> library(posterior)
> library(errors)
> set.seed(19725654)
> rnorm(100, mean = 5, sd=10) -> x
> mean(x)
[1] 5.892738
> sd(x)
[1] 11.32585
> errors::set_errors(mean(x), sd(x))
6(10)
> posterior::rvar(x)
rvar<100>[1] mean ± sd:
[1] 5.9 ± 11 

Considering this, would it be possible to introduce an R option, say rvar.uncertainty.notation? This could let users choose between the current plus-minus style and the alternative parenthesis notation.
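To make the request concrete, here is a rough base-R sketch of the two styles. The helper `format_uncertainty()` and its `notation` values are made up for illustration; nothing here is an existing posterior or errors function, and it assumes two significant digits to match the rvar output above.

```r
# Hypothetical helper for illustration only; not part of posterior or errors.
format_uncertainty <- function(m, s, notation = c("plusminus", "parens"), digits = 2) {
  notation <- match.arg(notation)
  # Format mean and sd to `digits` significant digits, without padding.
  m_chr <- formatC(m, digits = digits, format = "g")
  s_chr <- formatC(s, digits = digits, format = "g")
  switch(notation,
    plusminus = paste0(m_chr, " \u00b1 ", s_chr),  # current rvar style
    parens    = paste0(m_chr, "(", s_chr, ")")     # proposed alternative
  )
}

format_uncertainty(5.892738, 11.32585)
#> [1] "5.9 ± 11"
format_uncertainty(5.892738, 11.32585, notation = "parens")
#> [1] "5.9(11)"
```

Note that this just pastes the two numbers together. It does not reproduce the rounding that errors applies (which gives 6(10) above), nor the stricter convention in which the parenthesised digits refer to the last digits of the value.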

Thanks for considering!

mjskay commented 1 year ago

Doesn't seem unreasonable. We already change the output notation to show (scaled) entropy for factors and dissent for ordered factors; something like this:

> posterior::rvar(letters)
rvar_factor<26>[1] mode <entropy>:
[1] a <1> 
26 levels: a b c d e f g h i j k l m n o p q r s t u v w x y z

Since we already have a summary option to print.rvar() that determines the summary functions used, with default determined by getOption("posterior.rvar_summary"), I'd propose a notation option to print.rvar() with default determined by getOption("posterior.rvar_notation"), with three possible values:

We could also allow more custom formatting, e.g. by allowing notation to take a function that takes a character vector (or maybe numeric vector) of measures of uncertainty and returns a character vector of the same length, formatted as desired. Not sure if this level of customization is really necessary, because at that point you might as well just start formatting things manually by calling the summary functions directly.
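A minimal sketch of how that could fit together, in base R only. Everything here is an assumption: `resolve_notation()` is a made-up helper, `posterior.rvar_notation` is only the option name proposed above (posterior does not read it today), and the string values `"plusminus"` and `"parens"` are placeholders rather than the actual proposed values.

```r
# Hypothetical sketch; `posterior.rvar_notation` is the option proposed in this
# thread, and the value names below are placeholders.
resolve_notation <- function(notation = getOption("posterior.rvar_notation", "plusminus")) {
  # A user-supplied function is used as-is: it takes a character vector of
  # formatted uncertainties and returns a character vector of the same length.
  if (is.function(notation)) return(notation)
  switch(notation,
    plusminus = function(u) paste0(" \u00b1 ", u),
    parens    = function(u) paste0("(", u, ")"),
    stop("unknown notation: ", notation)
  )
}

fmt <- resolve_notation("parens")
paste0("5.9", fmt("11"))
#> [1] "5.9(11)"

custom <- resolve_notation(function(u) paste0(" [sd ", u, "]"))
paste0("5.9", custom("11"))
#> [1] "5.9 [sd 11]"
```

Resolving strings to functions up front means the printing code only ever deals with one kind of thing, so the custom-function case costs almost nothing extra to support.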