insightsengineering / formatters

A framework for creating listings of raw data that include specialized formatting, headers, footers, referential footnotes, and pagination.
https://insightsengineering.github.io/formatters/

format for mean with standard deviation #252

Closed pzhang-cims closed 9 months ago

pzhang-cims commented 9 months ago

Summary

Hi there, I would like to check whether there is a feature for formatting the mean and standard deviation so that the SD has one more decimal digit than the mean. For example, given Mean (SD) = 32.487 (40.5555), the format xx.x (xx.xx) would give 32.5 (40.56), and xx.xx (xx.xxx) would give 32.49 (40.556). Sorry if I missed anything here. If no such format is planned, is there a customization function that we can apply?

Thanks,

Peng

Melkiades commented 9 months ago

Hi Peng! At the moment, we do not have a built-in format for "xx.xx (xx.xxx)", but there is a way to get what you need ;). You will not find it among the defaults (formatters::list_valid_format_labels()), but if you check some of the function docs you will see that a format can also be a function (e.g. https://insightsengineering.github.io/rtables/main/reference/format_rcell.html). I believe a vignette covering this is still missing (right @edelarua?), but I can guide you through it.

A basic table:

library(rtables)
lyt <- basic_table() %>%
  split_cols_by("Species") %>%
  analyze(c("Sepal.Length", "Petal.Width"), afun = function(x) {
    list(
      "mean (sd)" = rcell(c(mean(x), sd(x)), format = "xx.xx (xx.xx)"),
      "range" = diff(range(x))
    )
  })

build_table(lyt, iris)

Let's change the format to the one you want by using a function (it accepts the cell values as a numeric vector and returns a string; uncomment browser() to inspect the variables and customize further):

my_format <- function(res, ...) { # it needs dots to work with rtables
  # browser()
  num1 <- formatC(res[1], format = "f", digits = 2)
  num2 <- formatC(res[2], format = "f", digits = 3)
  paste0(num1, " (", num2, ")")
}

lyt <- basic_table() %>%
  split_cols_by("Species") %>%
  analyze(c("Sepal.Length", "Petal.Width"), afun = function(x) {
    list(
      "mean (sd)" = rcell(c(mean(x), sd(x)), format = my_format),
      "range" = diff(range(x))
    )
  })

build_table(lyt, iris)
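
If you need several digit combinations, the same idea can be wrapped in a small factory. This is only a sketch in base R: make_mean_sd_format is a hypothetical helper name, not something exported by rtables or formatters, and it assumes the cell always holds exactly two numbers (mean, then SD).

```r
# Hypothetical helper (not part of rtables or formatters): returns a format
# function for "mean (sd)" cells with a configurable number of decimals.
# By default the SD gets one more decimal digit than the mean.
make_mean_sd_format <- function(digits_mean, digits_sd = digits_mean + 1) {
  # Evaluate the arguments now so the returned closure captures fixed values
  force(digits_mean)
  force(digits_sd)
  function(res, ...) { # dots absorb any extra arguments passed by rtables
    num1 <- formatC(res[1], format = "f", digits = digits_mean)
    num2 <- formatC(res[2], format = "f", digits = digits_sd)
    paste0(num1, " (", num2, ")")
  }
}

fmt_1_2 <- make_mean_sd_format(1) # plays the role of "xx.x (xx.xx)"
fmt_2_3 <- make_mean_sd_format(2) # plays the role of "xx.xx (xx.xxx)"

# Quick check outside of a table
fmt_1_2(c(32.487, 40.5555))
fmt_2_3(c(32.487, 40.5555))
```

Inside the layout you would then pass, for example, format = make_mean_sd_format(2) to rcell() instead of my_format.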

Finally, the cleanest way to do this is probably the dedicated {tern} function: use tern::format_xx("xx.xx (xx.xxx)") in place of my_format.

pzhang-cims commented 9 months ago

Thank you, Melkiades, for your detailed explanation. Instead of analyze_vars, I used analyze on my side to derive the desired output. My approach was similar to your formatC manipulation, but your suggestion of format_xx is more efficient. Thanks for this helpful explanation!

Oh! I just noticed that format_xx can be used in analyze_vars as well. This is great!