MSKCC-Epi-Bio / bstfun

Miscellaneous collection of functions
http://mskcc-epi-bio.github.io/bstfun

Feature request: a tbl_likert ? #53

Closed larmarange closed 2 years ago

larmarange commented 2 years ago

Opening this issue just to open a discussion on how to display Likert scales items with gtsummary and to see if it would be relevant (or not) to propose something.

Many social surveys include attitude/opinion questions using Likert scales. The format of a typical five-level Likert item, for example, could be:

  1. Strongly disagree
  2. Disagree
  3. Neither agree nor disagree
  4. Agree
  5. Strongly agree

In such a case, we have a set of categorical variables sharing the same set of levels (some work may be needed to ensure that all levels are defined in all factors).
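
As a rough sketch of that harmonisation step, assuming the items arrive as factors with incomplete level sets: the data frame `survey` and the vector `likert_levels` below are made up for illustration, and only base R is used.

```r
# Hypothetical full set of levels, in display order
likert_levels <- c("Strongly disagree", "Disagree",
                   "Neither agree nor disagree", "Agree", "Strongly agree")

# Hypothetical items: each factor only knows the levels it happened to observe
survey <- data.frame(
  item1 = factor(c("Agree", "Disagree", "Agree")),
  item2 = factor(c("Strongly agree", "Agree", "Neither agree nor disagree"))
)

# Re-declare every item with the full, ordered set of levels
survey[] <- lapply(
  survey,
  function(x) factor(as.character(x), levels = likert_levels)
)

# Every item now reports the same five levels, including unobserved ones
sapply(survey, nlevels)
```

Unobserved levels then show up as zero counts instead of silently disappearing from the table.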

Displaying them with tbl_summary() will result in a long one-column table with the items repeated several times.

Another option could be to apply tbl_summary() on each variable and then to merge all sub-tables with tbl_merge() (or using tbl_strata()).

However, if we have a lot of items, it could be better to display factor levels in separate columns and variables in different rows. But I do not think that we currently have the equivalent of a tbl_transpose() in gtsummary.
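
To make the target layout concrete, here is a minimal base R sketch (not gtsummary code) that puts variables in rows and levels in columns. The data frame `items` and its values are invented for illustration, and the "n (p%)" cell format simply mimics the tbl_summary() default.

```r
likert_levels <- c("Strongly disagree", "Disagree",
                   "Neither agree nor disagree", "Agree", "Strongly agree")

# Hypothetical items sharing the same levels
items <- data.frame(
  var1 = factor(c(1, 2, 2, 3, 5), levels = 1:5, labels = likert_levels),
  var2 = factor(c(2, 2, 3, 4, 4), levels = 1:5, labels = likert_levels),
  var3 = factor(c(2, 3, 4, 4, 5), levels = 1:5, labels = likert_levels)
)

# One row per variable, one column per level
counts <- t(sapply(items, table))
percents <- round(100 * prop.table(counts, margin = 1))

# Format each cell as "n (p%)"
cells <- matrix(
  sprintf("%d (%.0f%%)", counts, percents),
  nrow = nrow(counts),
  dimnames = dimnames(counts)
)
cells
```

This only shows the shape of the result; it has none of the gtsummary structure (headers, footnotes, merging) that a dedicated function would need to keep.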

Would it be relevant to think about a dedicated tbl_likert() function, or would it be too specific? If such a function is added, would it be relevant to also propose a plot() method?

ddsjoberg commented 2 years ago

I think it seems reasonable! It would look like this, right?

library(gtsummary)
library(tidyverse)
#> Warning: package 'tibble' was built under R version 4.1.1
#> Warning: package 'tidyr' was built under R version 4.1.1
#> Warning: package 'readr' was built under R version 4.1.1

df_likert <-
  map_dfc(
    1:3,
    ~sample.int(5, replace = TRUE) %>%
      factor(levels = 1:5,
             labels = c("Strongly disagree",
                        "Disagree",
                        "Neither agree nor disagree",
                        "Agree",
                        "Strongly agree"))
  ) %>%
  set_names(paste0("var", 1:3)) 
#> New names:
#> * NA -> ...1
#> * NA -> ...2
#> * NA -> ...3

names(df_likert) %>%
  map(
    ~ df_likert %>%
      select(likert  = all_of(.x)) %>%
      tbl_summary() %>%
      modify_header(all_stat_cols() ~ .x)
  ) %>%
  tbl_merge() %>%
  modify_spanning_header(everything() ~ NA) %>%
  as_kable()
|Characteristic             |var1    |var2    |var3    |
|:--------------------------|:-------|:-------|:-------|
|likert                     |        |        |        |
|Strongly disagree          |1 (20%) |0 (0%)  |0 (0%)  |
|Disagree                   |2 (40%) |2 (40%) |1 (20%) |
|Neither agree nor disagree |1 (20%) |1 (20%) |1 (20%) |
|Agree                      |0 (0%)  |2 (40%) |2 (40%) |
|Strongly agree             |1 (20%) |0 (0%)  |1 (20%) |

Created on 2021-10-03 by the reprex package (v2.0.1)

We cannot use tbl_strata() here, but I think a new, similar helper could be useful: tbl_map()?

larmarange commented 2 years ago

I like the idea of tbl_map().

Regarding Likert tables, it seems that they are usually presented the other way around (i.e. items in columns, vars in rows).

See examples:

ddsjoberg commented 2 years ago

Ohhh, I see, then tbl_strata() would be the way to go!

ddsjoberg commented 2 years ago

I think tbl_transpose() will disrupt the structure of a gtsummary table, and will cause issues with the harmony among the functions?

larmarange commented 2 years ago

Some other things (though I haven't explored them in detail yet): it seems common to add the number of observations, or a mean value (considering that a Likert scale is also a score). All of that suggests that it could be relevant to develop a tbl_likert() with relevant methods, but a literature review might be a good idea to identify the needed features.
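
For illustration, a minimal base R sketch of those two extra columns, assuming the level order can be read as a 1 to 5 score. The data frame `items` is invented, and none of this reflects an existing tbl_likert() interface.

```r
likert_levels <- c("Strongly disagree", "Disagree",
                   "Neither agree nor disagree", "Agree", "Strongly agree")

# Hypothetical items; var2 has one missing answer
items <- data.frame(
  var1 = factor(c(1, 2, 2, 3, 5), levels = 1:5, labels = likert_levels),
  var2 = factor(c(2, 2, 3, 4, NA), levels = 1:5, labels = likert_levels)
)

# Treat each level's position as its score (1 = Strongly disagree, ...)
scores <- sapply(items, as.integer)

# Number of non-missing answers and mean score per item
item_summary <- data.frame(
  n = colSums(!is.na(scores)),
  mean_score = colMeans(scores, na.rm = TRUE)
)
item_summary
```

Whether a mean is meaningful for an ordinal scale is exactly the kind of question a literature review would need to settle.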

Similarly, there is a likert package on CRAN (https://cran.r-project.org/web/packages/likert/index.html), but it no longer seems to be active (last version in 2016). There are also some common needs around Likert graphs (so a plot method could be relevant).

See:

larmarange commented 2 years ago

> I think tbl_transpose() will disrupt the structure of a gtsummary table, and will cause issues with the harmony among the functions?

TRUE

ddsjoberg commented 2 years ago

I think a tbl_likert() function will be helpful to many. Perhaps we should add it to bstfun first to make it available, and to test how it harmonizes with other functions.

ddsjoberg commented 2 years ago

Hey hey @larmarange, I wrote a basic version... it was easier than I thought it'd be! I was able to do it in a way that maintains the gtsummary structure pretty well!

library(bstfun)

df <-
  tibble::tibble(
    # draw 100 values from 1:3 for each item
    f1 = 
      sample.int(3, size = 100, replace = TRUE) %>%
      factor(levels = 1:3, labels = c("bad", "meh", "good")),
    f2 = 
      sample.int(3, size = 100, replace = TRUE) %>%
      factor(levels = 1:3, labels = c("bad", "meh", "good")),
  )

tbl_likert(df) %>%
  gtsummary::as_kable()
|Characteristic |bad      |meh      |good     |
|:--------------|:--------|:--------|:--------|
|f1             |38 (38%) |26 (26%) |36 (36%) |
|f2             |36 (36%) |30 (30%) |34 (34%) |

Created on 2021-10-07 by the reprex package (v2.0.0)

larmarange commented 2 years ago

Thanks a lot @ddsjoberg

I'm currently traveling for work and will be back in mid-October. I will look at your proposal in more detail then. I have some ideas for possible improvements but need to test them first.