insightsengineering / rtables

Reporting tables with R
https://insightsengineering.github.io/rtables/

Shown (N=xx) when using `col_counts` in `build_table()`, even if `show_colcounts = FALSE` is declared #914

Open chengzhang96 opened 1 month ago

chengzhang96 commented 1 month ago

When producing the "Adverse Events by Highest NCI CTCAE Grade" table, I calculated (N=XX) by ACTARM myself, so I don't want the row below the grade labels to show (N=XX) again. But even though I used basic_table(show_colcounts = FALSE), the counts still appear. I think this is probably due to col_counts. Is there any way to remove the (N=XX) from the last row of the header?

Here is a simple example:

library(dplyr)
library(rtables)
library(tern)

preprocess_adae <- function(adae) {
  adae %>%
    dplyr::group_by(ACTARM, USUBJID, AEBODSYS, AEDECOD) %>%
    dplyr::summarize(
      MAXAETOXGR = max(as.numeric(AETOXGR)),
      .groups = "drop"
    ) %>%
    dplyr::ungroup() %>%
    dplyr::mutate(
      MAXAETOXGR = factor(MAXAETOXGR),
      AEDECOD = droplevels(as.factor(AEDECOD))
    )
}

adae_max <- ex_adae %>%
  preprocess_adae() %>%
  df_explicit_na()

arm_label <- ex_adsl %>% group_by(ACTARM) %>% arrange(ACTARM) %>% mutate(ACTARM = paste0(ACTARM, "\n(N=", n(), ")")) %>% ungroup() %>% distinct(ACTARM) %>% pull()
adsl <- ex_adsl %>% mutate(ACTARM = factor(ACTARM, labels = arm_label))
adae_max <- adae_max %>% mutate(ACTARM = factor(ACTARM, labels = arm_label))

grade_groups <- list(
  "Any Grade (%)" = c("1", "2", "3", "4", "5"),
  "Grade 3-4 (%)" = c("3", "4"),
  "Grade 5 (%)" = "5"
)

col_counts <- rep(table(adsl$ACTARM), each = length(grade_groups))
tbl <- basic_table(show_colcounts = FALSE) %>%
  split_cols_by("ACTARM") %>%
  split_cols_by_groups("MAXAETOXGR", groups_list = grade_groups) %>%
  summarize_num_patients(
    var = "USUBJID",
    .stats = "unique",
    .labels = "Total number of patients with at least one adverse event"
  ) %>%
  build_table(adae_max, col_counts = col_counts)
tbl

Melkiades commented 1 month ago

You need to remove the col_counts argument from the final build_table() call:

tbl <- basic_table(show_colcounts = FALSE) %>%
  split_cols_by("ACTARM") %>%
  split_cols_by_groups("MAXAETOXGR", groups_list = grade_groups) %>%
  summarize_num_patients(
    var = "USUBJID",
    .stats = "unique",
    .labels = "Total number of patients with at least one adverse event"
  ) %>%
  build_table(adae_max)
tbl

chengzhang96 commented 1 month ago

Hi Melkiades, thanks for the suggestion. When I remove col_counts, the calculated percentages are incorrect, which is why I have to pass col_counts.

I found two ways to solve this problem:

  1. Generate an empty table without passing col_counts, then assign the col_info of the new table to the original one:

    tbl_new <- basic_table() %>%
      split_cols_by("ACTARM") %>%
      split_cols_by_groups("MAXAETOXGR", groups_list = grade_groups) %>%
      analyze("USUBJID", afun = function(x) rcell("")) %>%
      build_table(adae_max)
    col_info(tbl) <- col_info(tbl_new)
    tbl

  2. When I subset the current result, the (N=XX) disappears again! So I can use this behavior to reassemble the table and get the result I want:

    cbind_rtables(tbl[, 1:length(grade_groups)], tbl[, -(1:length(grade_groups))])

    Regarding the second method, I personally don't think this behavior is a bug. Rather, it is the correct result that makes show_colcounts = FALSE work. I think show_colcounts should take priority over col_counts. Here is a simple example you can test:

lyt <- basic_table(show_colcounts = FALSE) %>%
  split_cols_by("ARM") %>%
  analyze("AGE")

tbl <- build_table(lyt, DM, col_counts = 1:3)
tbl
tbl[, 1]
cbind_rtables(tbl[, 1], tbl[, -1])

Melkiades commented 1 month ago

@chengzhang96 the issue here is that you build the table using only adae_max, so the unique-subject information from adsl is not available unless you supply it from an external source. To provide this information you can use alt_counts_df in build_table(), but the counts would still be taken at the last column split. Personally, I find the following solution simpler:

col_counts <- table(ex_adsl$ACTARM) # Avoid having \n for detection

# See ?additional_fun_params for more information on how to use additional_fun_params for analysis functions
my_afun <- function(x, .spl_context, ...) { #.spl_context has a lot of useful information about splits
  # browser() # if you need to check values
  what_colcount_to_take <- sapply(names(col_counts), stringr::str_detect, string = .spl_context$cur_col_id)
  n_x <- length(unique(x))
  d_col_count <- col_counts[what_colcount_to_take]
  rcell(c(n_x, n_x / d_col_count), format = "xx (xx.x%)")
}

tbl <- basic_table(show_colcounts = FALSE) %>%
  split_cols_by("ACTARM") %>%
  split_cols_by_groups("MAXAETOXGR", groups_list = grade_groups) %>%
  analyze("USUBJID", afun = my_afun, var_labels = "Total number of patients with at least one adverse event") %>%
  # analyze_num_patients( # correct function to do what you wanted in the first place
  #   var = "USUBJID",
  #   .stats = "unique",
  #   # count_by = col_counts, # does not work, could be a bug in {tern}
  #   .labels = "Total number of patients with at least one adverse event"
  # ) %>%
  build_table(adae_max)
tbl

Remember that you can always bypass {tern}, which often hides options for ease of use, and call analyze() from {rtables} directly. You can do many wonderful things with analysis functions and split functions.

I am reopening this because what you wanted to do in the first place now seems legitimate to me. @gmbecker wdyt?

Melkiades commented 1 day ago

@chengzhang96 I was looking at this again. Could you try using alt_counts_df in build_table()? I think that if you add col_counts the way you did, it only changes the top values, but I am not sure. We usually use alt_counts_df to adjust the column counts.
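
For reference, here is a minimal sketch of the alt_counts_df approach. It assumes a single column split on ACTARM, not the nested grade-group split from the original example, and uses the ex_adsl and ex_adae example data that become available with library(rtables). The analysis function name count_unique_pct is hypothetical and only for illustration. It is untested against the nested layout discussed in this issue.

```r
library(rtables)

# Hypothetical analysis function: number of unique subjects in the column,
# with the percentage taken relative to the column count (.N_col).
count_unique_pct <- function(x, .N_col) {
  n <- length(unique(x))
  rcell(c(n, n / .N_col), format = "xx (xx.x%)")
}

lyt <- basic_table(show_colcounts = TRUE) %>%
  split_cols_by("ACTARM") %>%
  analyze("USUBJID", afun = count_unique_pct)

# Cell values come from ex_adae, while the column counts (and therefore
# .N_col) are computed from ex_adsl.
tbl <- build_table(lyt, ex_adae, alt_counts_df = ex_adsl)
tbl
```

With this setup the denominators come from the subject-level data, so no col_counts argument is needed. Whether show_colcounts = FALSE then hides the header counts as intended in the nested case would still need to be checked.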