shannonpileggi / gtreg

Regulatory Tables for Clinical Research
https://shannonpileggi.github.io/gtreg/
GNU General Public License v3.0

Thresholding #205

Open mattkumar opened 1 year ago

mattkumar commented 1 year ago

Hello,

First, I love the idea of this package. Thank you for the effort! I unfortunately missed @shannonpileggi's talk at US Connect 2023, but I have since managed to find some other materials and a previous presentation.

I'm wondering if thresholding is on the roadmap? This is something I see requested frequently when producing safety TLFs. Because these tables are often lengthy (and sparse), there's usually interest in displaying only the adverse event terms that occur in at least X% of subjects.

I have a rough reprex below, though I'm not sure it's the best way to implement this.

# Libs
library(dplyr)
library(gt)
library(gtreg)
library(gtsummary)
library(haven)

# Get source data
adsl_raw <- read_xpt("https://github.com/phuse-org/TestDataFactory/raw/main/Updated/TDF_ADaM/adsl.xpt")
adae_raw <- read_xpt("https://github.com/phuse-org/TestDataFactory/raw/main/Updated/TDF_ADaM/adae.xpt")

# Trim source data for brevity
adsl <- adsl_raw %>%
  select(USUBJID, TRT01A)

adae <- adae_raw %>%
  select(USUBJID, AEBODSYS, AEDECOD, AESEV) %>%
  filter(AEBODSYS %in% c("INJURY, POISONING AND PROCEDURAL COMPLICATIONS", "SKIN AND SUBCUTANEOUS TISSUE DISORDERS")) %>%
  # Join explicitly on the subject ID to attach the treatment arm
  left_join(adsl, by = "USUBJID")

# Full table
out_table <- adae %>%
  tbl_ae(
    id = USUBJID,
    id_df = adsl,
    ae = AEDECOD,
    soc = AEBODSYS,
    by = AESEV,
    strata = TRT01A
  ) %>%
  bold_labels()

# Threshold table

# Set a threshold
threshold <- 2

# Use modify_table_body() to:
# 1. for every column with prefix 'stat', create a companion column with suffix '_pct'
#    holding the percentage parsed from inside the parentheses of the cell text
# 2. keep rows where any '_pct' column meets the threshold
#    (in this case, the stat_4* columns are not relevant, so they are excluded)
threshold_table <- out_table %>%
  gtsummary::modify_table_body(
    ~ .x %>%
      mutate(across(
        starts_with("stat"),
        list(pct = ~ as.numeric(stringr::str_extract(., "(?<=\\().+?(?=\\))")))
      )) %>%
      # if_any() is the current dplyr replacement for filter_at() + any_vars()
      filter(if_any(
        ends_with("pct") & !contains("_4"),
        ~ . >= threshold
      ))
  )
# might need to reconsider SOC totals? maybe, maybe not?
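
To make the extraction step easier to inspect on its own, here is a minimal base-R sketch of pulling the percentage out of cell text. It assumes cells are formatted as `n (p)`, optionally with a trailing `%`; that format is an assumption about the table's statistic setting. `extract_pct` is a hypothetical helper, not part of gtreg or gtsummary, and the cell strings are made up.

```r
# Hypothetical helper: pull the number inside parentheses out of "n (p)" cell text.
extract_pct <- function(x) {
  # Does the cell contain a parenthesised value? (NA stays NA)
  has_paren <- grepl("\\(.+?\\)", x, perl = TRUE)
  # Keep only the text between the first "(" and the next ")"
  inside <- ifelse(
    has_paren,
    sub("^.*?\\((.+?)\\).*$", "\\1", x, perl = TRUE),
    NA_character_
  )
  # Drop a trailing percent sign, if present, before converting to numeric
  as.numeric(sub("%$", "", inside))
}

# Made-up cell strings, including an empty label cell and a missing value
cells <- c("3 (1.2)", "12 (4.7)", "", NA, "5 (2.0%)")
extract_pct(cells)
# returns 1.2 4.7 NA NA 2.0
```

Rows whose cells have no parenthesised value come back as `NA`, so a `>= threshold` filter would drop them unless they are handled separately.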

Original

[image: original table]

Threshold @ 2%

[image: table after applying the 2% threshold]

I can imagine something like this, with a new threshold argument:

adae %>%
  tbl_ae(
    id = USUBJID,
    id_df = adsl,
    ae = AEDECOD,
    soc = AEBODSYS,
    by = AESEV,
    strata = TRT01A,
    threshold = 2
  ) 

Best, Matt

shannonpileggi commented 1 year ago

Hi @mattkumar! Thank you for the kind words and for the thresholding suggestion. No, this has not been on our roadmap so far. Could you provide any references or papers that document methods for thresholding? There could be different ways of applying the threshold (overall, within treatment arm), and as you mentioned, there are also implications for SOC counting.
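
As a rough illustration of the SOC counting question, the base-R sketch below compares two possible SOC totals after thresholding: subjects with any AE in the SOC, versus subjects with at least one AE term that survives the threshold. The data, the denominator of 50 subjects, and the 5% cutoff are all made up, and neither total is presented as the recommended one.

```r
# Made-up AE records: one SOC, three preferred terms, five subjects
adae_toy <- data.frame(
  USUBJID  = c("S01", "S02", "S03", "S04", "S05"),
  AEBODSYS = "SKIN",
  AEDECOD  = c("RASH", "RASH", "RASH", "PRURITUS", "ERYTHEMA"),
  stringsAsFactors = FALSE
)
n_subjects <- 50   # assumed number of treated subjects (denominator)
threshold  <- 5    # assumed cutoff, in percent

# Subject-level percentage for each preferred term
n_term   <- tapply(adae_toy$USUBJID, adae_toy$AEDECOD, function(x) length(unique(x)))
pct_term <- 100 * n_term / n_subjects

# Terms that would still be displayed after thresholding
shown <- names(pct_term)[pct_term >= threshold]

# Option A: SOC total counts every subject with any AE in the SOC
soc_all <- length(unique(adae_toy$USUBJID))

# Option B: SOC total counts only subjects with a displayed term
soc_shown <- length(unique(adae_toy$USUBJID[adae_toy$AEDECOD %in% shown]))

c(all_terms = soc_all, displayed_terms = soc_shown)
# returns all_terms = 5, displayed_terms = 3
```

Under option A the SOC row no longer equals what a reader could reconstruct from the displayed terms; under option B the SOC row understates the true SOC incidence.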

mattkumar commented 1 year ago

Hi @shannonpileggi!

I've typically seen thresholds apply to percentages in these scenarios:

(1) in any treatment arm (i.e., at least one arm; applied to Preferred Terms, SOCs omitted)
(2) in the total/overall column (applied to Preferred Terms, SOCs omitted)
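
The two scenarios above can give different results for the same data. The base-R sketch below shows this on made-up subject-level data. The toy data frames, the 15% cutoff, and the object names are illustrative assumptions, not gtreg functionality.

```r
# Made-up data: 20 subjects in two arms, plus AE records (S01 reports RASH twice)
adsl_toy <- data.frame(
  USUBJID = sprintf("S%02d", 1:20),
  TRT01A  = rep(c("Drug", "Placebo"), each = 10),
  stringsAsFactors = FALSE
)
adae_toy <- data.frame(
  USUBJID = c("S01", "S02", "S03", "S11", "S04", "S12", "S13", "S01"),
  AEDECOD = c("RASH", "RASH", "RASH", "RASH", "FALL", "PRURITUS", "PRURITUS", "RASH"),
  stringsAsFactors = FALSE
)
threshold <- 15  # assumed cutoff, in percent

# Count each subject at most once per term, then attach the treatment arm
ae_subj <- merge(unique(adae_toy), adsl_toy, by = "USUBJID")

# Subjects per term and arm, and arm denominators taken from adsl_toy
arms     <- unique(adsl_toy$TRT01A)
n_by_arm <- unclass(table(ae_subj$AEDECOD, factor(ae_subj$TRT01A, levels = arms)))
denom    <- as.vector(table(factor(adsl_toy$TRT01A, levels = arms)))

pct_by_arm  <- 100 * sweep(n_by_arm, 2, denom, "/")
pct_overall <- 100 * rowSums(n_by_arm) / nrow(adsl_toy)

# Scenario 1: the term meets the threshold in at least one arm
keep_any_arm <- rownames(pct_by_arm)[apply(pct_by_arm >= threshold, 1, any)]

# Scenario 2: the term meets the threshold in the overall column
keep_overall <- names(pct_overall)[pct_overall >= threshold]

keep_any_arm  # "PRURITUS" "RASH"
keep_overall  # "RASH"
```

Here PRURITUS occurs in 20% of placebo subjects but only 10% overall, so it is kept under scenario 1 and dropped under scenario 2.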

The FDA's recent Standard Safety Tables and Figures: Integrated Guide highlights some of this. See Tables 13 and 55. Figure 5 explicitly mentions 'in any treatment arm'.

What's more interesting is Table 58, where it looks like combinations of thresholds could be applied.

Customization: Adjust the cutoffs, as deemed necessary (e.g., ≥5% in drug-treated group and ≥2% more in the drug-treated group than the placebo-treated group).
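
A combined cutoff like the one quoted above could be sketched as two conditions joined with `&`. The base-R example below uses made-up percentages. It reads "≥2% more" as a difference of at least 2 percentage points, which is an assumption about the guide's intent. The names `min_drug_pct` and `min_excess` are hypothetical.

```r
# Made-up per-term percentages for a drug arm and a placebo arm
pct <- data.frame(
  AEDECOD = c("RASH", "FALL", "PRURITUS", "ERYTHEMA"),
  drug    = c(12.0, 6.0, 3.0, 7.5),
  placebo = c(4.0, 5.5, 0.5, 2.0),
  stringsAsFactors = FALSE
)

min_drug_pct <- 5  # at least 5% in the drug-treated group
min_excess   <- 2  # and at least 2 percentage points more than placebo (assumed reading)

# Both conditions must hold for a term to be displayed
keep <- pct$drug >= min_drug_pct & (pct$drug - pct$placebo) >= min_excess
pct[keep, ]
# keeps RASH and ERYTHEMA
```

FALL passes the 5% cutoff but exceeds placebo by only 0.5 points, and PRURITUS exceeds placebo by 2.5 points but falls below 5%, so each fails one of the two conditions.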

From the above, I imagine there could also be a need to view AEs occurring in X% or more of subjects in one particular treatment arm when many arms are present.
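
One way to cover both the "any arm" rule and a rule anchored to one arm is a single helper with an optional set of anchor arms. This is a minimal base-R sketch with made-up percentages. `keep_terms` and its `arms` argument are hypothetical and do not exist in gtreg.

```r
# Made-up per-term percentages across three arms
pct_by_arm <- matrix(
  c(6, 1, 0,
    1, 1, 7,
    3, 4, 2),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("RASH", "FALL", "PRURITUS"),
    c("High Dose", "Low Dose", "Placebo")
  )
)

# Hypothetical helper: keep terms meeting the threshold in at least one of `arms`.
# By default every arm is considered, which gives the "any treatment arm" rule.
keep_terms <- function(pct, threshold, arms = colnames(pct)) {
  hits <- pct[, arms, drop = FALSE] >= threshold
  rownames(pct)[rowSums(hits) > 0]
}

keep_terms(pct_by_arm, threshold = 5)                      # "RASH" "FALL"
keep_terms(pct_by_arm, threshold = 5, arms = "High Dose")  # "RASH"
```

FALL reaches 7% only in the placebo arm, so it is kept under the any-arm rule but dropped when the threshold is anchored to High Dose.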