[Bug]: tm_t_summary module - dataname parameter does not accept other names than ADSL

Mia-data commented 2 weeks ago

I know that tm_t_summary comes from teal.modules.clinical and is intended for clinical data, but it would be nice to be able to use it for non-clinical data as well. Currently tm_t_summary works only if my non-clinical data is renamed to "ADSL" and contains the artificially created variables USUBJID and STUDYID. Having a variable with a unique ID makes sense, but does it have to be named USUBJID? It would be nice if the parameters of tm_t_summary let me provide:

a custom dataset name (other than ADSL)
a custom key column (other than USUBJID)
an optional STUDYID column

Reprex below:

library(teal.modules.general)
#> Loading required package: ggmosaic
#> Loading required package: ggplot2
#> Loading required package: shiny
#> Loading required package: teal
#> Loading required package: teal.data
#> Loading required package: teal.code
#> Loading required package: teal.slice
#> Registered S3 method overwritten by 'teal':
#>   method        from      
#>   c.teal_slices teal.slice
#> 
#> You are using teal version 0.15.2
#> 
#> Attaching package: 'teal'
#> The following objects are masked from 'package:teal.slice':
#> 
#>     as.teal_slices, teal_slices
#> Loading required package: teal.transform
library(teal.modules.clinical)
#> Loading required package: tern
#> Loading required package: rtables
#> Loading required package: formatters
#> Loading required package: magrittr
#> 
#> Attaching package: 'rtables'
#> The following object is masked from 'package:utils':
#> 
#>     str
#> Registered S3 method overwritten by 'tern':
#>   method   from 
#>   tidy.glm broom
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)
#> 
#> Attaching package: 'tidyr'
#> The following object is masked from 'package:magrittr':
#> 
#>     extract
options(shiny.useragg = FALSE)

cars<-mtcars
cars$brand<-rownames(cars)
rownames(cars)<-NULL

#create dummy variables USUBJID and STUDYID required by teal.modules.clinical
cars<-cars %>% mutate(USUBJID=row_number(), STUDYID="ABC") %>% 
  mutate(across(where(is.character), as.factor))

lbl<-names(cars)
Hmisc::label(cars)<-as.list(lbl)

data<-cdisc_data(cars=cars)
datanames(data)<-c("cars")

# app
app <- init(
  data = data,
  modules = modules(
    tm_data_table("Data Table"),
    tm_variable_browser("Variable Browser"),
    tm_missing_data("Missing Data"),
    tm_t_summary(
      label = "Demographic Table",
      dataname = "cars",
      arm_var = choices_selected("brand"),
      add_total = TRUE,
      summarize_vars = choices_selected(
        c("cyl", "hp", "carb")),
      useNA = "ifany"
    )
  ),
  header = "Example"
)
#> [INFO] 2024-11-04 20:34:55.9038 pid:1333 token:[] teal.modules.general Initializing tm_data_table
#> [INFO] 2024-11-04 20:34:55.9095 pid:1333 token:[] teal.modules.general Initializing tm_variable_browser
#> [INFO] 2024-11-04 20:34:55.9711 pid:1333 token:[] teal.modules.general Initializing tm_missing_data
#> Initializing tm_t_summary
#> [ERROR] 2024-11-04 20:34:56.2488 pid:1333 token:[] teal - Module 'Demographic Table' uses datanames not available in 'data': ("ADSL") not in ("cars")
#> Error: Assertion on 'modules' failed: - Module 'Demographic Table' uses datanames not available in 'data': ("ADSL") not in ("cars").
shinyApp(app$ui, app$server)
#> Error in shinyApp(app$ui, app$server): object 'app' not found

Relevant log output

Error: Assertion on 'modules' failed: - Module 'Demographic Table' uses datanames not available in 'data': ("ADSL") not in ("cars").

Mia-data commented 2 weeks ago

The app below works after simply changing the dataset name from "cars" to "ADSL", but it would be nice for the running application to show the real name of the dataset instead of the dummy name "ADSL".

library(teal.modules.general)
library(teal.modules.clinical)
library(dplyr)
library(tidyr)
options(shiny.useragg = FALSE)

cars<-mtcars
cars$brand<-rownames(cars)
rownames(cars)<-NULL

#create dummy variables USUBJID and STUDYID required by teal.modules.clinical
cars<-cars %>% mutate(USUBJID=row_number(), STUDYID="ABC") %>% 
  mutate(across(where(is.character), as.factor))

lbl<-names(cars)
Hmisc::label(cars)<-as.list(lbl)

data<-cdisc_data(ADSL=cars)
datanames(data)<-c("ADSL")

# app
app <- init(
  data = data,
  modules = modules(
    tm_data_table("Data Table"),
    tm_variable_browser("Variable Browser"),
    tm_missing_data("Missing Data"),
    tm_t_summary(
      label = "Demographic Table",
      dataname = "ADSL",
      arm_var = choices_selected("brand"),
      add_total = TRUE,
      summarize_vars = choices_selected(
        c("cyl", "hp", "carb")),
      useNA = "ifany"
    )
  ),
  header = "Example"
)
shinyApp(app$ui, app$server)

vedhav commented 2 weeks ago

The first line of the README in teal.modules.clinical reads:

This package contains a set of standard teal modules to be used with CDISC data in order to generate many of the standard outputs used in clinical trials.

The package is specifically designed to work with CDISC standard datasets. Dataset names and column names are part of this standard, which allows the modules to seamlessly interact with clinical trial data by assuming certain structures and identifiers, such as ADSL for subject-level information and USUBJID as a unique subject identifier. Adhering to these standards enables consistent and reproducible analysis across clinical datasets.

Given this design, extending tm_t_summary for non-CDISC datasets would require a different approach, and it should be outside the scope of this package.

Another important point is that the teal modules in this package are shiny/teal abstractions over the tern package. They follow the ADaM data standard so that it is easy to create and manage standard widgets for data manipulation before creating the TLGs, and tern is specifically designed to create TLGs for clinical trial reporting. For example, the teal.modules.clinical::tm_g_km module uses https://insightsengineering.github.io/tern/latest-tag/reference/g_km.html under the hood.
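To illustrate the layering, a comparable summary table can be built outside teal with rtables, the table engine that tern builds on. This is only a minimal sketch, not what tm_t_summary does internally: the summary_rows function, the choice of cyl as the column split, and the "All cars" label are assumptions made for illustration. No USUBJID or STUDYID column is involved.

```r
library(rtables)

# Non-CDISC example data; cyl is used as the grouping ("arm") variable.
cars <- mtcars
cars$cyl <- factor(cars$cyl)

# Illustrative analysis function (hypothetical name): three summary rows
# for each numeric variable that is analyzed.
summary_rows <- function(x, ...) {
  in_rows(
    "n" = rcell(sum(!is.na(x)), format = "xx"),
    "Mean (SD)" = rcell(
      c(mean(x, na.rm = TRUE), sd(x, na.rm = TRUE)),
      format = "xx.x (xx.x)"
    ),
    "Min - Max" = rcell(range(x, na.rm = TRUE), format = "xx.x - xx.x")
  )
}

# Layout: one column per cyl level plus an overall column.
lyt <- basic_table() %>%
  split_cols_by("cyl") %>%
  add_overall_col("All cars") %>%
  analyze(vars = c("hp", "mpg"), afun = summary_rows)

tbl <- build_table(lyt, cars)
tbl
```

The table itself needs no CDISC structure here. The teal module adds the data and variable selection on top, and that is where the ADaM assumptions come in.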

@Mia-data my question is: Why do you think it's nice to be able to use the teal modules from this package also for non-clinical data?

Mia-data commented 2 weeks ago

@vedhav I understand that this package is dedicated to clinical data. Maybe you could then consider adding a tm_t_summary module to the teal.modules.general package? Such a feature would greatly support work on medical real-world data from medical insurance companies, hospitals and registries. Real-world data has been gaining more and more attention from the FDA in recent years. The importance of real-world studies shouldn't be neglected, and they shouldn't be treated as inferior to clinical studies. On the contrary, real-world studies provide insight into what is really happening in a real-world setting, with a broad spectrum of patients who are usually excluded from clinical trials. Therefore, I think that adding a tm_t_summary module to teal.modules.general would be a GREAT benefit with a low workload, since the code just needs to be slightly modified from the clinical convention to a more generic one. Thank you for your consideration.

In the meantime, I will use the workaround to make teal.modules.clinical work with my non-clinical data from a disease registry.
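The workaround could be wrapped in a small helper so the placeholder columns are added in one place. This is a rough sketch in base R: add_placeholder_keys is a hypothetical name, not a function from teal or teal.modules.clinical, and the "ABC" study ID is the same dummy value used in the reprex.

```r
# Hypothetical helper: adds the placeholder key columns that
# teal.modules.clinical expects, unless they already exist.
add_placeholder_keys <- function(df, study_id = "ABC") {
  stopifnot(is.data.frame(df))
  if (!"USUBJID" %in% names(df)) {
    df$USUBJID <- seq_len(nrow(df))        # one unique ID per row
  }
  if (!"STUDYID" %in% names(df)) {
    df$STUDYID <- rep(study_id, nrow(df))  # constant dummy study ID
  }
  df
}

cars <- mtcars
cars$brand <- rownames(cars)
rownames(cars) <- NULL

# The result still has to be passed to the app under the name "ADSL".
ADSL <- add_placeholder_keys(cars)
```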
