Gilead-BioStats / gsm

Good Statistical Monitoring R Package
https://gilead-biostats.github.io/gsm/
Apache License 2.0

Good Statistical Monitoring {gsm} R package

The {gsm} package provides a standardized Risk-Based Quality Monitoring (RBQM) framework for clinical trials that pairs a flexible data pipeline with robust reports like the one shown below.

![](man/figures/gsm_report_screenshot_1.png)

This README provides a high-level overview of {gsm}; see the package website for additional details.

Background

The {gsm} package performs risk assessments that focus primarily on detecting differences in quality at the site level. "High quality" is defined as the absence of errors that matter. We interpret this as detecting potential issues with critical data or processes across the major risk categories of safety, efficacy, disposition, treatment, and general quality, where each category consists of one or more risk assessments. Each risk assessment analyzes the data to flag sites with potential issues and provides a visualization to help the user understand the issue. Some relevant references are provided below.

Process Overview

The {gsm} package establishes a data pipeline for risk-based monitoring (RBM) in R, giving users a framework to assess and visualize site-level risk in clinical trial data. The package currently provides assessments for the following domains:

  1. Adverse Event Reporting Rate
  2. Serious Adverse Event Reporting Rate
  3. Non-important Protocol Deviation Rate
  4. Important Protocol Deviation Rate
  5. Grade 3+ Lab Abnormality Rate
  6. Study Discontinuation Rate
  7. Treatment Discontinuation Rate
  8. Query Rate
  9. Outstanding Query Rate
  10. Outstanding Data Entry Rate
  11. Data Change Rate
  12. Screen Failure Rate

All {gsm} assessments use a standardized six-step data pipeline:

  1. Input_Rate - Converts raw data to input data.
  2. Transform - Converts input data to transformed data.
  3. Analyze - Converts transformed data to analyzed data.
  4. Threshold - Uses analyzed data to create one or more numeric thresholds.
  5. Flag - Uses analyzed data and numeric thresholds to create flagged data.
  6. Summarize - Selects key columns from flagged data to create summary data.
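As a rough illustration of how the six steps fit together, the sketch below walks a toy adverse event dataset through each stage using base R only. It does not call {gsm}: the function names (`MakeInput`, `TransformRate`, `AnalyzeRate`, `FlagScore`, `SummarizeFlags`), the column names, the data, the simplified normal-approximation score, and the threshold values are all assumptions made for this example and may differ from what the package actually does.

```r
# Toy data (made up): one row per subject, one row per adverse event.
dfSubjects <- data.frame(
  SubjectID   = sprintf("S%02d", 1:8),
  SiteID      = c("A", "A", "A", "B", "B", "C", "C", "C"),
  DaysOnStudy = c(100, 100, 100, 100, 100, 100, 100, 300),
  stringsAsFactors = FALSE
)
dfAE <- data.frame(
  SubjectID = rep(c("S01", "S02", "S04", "S05", "S06", "S08"),
                  times = c(2, 1, 9, 7, 1, 4)),
  stringsAsFactors = FALSE
)

# 1. Input: raw data -> one row per subject with a numerator and denominator.
MakeInput <- function(dfSubjects, dfEvents) {
  # Subjects with no events get a count of 0.
  vCounts <- table(factor(dfEvents$SubjectID, levels = dfSubjects$SubjectID))
  data.frame(
    SubjectID   = dfSubjects$SubjectID,
    GroupID     = dfSubjects$SiteID,
    Numerator   = as.integer(vCounts),
    Denominator = dfSubjects$DaysOnStudy,
    stringsAsFactors = FALSE
  )
}

# 2. Transform: input data -> one row per site with an event rate.
TransformRate <- function(dfInput) {
  dfOut <- aggregate(cbind(Numerator, Denominator) ~ GroupID,
                     data = dfInput, FUN = sum)
  dfOut$Metric <- dfOut$Numerator / dfOut$Denominator
  dfOut
}

# 3. Analyze: compare each site's rate with the overall rate
#    (simplified z-score; an assumption, not the package's exact method).
AnalyzeRate <- function(dfTransformed) {
  nOverall <- sum(dfTransformed$Numerator) / sum(dfTransformed$Denominator)
  dfTransformed$Score <- (dfTransformed$Metric - nOverall) /
    sqrt(nOverall / dfTransformed$Denominator)
  dfTransformed
}

# 4. Threshold: illustrative cut points on the score scale.
vThreshold <- c(-3, -2, 2, 3)

# 5. Flag: map scores to -2, -1, 0, 1, 2 (assumes exactly four thresholds).
FlagScore <- function(dfAnalyzed, vThreshold) {
  dfAnalyzed$Flag <- findInterval(dfAnalyzed$Score, vThreshold) - 2L
  dfAnalyzed
}

# 6. Summarize: keep key columns, most strongly flagged sites first.
SummarizeFlags <- function(dfFlagged) {
  vCols <- c("GroupID", "Numerator", "Denominator", "Metric", "Score", "Flag")
  dfSummary <- dfFlagged[, vCols]
  dfSummary[order(-abs(dfSummary$Flag)), ]
}

dfInput       <- MakeInput(dfSubjects, dfAE)
dfTransformed <- TransformRate(dfInput)
dfAnalyzed    <- AnalyzeRate(dfTransformed)
dfFlagged     <- FlagScore(dfAnalyzed, vThreshold)
dfSummary     <- SummarizeFlags(dfFlagged)
print(dfSummary)
```

With this toy data, site B reports far more events per day on study than the overall rate and receives the strongest flag, site C falls just below the lower threshold, and site A is not flagged. Keeping each step as a function that takes one data frame and returns the next makes the intermediate results easy to inspect.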

To learn more about {gsm}'s data pipeline, visit the Data Pipeline Vignette.

Reporting

RMarkdown/HTML reporting is built into {gsm} and provides a detailed overview of all risk assessments for a given trial. For example, an AE risk assessment looks like this:

![](man/figures/gsm_report_screenshot_2.png)

Full reports for a sample trial run with {clindata} are provided below:

Quality Control

Since {gsm} is designed for use in a GCP framework, we have conducted extensive quality control as part of our development process. In particular, we do the following:

Additional detail, including links to function documentation and vignettes, is available on the package website.