PEDSnet / Data-Quality-Analysis

The PEDSnet Data Quality Assessment Toolkit (OMOP CDM)
BSD 2-Clause "Simplified" License

Data Quality Assessment in PEDSnet

Additional Resources

Introduction to the Tool

A summary of how to execute the tool for an initial run can be found here: (Initial Run)

Running for a PEDSnet Site

Instructions for how to execute the toolkit for a PEDSnet data submission can be found here: (PEDSnet Site)

Uploading to the Database

Issues are uploaded to the database in their raw form at the end of each cycle. The script that does this is included in the package here and utilizes the argos package in the standard way: (Upload Issues)

To upload issues, set the variable to the directory where the resulting issue .csv files were written, specify the data version in the variable, and specify the site. Sourcing the script then uploads the issues.

Objective

This toolkit is designed for conducting data quality assessments on clinical datasets modeled using the OMOP Common Data Model (CDM). It includes a wide variety of data quality checks and a GitHub-based issue-reporting mechanism, and is used routinely by the PEDSnet CDRN.

Contents

Required Downloads

R

R version 3.2.x or above, 64-bit (Comprehensive R Archive Network)
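
The version and architecture requirement above can be checked from within an R session before installing anything. The following is a minimal sketch using base R only; the variable names `r_version_ok` and `r_is_64bit` are illustrative and are not part of the toolkit.

```r
# Pre-flight check for the stated requirement: R 3.2.x or above, 64-bit.
# Variable names are illustrative, not part of the toolkit.

# getRversion() returns a version object that compares correctly
# against a version string.
r_version_ok <- getRversion() >= "3.2.0"

# On a 64-bit build of R, pointers are 8 bytes wide.
r_is_64bit <- .Machine$sizeof.pointer == 8L

if (!r_version_ok || !r_is_64bit) {
  warning("This R installation does not meet the stated requirement ",
          "(R 3.2.x or above, 64-bit).")
}
```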

R Packages

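# Packages available from CRAN
install.packages(c("DBI", "yaml", "ggplot2", "RJDBC", "devtools", "futile.logger", "plyr", "dplyr",
                   "dbplyr", "lubridate", "tictoc", "testthat", "data.table"))

# PostgreSQL database driver
install.packages("RPostgres")

# Install the argos package from GitHub using devtools
library(devtools)
install_github("baileych/ohdsi-argos")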

Note: if these packages were previously installed, run update.packages() to get the latest version of each.
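
To see which of the CRAN packages listed above are still missing from the local library, something like the following sketch may help. It uses only base R and the utils package. The names `required_pkgs` and `missing_pkgs` are illustrative, and the GitHub-installed argos package is left out because its installed package name is not stated in this document.

```r
# CRAN packages named in the install commands above.
# (argos is omitted: its installed package name is an unknown here.)
required_pkgs <- c("DBI", "yaml", "ggplot2", "RJDBC", "devtools",
                   "futile.logger", "plyr", "dplyr", "dbplyr", "lubridate",
                   "tictoc", "testthat", "data.table", "RPostgres")

# installed.packages() returns a matrix whose row names are the
# names of the packages found in the local library paths.
installed_pkgs <- rownames(utils::installed.packages())

# Packages that are required but not yet installed.
missing_pkgs <- setdiff(required_pkgs, installed_pkgs)

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All listed CRAN packages are installed.")
}
```

Passing `missing_pkgs` to install.packages() would then install only what is absent, rather than reinstalling everything.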