National-COVID-Cohort-Collaborative / Data-Ingestion-and-Harmonization

Data Ingestion and Harmonization

Data Quality aggregate metadata / report #35

Open DaveraGabriel opened 4 years ago

DaveraGabriel commented 4 years ago

As data payloads are ingested through the pipeline, data quality checks will be performed on individual payloads, and this information will be made available to each site. What is at issue is determining which data quality checks will be performed on merged data, or which data quality metadata from individual payloads will be leveraged to represent the quality of the merged data set produced by the DI&H process.

DaveraGabriel commented 4 years ago

As a result of collaboration with the clinical scenarios and collaborative analytics groups, one example of the data characterization and quality checking required after the merge-to-data-store step involves time-dependent confounders in the Safe Harbor data set. For example: the date of a shelter-in-place order in the community, or of imposing universal precautions in clinical settings, is external to any individual patient record and remains a fixed calendar date that is not shifted in the patient data.
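A minimal sketch of the problem described above (the function names, offsets, and dates here are illustrative assumptions, not part of the N3C pipeline): Safe Harbor de-identification shifts each patient's dates by a per-patient offset, but an external event such as a shelter-in-place order keeps its true date, so comparing shifted patient dates against the fixed event date gives misleading before/after classifications.

```python
from datetime import date, timedelta

# Hypothetical fixed community-level date; it is NOT shifted under Safe Harbor.
SHELTER_IN_PLACE = date(2020, 3, 19)

def shift_dates(events, offset_days):
    """Apply a per-patient random date shift, as in Safe Harbor data sets.

    (The offset here is passed in explicitly for illustration; in practice
    it would be drawn per patient and kept secret.)
    """
    return [d + timedelta(days=offset_days) for d in events]

# Two true visit dates for one hypothetical patient.
true_visits = [date(2020, 3, 10), date(2020, 3, 25)]
shifted_visits = shift_dates(true_visits, offset_days=45)

# Flag whether each visit occurred on or after the shelter-in-place order.
true_flags = [v >= SHELTER_IN_PLACE for v in true_visits]
shifted_flags = [v >= SHELTER_IN_PLACE for v in shifted_visits]

print(true_flags)     # [False, True]  — correct classification
print(shifted_flags)  # [True, True]   — both visits now appear post-order
```

This is the time-dependent confounding risk: any analysis that joins shifted patient timelines to unshifted external events will misattribute exposure periods unless the shift is accounted for.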