The EHR DREAM Challenge is a series of community challenges to pilot and develop a predictive analytic ecosystem within the healthcare system.
Evaluation of predictive models in the clinical space is an ever-growing need for the Learning Health System. Healthcare institutions are attempting to move away from a rules-based approach to clinical care toward a more data-driven model of care. To achieve this, machine learning algorithms are being developed to aid physicians in clinical decision making. However, a key limitation in the adoption and widespread deployment of these algorithms into clinical practice is the lack of rigorous assessments and clear evaluation standards. A framework for the systematic benchmarking and evaluation of biomedical algorithms, assessed in a prospective manner that mimics a clinical environment, is needed to ensure patient safety and clinical efficacy.
The primary objectives of the EHR DREAM Challenge are to:
We are tackling the stated problem by focusing on a specific prediction problem: patient mortality. Due to its well-studied nature and relatively well-established predictability, patient mortality serves as a well-defined benchmarking problem for assessing predictive models. Mortality prediction models are also widely adopted and implemented at healthcare institutions and CTSAs, a feature we hope will stimulate participation from a wide range of institutions.
DREAM challenges are an instrumental tool for harnessing the wisdom of the broader scientific community to develop computational solutions to biomedical problems. While previous DREAM Challenges have worked with complex biological data as well as sensitive medical data, running a DREAM Challenge on Electronic Health Records presents unique complications, with patient privacy at the forefront of those concerns. Previous challenges developed a technique known as the Model to Data (MTD) approach to maintain the privacy of the data. We will use this MTD approach, facilitated by Docker, on an OMOP dataset provided by the University of Washington to standardize model development.
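To make the MTD contract concrete, a submission might consist of a Docker image whose entrypoint reads OMOP tables from a read-only data mount and writes a prediction file to an output mount. The sketch below illustrates that shape in Python; the mount paths, file names, and output schema are assumptions for illustration, not the finalized challenge specification.

```python
"""Minimal sketch of a Model-to-Data submission entrypoint.

The mount paths, file names, and output schema are illustrative
assumptions; the finalized I/O contract will be posted on the
challenge Synapse site.
"""
import pandas as pd

TRAIN_DIR = "/train"    # assumed read-only mount with OMOP tables (CSV)
INFER_DIR = "/infer"    # assumed mount listing the patients to score
OUTPUT_DIR = "/output"  # assumed writable mount for predictions

def main():
    # Load the patients the organizers want scored (assumed file name).
    patients = pd.read_csv(f"{INFER_DIR}/person.csv", usecols=["person_id"])

    # Placeholder model: give every patient the same probability.
    # A real submission would train on the OMOP tables under TRAIN_DIR.
    patients["score"] = 0.5

    # One mortality probability per person_id (assumed output schema).
    patients.to_csv(f"{OUTPUT_DIR}/predictions.csv", index=False)

if __name__ == "__main__":
    main()
```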
We will ask participants in this DREAM Challenge to predict the future mortality status of currently living patients within our OMOP repository. After participants submit their predictions, we will evaluate model performance against a gold-standard benchmark dataset. We will carry out this DREAM Challenge in three phases (Fig 1).
The Open Phase will be a preliminary testing and validation phase. In this phase, the synthetic SynPUF OMOP data will be used to test submitted models. Participants will submit their predictive models to our system, where those models will train and predict on the split SynPUF dataset. The main objectives of this first phase are to allow the participants to become familiar with the submission system, to allow the organizers to work out any issues in the pipeline, and to give participants a preliminary ranking of their model's performance.
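For illustration, the person-level split of the synthetic cohort could be produced along the lines of the sketch below; the 80/20 ratio and file paths are assumptions, and the actual split is handled by the organizers' infrastructure.

```python
# Sketch of a person-level train/test split of the synthetic SynPUF cohort.
# The 80/20 ratio and file paths are assumptions for illustration only.
import pandas as pd
from sklearn.model_selection import train_test_split

person = pd.read_csv("synpuf/person.csv")
train_ids, test_ids = train_test_split(
    person["person_id"], test_size=0.2, random_state=42
)

# Submitted models see only the training partition; the held-out partition
# is kept by the organizers to score their predictions.
person[person["person_id"].isin(train_ids)].to_csv("train/person.csv", index=False)
person[person["person_id"].isin(test_ids)].to_csv("evaluate/person.csv", index=False)
```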
The Leaderboard Phase will be the prospective prediction phase, carried out on the UW OMOP data. Submitted models will have a portion of the UW OMOP repository available to them for training and will make predictions on all living patients who have had at least one visit in the previous month, assigning each patient a probability of being deceased within the next 6 months. Participants will be expected to set up their own training datasets, but the patient IDs for which predictions are expected will be provided to the Docker models.
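Because the gold standard is withheld, participants will need to derive their own training labels from the OMOP tables made available to their containers. A hedged sketch of one way to do this is shown below; the table and column names follow the OMOP CDM, but anchoring each patient at their most recent visit is our assumption meant to mirror the challenge question, not an official label definition.

```python
# Sketch of building a 6-month mortality training label from OMOP tables.
# Anchoring each patient at their most recent visit is an assumption meant
# to mirror the challenge question, not the official label definition.
import pandas as pd

visits = pd.read_csv("train/visit_occurrence.csv", parse_dates=["visit_start_date"])
death = pd.read_csv("train/death.csv", parse_dates=["death_date"])

# Anchor each patient at their most recent visit.
last_visit = (visits.groupby("person_id")["visit_start_date"]
                    .max().rename("anchor_date").reset_index())

# Label = 1 if the patient died within 180 days of the anchor date.
labels = last_visit.merge(death[["person_id", "death_date"]],
                          on="person_id", how="left")
labels["deceased_6mo"] = (
    (labels["death_date"] - labels["anchor_date"]).dt.days.between(0, 180)
).astype(int)

labels[["person_id", "deceased_6mo"]].to_csv("training_labels.csv", index=False)
```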
The Validation Phase will be the final evaluation phase, in which challenge administrators finalize the scores of the models.
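As an illustration of how submitted predictions could be scored against the withheld gold standard, the sketch below computes AUROC with scikit-learn. The file names are assumptions, and AUROC is used here only as an example, since the final evaluation metrics (e.g., accuracy and recall, as noted in the milestones) are still being decided.

```python
# Sketch of scoring a prediction file against the gold-standard benchmark.
# File names and the choice of AUROC are assumptions for illustration only;
# the final challenge metrics are still being decided.
import pandas as pd
from sklearn.metrics import roc_auc_score

predictions = pd.read_csv("predictions.csv")   # columns: person_id, score
gold = pd.read_csv("gold_standard.csv")        # columns: person_id, deceased (0/1)

merged = gold.merge(predictions, on="person_id", how="left")
merged["score"] = merged["score"].fillna(0.0)  # unscored patients default to 0

print("AUROC:", roc_auc_score(merged["deceased"], merged["score"]))
```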
Figure 1. The Open Phase will feature a synthetic training and test set to test the pipeline and participant models. The Leaderboard Phase will feature model submissions being evaluated against the UW OMOP repository. The gold-standard benchmark set will be withheld from the Docker models and used to evaluate model performance. Model performance metrics will be returned to the participants via Synapse.
Lead(s) (email) | Site |
---|---|
Tim Bergquist | UW |
Sean Mooney | UW |
Justin Guinney | Sage Bionetworks |
Thomas Schaffter | Sage Bionetworks |
Project Team Members
See Team README
Due Date | Milestone | Status |
---|---|---|
Feb 4 | Complete the aggregation and quality assessment of the UW cohort that will be used in the study. | Done |
Feb 27 | Conduct an internal evaluation by applying previously developed models to the UW cohort. | Done |
March 6 | Survey the CTSAs to find which sites have mortality and 30-day readmission prediction models and would be willing to participate. | Ongoing |
March 20 | Build the Synapse pilot challenge site with instructions for participating in the challenge. | Ongoing |
April | Build the infrastructure for facilitating the DREAM challenge, using Docker, Synapse, and UW servers. | Ongoing |
June | Phase 1: Open a submission window during which the sites identified in the CTSA survey submit their models to predict on UW patients. This will not be a prospective evaluation. | Not Started |
Summer | Phase 2: Prospectively evaluate model performance, comparing accuracy and recall across models. | Not Started |
Jan 2020 | Make scripts and documentation available for the CTSAs. | Ongoing |
The project Google drive folder is accessible to onboarded participants.
The project slack room is accessible to onboarded participants.