synthetichealth / synthea

Synthetic Patient Population Simulator
https://synthetichealth.github.io/synthea
Apache License 2.0

Synthea to OMOP output #209

Closed mgkahn closed 7 years ago

mgkahn commented 7 years ago

Following an email exchange with Jason, I am setting up this thread to announce that Shahab Helmi, a PhD student in Computer Science at the University of Colorado Denver, will be doing a summer project with me to create Synthea to OMOP CDM V5 output (see http://ohdsi.org). Due to time restrictions, Shahab plans to build Version 1.0 as an external Java program that picks up one of the existing Synthea output formats and generates an OMOP-compliant database. A future student who knows Ruby will be tasked with reimplementing this work and adding it to the native Synthea export framework.

We will use this Issues thread for both updates and questions but I'll start with my first question here:

Thanks,
Michael Kahn
University of Colorado Anschutz Medical Campus
Michael.Kahn@ucdenver.edu

PS: In case it wasn't clear, all of Shahab's code will be posted in this git repo so that it is available to the community.

jawalonoski commented 7 years ago

Our FHIR output (either the STU3 or DSTU2 versions) is probably the best option right now to work with. These are the "richest" output formats that we support.

mgkahn commented 7 years ago

Shahab is wrapping up his Synthea FHIR to OMOP work, and I have asked him to start packaging and documenting it for submission to this repo. The work is fairly vanilla Java code that reads a directory of Synthea FHIR files and creates and populates a Postgres database that conforms to OMOP CDM V5. One area that is not done, due to a very recent transition in OMOP terminologies, is immunizations/vaccines, which will be left for a future student to add. I will also engage future students to expand this work as new diseases, interventions, and outcomes are added to the Synthea library and expand the set of terms used in patient profiles.
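For readers unfamiliar with this kind of translation, here is a minimal sketch of one mapping step: turning the gender and birth date of a FHIR Patient into a few columns of the OMOP CDM V5 person table. It is not Shahab's code. The class and method names (PersonMapper, PersonRow, toPersonRow) are hypothetical, it assumes the values have already been extracted from the FHIR JSON, and it assumes birthDate is a full YYYY-MM-DD date. The concept IDs 8507 (MALE) and 8532 (FEMALE) are the standard OMOP gender concepts.

```java
import java.time.LocalDate;
import java.util.Locale;

public class PersonMapper {
    // Standard OMOP gender concept IDs; 0 is the OMOP convention for "no matching concept".
    static final int GENDER_MALE = 8507;
    static final int GENDER_FEMALE = 8532;
    static final int NO_MATCHING_CONCEPT = 0;

    /** Hypothetical holder for a subset of the OMOP person table columns. */
    static final class PersonRow {
        final String personSourceValue; // original FHIR Patient id, kept for traceability
        final int genderConceptId;
        final int yearOfBirth;
        final int monthOfBirth;
        final int dayOfBirth;

        PersonRow(String personSourceValue, int genderConceptId, LocalDate birth) {
            this.personSourceValue = personSourceValue;
            this.genderConceptId = genderConceptId;
            this.yearOfBirth = birth.getYear();
            this.monthOfBirth = birth.getMonthValue();
            this.dayOfBirth = birth.getDayOfMonth();
        }
    }

    /** Maps a FHIR administrative gender code to an OMOP gender concept ID. */
    static int genderConceptId(String fhirGender) {
        if (fhirGender == null) {
            return NO_MATCHING_CONCEPT;
        }
        switch (fhirGender.toLowerCase(Locale.ROOT)) {
            case "male":
                return GENDER_MALE;
            case "female":
                return GENDER_FEMALE;
            default:
                // FHIR also allows "other" and "unknown"; neither has a standard OMOP gender concept.
                return NO_MATCHING_CONCEPT;
        }
    }

    /** Builds a person row from values already extracted from a FHIR Patient resource. */
    static PersonRow toPersonRow(String fhirPatientId, String fhirGender, String fhirBirthDate) {
        // Assumption: birthDate is a full ISO date (YYYY-MM-DD); partial dates would need extra handling.
        LocalDate birth = LocalDate.parse(fhirBirthDate);
        return new PersonRow(fhirPatientId, genderConceptId(fhirGender), birth);
    }

    public static void main(String[] args) {
        PersonRow row = toPersonRow("patient-1", "female", "1975-03-14");
        System.out.println(row.personSourceValue + " -> gender_concept_id=" + row.genderConceptId
                + ", year_of_birth=" + row.yearOfBirth);
    }
}
```

A real translator would also need to parse the FHIR bundles, map conditions, medications, and encounters onto OMOP vocabularies, and write the rows to the database. This sketch shows only the shape of a single field-level mapping.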

I am currently seeking advice on how the community wishes Shahab's work to be made available in this repo.

jawalonoski commented 7 years ago

Good question. Given this is java code and relatively modular we could

@mgkahn @dehall what are your thoughts?

dehall commented 7 years ago

On one hand, since it's Java, I think it would be appropriate to integrate it into synthea_java, but on the other hand, given the uncertainty of that project, maybe not. I would hate to see a student's work lost if we decided to rearchitect and rewrite synthea_java.

For now my feeling is that a separate repo might be best, since it's a standalone utility. Eventually it will be integrated directly into the code and be part of synthea proper (like we did with the synthea_graphviz repo).

mgkahn commented 7 years ago

If I follow @jawalonoski's post, the synthea_java project is a re-implementation of the current Ruby-based simulator in Java. That is very different from a Synthea FHIR --> OMOP translator that happens to be written in Java. I like @jawalonoski's second suggestion, which seems to be endorsed by @dehall: creating a synthea_OMOP repo that holds just this utility. It feels cleaner to me and allows later students' contributions to go into the code line without any impact on other parts of the project.

jawalonoski commented 7 years ago

OK. I'll create the project then and give you access to it.

jawalonoski commented 7 years ago

@mgkahn I've sent you an invite to collaborate on https://github.com/synthetichealth/synthea_omop. At this point, you can add your students to the team.

mgkahn commented 7 years ago

Done. Thanks. Should I mark this thread as closed or leave it open for others to find the OMOP repo?

jawalonoski commented 7 years ago

I think we can close it now. You or your student should add an "OMOP" page to our main Wiki https://github.com/synthetichealth/synthea/wiki that describes and links to the project. This will help people find it.

jawalonoski commented 5 years ago

Solved by an external team: https://github.com/OHDSI/ETL-Synthea 👍