mgkahn closed this issue 7 years ago
Our FHIR output (either the STU3 or DSTU2 versions) is probably the best option right now to work with. These are the "richest" output formats that we support.
Shahab is wrapping up his Synthea FHIR to OMOP work, and I have asked him to start packaging and documenting it for submission to this repo. The work is pretty vanilla Java code that reads a directory of Synthea FHIR files and creates and fills a Postgres database that conforms to OMOP CDM V5. One area that is not done, due to a very recent transition in OMOP terminologies, is immunizations/vaccines; that will be left for a future student to add. I will also engage future students to expand this work as new diseases, interventions, and outcomes are added to the Synthea library, expanding the terms used in patient profiles.
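For readers wondering what one step of such a translation looks like, here is a minimal sketch. It is not Shahab's code. It assumes the gender and birth date have already been extracted from a FHIR Patient resource and maps them onto a subset of the OMOP CDM V5 person table columns. The class, method, and field names (`FhirToOmopSketch`, `toPersonRow`, `PersonRow`) are hypothetical. The gender concept IDs 8507 and 8532 are the standard OMOP concepts for male and female. JSON parsing, the Postgres writes, and required columns such as race and ethnicity are left out.

```java
import java.time.LocalDate;
import java.util.Locale;

// Hypothetical sketch only: these names are illustrative and do not come
// from the actual Synthea FHIR to OMOP translator.
final class FhirToOmopSketch {

    // Standard OMOP gender concepts; 0 is the OMOP convention for "no matching concept".
    static final int MALE_CONCEPT_ID = 8507;
    static final int FEMALE_CONCEPT_ID = 8532;
    static final int NO_MATCHING_CONCEPT = 0;

    /** A subset of the OMOP CDM V5 person table columns. */
    static final class PersonRow {
        final long personId;
        final int genderConceptId;
        final int yearOfBirth;
        final int monthOfBirth;
        final int dayOfBirth;
        final String personSourceValue;
        final String genderSourceValue;

        PersonRow(long personId, int genderConceptId, int yearOfBirth, int monthOfBirth,
                  int dayOfBirth, String personSourceValue, String genderSourceValue) {
            this.personId = personId;
            this.genderConceptId = genderConceptId;
            this.yearOfBirth = yearOfBirth;
            this.monthOfBirth = monthOfBirth;
            this.dayOfBirth = dayOfBirth;
            this.personSourceValue = personSourceValue;
            this.genderSourceValue = genderSourceValue;
        }
    }

    /** Maps a FHIR administrative gender code to an OMOP gender concept ID. */
    static int genderConceptId(String fhirGender) {
        if (fhirGender == null) {
            return NO_MATCHING_CONCEPT;
        }
        switch (fhirGender.toLowerCase(Locale.ROOT)) {
            case "male":
                return MALE_CONCEPT_ID;
            case "female":
                return FEMALE_CONCEPT_ID;
            default:
                // FHIR also allows "other" and "unknown"; neither has a gender concept here.
                return NO_MATCHING_CONCEPT;
        }
    }

    /**
     * Builds a person row from fields already extracted from a FHIR Patient.
     * Assumes birthDate is a full ISO date (yyyy-MM-dd); FHIR also permits
     * partial dates, which this sketch does not handle.
     */
    static PersonRow toPersonRow(long personId, String fhirId, String fhirGender, String fhirBirthDate) {
        LocalDate birth = LocalDate.parse(fhirBirthDate);
        return new PersonRow(
                personId,
                genderConceptId(fhirGender),
                birth.getYear(),
                birth.getMonthValue(),
                birth.getDayOfMonth(),
                fhirId,       // keep the source identifier for traceability
                fhirGender);  // keep the raw source value alongside the concept ID
    }

    public static void main(String[] args) {
        PersonRow row = toPersonRow(1L, "example-patient-id", "female", "1975-04-09");
        System.out.println(row.personId + " " + row.genderConceptId + " " + row.yearOfBirth);
    }
}
```

A real translator would repeat this kind of mapping for conditions, medications, procedures, and so on, looking up each source code in the OMOP vocabulary tables before writing rows to the database.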
I am currently seeking advice on how the community would like Shahab's work to be made available in this repo.
Good question. Given that this is Java code and relatively modular, we could:
- create a separate repo, synthetichealth/omop_import (flexible on the name)
- add it to the synthetichealth/synthea_java project (a pre-alpha Java engine that is not yet mature)
- add a utilities folder within Synthea that contains the Java project in a subfolder

@mgkahn @dehall what are your thoughts?
On one hand I think since it's Java it would be appropriate to integrate into synthea_java, but on the other hand given the uncertainty of that project maybe not. I would hate to see a student's work lost if we decided to rearchitect and rewrite synthea_java.
For now my feeling is that a separate repo might be best, since it's a standalone utility. Eventually it will be integrated directly into the code and be part of Synthea proper (like we did with the synthea_graphviz repo).
If I follow @jawalonoski's post, the synthea_java project is a re-implementation of the current Ruby-based simulator in Java. That is very different from a Synthea FHIR --> OMOP translator that happens to be written in Java. I like @jawalonoski's second suggestion, which seems to be endorsed by @dehall: creating a synthea_OMOP repo that holds just this utility. It feels cleaner to me and allows later students' contributions to go into the code line without any impact on other parts of the project.
OK. I'll create the project then and give you access to it.
@mgkahn I've sent you an invite to collaborate on https://github.com/synthetichealth/synthea_omop. At this point, you can add your students to the team.
Done. Thanks. Should I mark this thread as closed or leave it open for others to find the OMOP repo?
I think we can close it now. You or your student should add an "OMOP" page to our main Wiki https://github.com/synthetichealth/synthea/wiki that describes and links to the project. This will help people find it.
Solved by an external team: https://github.com/OHDSI/ETL-Synthea 👍
Following an email exchange with Jason, I am setting up this thread to announce that Shahab Helmi, a PhD student in Computer Science at the University of Colorado Denver, will be doing a summer project with me to create Synthea to OMOP CDM V5 output (see http://ohdsi.org). Due to time restrictions, Shahab plans to do Version 1.0 as an external Java program that picks up one of the existing Synthea output formats and generates an OMOP-compliant database. A future student who knows Ruby will be tasked with reimplementing this work and adding it to the native Synthea export framework.
We will use this Issues thread for both updates and questions but I'll start with my first question here:
Thanks, Michael Kahn University of Colorado Anschutz Medical Campus Michael.Kahn@ucdenver.edu
PS: In case it wasn't clear, all of Shahab's code will be posted in this git repo so that it is available to the community.