MIT-LCP / mimic-omop

Mapping the MIMIC-III database to the OMOP schema
MIT License

Clarification on the "standard concepts from Athena" #45

Open smondet opened 6 years ago

smondet commented 6 years ago

The doc says:

The standard concepts from Athena have been downloaded and are available somewhere (including running the extra script to download CPT code definitions)

What is meant by "standard"? The pre-selected ones on the webapp? Or is there a list somewhere?

alistairewj commented 6 years ago

Yeah, this is not very well documented, apologies. I am pretty sure the default concepts that are checked for download on the website were the only ones used. I successfully ran the ETL with the following files in my vocab folder:

-rw-rw-r-- 1 alistairewj alistairewj  475 Jun 15  2016 readme.txt
drwxrwxr-x 2 alistairewj alistairewj 4.0K Jun 15  2016 lib
-rw-rw-r-- 1 alistairewj alistairewj   32 Jun 15  2016 cpt.sh
-rw-rw-r-- 1 alistairewj alistairewj   31 Jun 15  2016 cpt.bat
-rw-rw-r-- 1 alistairewj alistairewj 7.2M Apr  7  2017 cpt4.jar
-rw-rw-r-- 1 alistairewj alistairewj 1.3G Sep 14  2017 vocab_download_v5_{9DBE59FA-D92B-0BB8-77A8-198AB3FE4736}.zip
-rw-r--r-- 1 alistairewj alistairewj 118M Sep 14  2017 DRUG_STRENGTH.csv
-rw-r--r-- 1 alistairewj alistairewj 896K Sep 14  2017 CONCEPT_CPT4.csv
-rw-r--r-- 1 alistairewj alistairewj 1.3G Sep 14  2017 CONCEPT_RELATIONSHIP.csv
-rw-r--r-- 1 alistairewj alistairewj 2.3G Sep 14  2017 CONCEPT_ANCESTOR.csv
-rw-r--r-- 1 alistairewj alistairewj 4.8K Sep 14  2017 VOCABULARY.csv
-rw-r--r-- 1 alistairewj alistairewj  30K Sep 14  2017 RELATIONSHIP.csv
-rw-r--r-- 1 alistairewj alistairewj 1.2K Sep 14  2017 DOMAIN.csv
-rw-r--r-- 1 alistairewj alistairewj 373M Sep 14  2017 CONCEPT_SYNONYM.csv
-rw-r--r-- 1 alistairewj alistairewj  13K Sep 14  2017 CONCEPT_CLASS.csv
-rw-r--r-- 1 alistairewj alistairewj 493M Feb 16 09:35 CONCEPT.csv

If you can figure this out, please do let us know; it would save us a bit of effort in re-downloading and testing.
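For anyone comparing their own download against the listing above, here is a minimal sketch that reports which of those files are absent from a vocab folder. The file names are copied from the listing; treating every one of them as required by the ETL is an assumption, and `missing_vocab_files` is a hypothetical helper, not part of the mimic-omop codebase.

```python
from pathlib import Path

# File names copied from the directory listing in this comment.
# Assumption: all of them are needed; the ETL may in fact use only a subset.
EXPECTED_VOCAB_FILES = [
    "CONCEPT.csv",
    "CONCEPT_ANCESTOR.csv",
    "CONCEPT_CLASS.csv",
    "CONCEPT_CPT4.csv",
    "CONCEPT_RELATIONSHIP.csv",
    "CONCEPT_SYNONYM.csv",
    "DOMAIN.csv",
    "DRUG_STRENGTH.csv",
    "RELATIONSHIP.csv",
    "VOCABULARY.csv",
]


def missing_vocab_files(vocab_dir, expected=EXPECTED_VOCAB_FILES):
    """Return the expected file names that are not regular files in vocab_dir."""
    folder = Path(vocab_dir)
    return [name for name in expected if not (folder / name).is_file()]
```

Calling `missing_vocab_files("path/to/vocab")` returns an empty list when every listed file is present. A missing `CONCEPT_CPT4.csv` would suggest the extra CPT step has not been run yet.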

parisni commented 6 years ago

Guys, easy to remember: when designing the ETL we took all the available concepts from Athena that didn't need any proprietary licensing.
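One way to see which vocabularies an existing download actually contains is to count concepts per vocabulary in `CONCEPT.csv`. The sketch below assumes the file is tab-delimited despite its `.csv` extension and has a header column named `vocabulary_id`; both are assumptions about the Athena export format, and `concepts_per_vocabulary` is a hypothetical helper.

```python
import csv
from collections import Counter


def concepts_per_vocabulary(concept_csv, delimiter="\t"):
    """Count rows per vocabulary_id in an Athena CONCEPT.csv file.

    Assumes a header row containing a 'vocabulary_id' column
    (matched case-insensitively) and unquoted, delimiter-separated fields.
    """
    counts = Counter()
    with open(concept_csv, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
        header = [name.strip().lower() for name in next(reader)]
        column = header.index("vocabulary_id")
        for row in reader:
            # Skip short or blank rows rather than failing on them.
            if len(row) > column:
                counts[row[column]] += 1
    return counts
```

The keys of the returned counter can then be compared against the vocabulary selection shown on the Athena download page.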

parisni commented 3 years ago

This is an OMOP thing. Because of licensing, Athena used to ship a JAR for the CPT4 codes (this was years ago; no idea if this still applies).

stevenbedrick commented 3 years ago

Yeah, I figured it out the second after I added the comment! 🤣 In the Athena vocabulary distribution directory there's a JAR file and a shell script that pull down CPT4 from UMLS. My confusion was due to the wording in omop/build-omop/postgresql/README.md, which I found a bit unclear about where to find the Java file it mentions: whether it is part of the MIMIC/OMOP ETL codebase or part of the Athena distribution. The answer, of course, is that it's part of the Athena distribution.

I'd be happy to take a stab at clarifying that part of the instructions, if a PR would be welcome!