Closed chrisroederucdenver closed 3 weeks ago
CONCEPT_RELATIONSHIP is in the private repo as a gzipped file.
git@github.com:chrisroederucdenver/CCDA_OMOP_Private.git This repo uses git-lfs. I don't think it's an issue when cloning for read purposes, but it is for committing.
I wanted to see if Pandas could handle the seemingly large concept and concept_relationship tables, so I wrote some experimental code: https://github.com/chrisroederucdenver/CCDA-tools/blob/main/create_map_to_standard.py
Encounters --> Measurement
Encounters --> Observation
Hospital_Discharge --> Note
Medications --> Drug
Medications --> Observation
Procedures --> Measurement
Procedures --> Procedure
Results --> total [Moles/volume] in Serum or Plasma"
Results --> Measurement
Results --> Observation
Vital_Signs --> Measurement
Vital_Signs --> Observation
FYI, and TODO
The OMOP vocabulary tables are in the CCDA_OMOP_Private repository.
The OID map is in the CCDA-data repository.
The output has a zillion columns. We need concept_id and domain_id for sure.
The section column is useful for considering the domain_id routing issue: which CCDA sections produce data for which OMOP domains.
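One way to look at the routing question is to count rows per (section, domain_id) pair in the mapped output. This is only a sketch: it assumes the output is a DataFrame with columns named `section` and `domain_id`, and the rows below are toy data, not real results.

```python
import pandas as pd


def section_domain_counts(mapped: pd.DataFrame) -> pd.DataFrame:
    """Count rows for each (section, domain_id) pair in the mapped output."""
    return (
        mapped.groupby(["section", "domain_id"])
        .size()
        .reset_index(name="n_rows")
    )


# Toy rows for illustration only; the column names are assumptions.
toy_mapped = pd.DataFrame(
    {
        "section": ["Encounters", "Encounters", "Vital_Signs", "Vital_Signs"],
        "domain_id": ["Measurement", "Observation", "Measurement", "Measurement"],
    }
)
routing = section_domain_counts(toy_mapped)
print(routing)
```

Sections that show up with more than one domain_id are the ones where routing needs a decision.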
TODO: this code so far doesn't deal with dates or the invalid_reason column.
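A possible way to handle that TODO, sketched under assumptions: keep only rows with an empty invalid_reason whose validity window covers a chosen date. The column names follow the OMOP CDM (valid_start_date, valid_end_date, invalid_reason). The YYYYMMDD date format and the `keep_valid` helper are assumptions, and the rows are toy data.

```python
import pandas as pd


def keep_valid(df: pd.DataFrame, as_of: str) -> pd.DataFrame:
    """Keep rows with no invalid_reason that are valid on the as_of date.

    Assumes dates are stored as YYYYMMDD (as strings or integers).
    """
    when = pd.Timestamp(as_of)
    start = pd.to_datetime(df["valid_start_date"].astype(str), format="%Y%m%d")
    end = pd.to_datetime(df["valid_end_date"].astype(str), format="%Y%m%d")
    no_reason = df["invalid_reason"].isna() | (df["invalid_reason"] == "")
    return df[no_reason & (start <= when) & (when <= end)]


# Toy relationship rows for illustration only.
toy_rel = pd.DataFrame(
    {
        "concept_id_1": [1, 2, 3],
        "concept_id_2": [1, 3, 3],
        "valid_start_date": [19700101, 19700101, 19700101],
        "valid_end_date": [20991231, 20200101, 20991231],
        "invalid_reason": [None, None, "D"],
    }
)
valid_rel = keep_valid(toy_rel, "2024-09-25")
print(valid_rel)
```

Whether to filter on a fixed date, on each record's own date, or on invalid_reason alone is a design choice this sketch does not settle.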
Update: Input tables are in Foundry:
The expanded one looks more like what Tanner and Steph have been working with, and it's possible that their created table will look like that.
One question worth considering carefully is whether the result from Tanner and Steph will be a separate table or additions to this one. With separate processes updating input and output, separate tables seem simpler. If the vocab process by Steph and Tanner updates this table, it seems like it would have to be carefully coordinated, with either rewrites or updates to add more input concepts. @tannerzhang @stephanieshong
This is part of @tannerzhang's work in #120
2024-09-25: See the update comment at the bottom for current needs.
We need a table that maps from (OID, concept_code) to the concept_id of the standard concept related to the input concept via 'Maps to'. Restricting the table to concepts found in the input data would make it easier to use from Pandas. An alternative scoping strategy would be to reduce the set of vocabularies, since the CONCEPT.csv over in CCDA_OMOP_Private may have more than we need.
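A minimal Pandas sketch of such a table, not the code in create_map_to_standard.py. It assumes standard OMOP column names for CONCEPT and CONCEPT_RELATIONSHIP, plus an OID map with hypothetical columns `oid` and `vocabulary_id`. The function name, the optional `used_codes` argument, and all rows below are made up for illustration; the concept_ids and codes are toy values.

```python
import pandas as pd


def build_map_to_standard(concept, concept_relationship, oid_map, used_codes=None):
    """Map (oid, concept_code) to the standard concept_id reached via 'Maps to'."""
    maps_to = concept_relationship.loc[
        concept_relationship["relationship_id"] == "Maps to",
        ["concept_id_1", "concept_id_2"],
    ]
    # Source side: every concept, keyed by vocabulary and code.
    source = concept[["concept_id", "vocabulary_id", "concept_code"]].rename(
        columns={"concept_id": "concept_id_1"}
    )
    # Target side: standard concepts only, carrying the domain for routing.
    target = concept.loc[
        concept["standard_concept"] == "S", ["concept_id", "domain_id"]
    ].rename(columns={"concept_id": "concept_id_2"})
    merged = (
        oid_map.merge(source, on="vocabulary_id")
        .merge(maps_to, on="concept_id_1")
        .merge(target, on="concept_id_2")
    )
    if used_codes is not None:
        # Optional scoping: keep only (oid, concept_code) pairs seen in the input data.
        wanted = used_codes[["oid", "concept_code"]].drop_duplicates()
        merged = merged.merge(wanted, on=["oid", "concept_code"])
    out = merged[["oid", "concept_code", "concept_id_2", "domain_id"]]
    return out.rename(columns={"concept_id_2": "concept_id"}).reset_index(drop=True)


# Toy tables for illustration only; ids and codes are not real vocabulary content.
toy_concept = pd.DataFrame(
    {
        "concept_id": [1, 2, 3],
        "vocabulary_id": ["LOINC", "SNOMED", "SNOMED"],
        "concept_code": ["A-1", "B-2", "B-3"],
        "standard_concept": ["S", None, "S"],
        "domain_id": ["Measurement", "Observation", "Condition"],
    }
)
toy_relationship = pd.DataFrame(
    {
        "concept_id_1": [1, 2, 2],
        "concept_id_2": [1, 3, 3],
        "relationship_id": ["Maps to", "Maps to", "Is a"],
    }
)
toy_oid_map = pd.DataFrame(
    {
        "oid": ["2.16.840.1.113883.6.1", "2.16.840.1.113883.6.96"],
        "vocabulary_id": ["LOINC", "SNOMED"],
    }
)
toy_used = pd.DataFrame({"oid": ["2.16.840.1.113883.6.96"], "concept_code": ["B-2"]})

full_map = build_map_to_standard(toy_concept, toy_relationship, toy_oid_map)
scoped_map = build_map_to_standard(toy_concept, toy_relationship, toy_oid_map, toy_used)
print(full_map)
print(scoped_map)
```

Reducing the vocabularies instead would amount to filtering `concept` on vocabulary_id before the merges; the inner join against the OID map already has that effect for vocabularies with no OID entry.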
Resources:
Depending on data size and machine capacity, this might be a challenge for Pandas.
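If memory is the problem, one option is to read CONCEPT_RELATIONSHIP in chunks, load only the columns needed, and keep just the 'Maps to' rows from each chunk. The sketch below assumes a tab-delimited file with standard OMOP column names. `load_maps_to` is a hypothetical helper, and the in-memory text stands in for the real file.

```python
import io

import pandas as pd


def load_maps_to(path_or_buffer, chunksize=1_000_000):
    """Read a tab-delimited CONCEPT_RELATIONSHIP file in chunks, keeping 'Maps to' rows."""
    chunks = pd.read_csv(
        path_or_buffer,
        sep="\t",
        usecols=["concept_id_1", "concept_id_2", "relationship_id"],
        chunksize=chunksize,
    )
    kept = [
        chunk.loc[
            chunk["relationship_id"] == "Maps to", ["concept_id_1", "concept_id_2"]
        ]
        for chunk in chunks
    ]
    return pd.concat(kept, ignore_index=True)


# Toy file contents for illustration only.
toy_text = (
    "concept_id_1\tconcept_id_2\trelationship_id\tvalid_start_date\tvalid_end_date\tinvalid_reason\n"
    "1\t1\tMaps to\t19700101\t20991231\t\n"
    "2\t3\tMaps to\t19700101\t20991231\t\n"
    "3\t2\tMapped from\t19700101\t20991231\t\n"
)
maps_to = load_maps_to(io.StringIO(toy_text), chunksize=2)
print(maps_to)
```

For a gzipped file on disk, `pd.read_csv` infers the compression from a `.gz` extension, so passing the path directly should work the same way.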