ResearchSoftwareInstitute / datatrans


doc for sparse format? #5

Open krobasky opened 6 years ago

krobasky commented 6 years ago

How does one extract data for the various columns in the load_df callback?

For example, r['patient_num'] is obvious, but how do we extract a value for 'LOINC:711-2' using "loinc_instance_num" and loinc_meta.csv?

Same questions for: mdctn_meta.csv, icd_meta.csv?

Can you offer some examples? Thanks!

xu-hao commented 6 years ago

The loinc_instance_num of LOINC:711-2 in the long-format table is loinc_instance_num_LOINC:711-2 in the wide-format table. So basically, the wide-format column name is <col>_<code>.

Same for mdctn.
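As a rough illustration of the naming rule, here is a minimal Python sketch. It assumes the row `r` handed to the load_df callback behaves like a dict keyed by wide-format column names; the helper `wide_column` and the sample values are hypothetical and not part of datatrans.

```python
def wide_column(col, code):
    """Build a wide-format column name following the <col>_<code> pattern."""
    return f"{col}_{code}"


# Hypothetical row, shaped like the dict `r` seen in the load_df callback.
# The values are invented for illustration only.
r = {
    "patient_num": "1",
    "loinc_instance_num_LOINC:711-2": "3",
}

key = wide_column("loinc_instance_num", "LOINC:711-2")
value = r.get(key)  # .get returns None if the column is missing from the row

missing = r.get(wide_column("loinc_instance_num", "LOINC:0000-0"))
```

The same helper should work for any other long-format column listed in loinc_meta.csv or mdctn_meta.csv, as long as the column name from the meta file is passed as `col`.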

For icd, the icd_code_<code> column is either t or absent. t indicates that the code exists in the long-format table; an absent column indicates that it does not exist there. An absent column translates into None in the dict.
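A presence check for an ICD code could then look like the sketch below. It assumes the flag is stored as the string "t" and that the row is a plain dict; the helper `has_icd` and the example codes are hypothetical.

```python
def has_icd(row, code):
    """Return True when the icd_code_<code> column is flagged 't' in the row."""
    # A missing column or a None value both mean the code is not in the
    # long-format table for this patient.
    return row.get(f"icd_code_{code}") == "t"


# Hypothetical row; the codes below are placeholders, not real data.
r = {
    "patient_num": "1",
    "icd_code_EXAMPLE-1": "t",
    "icd_code_EXAMPLE-2": None,
}

present = has_icd(r, "EXAMPLE-1")
absent_none = has_icd(r, "EXAMPLE-2")
absent_missing = has_icd(r, "EXAMPLE-3")
```

Using `row.get` rather than `row[...]` avoids a KeyError when the column is not in the row at all.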