som-shahlab / ehr_ml

Code for doing machine learning with various EHRs
MIT License

Fetch description string for each terminology/code token #24

Open jason-fries opened 3 years ago


Issue

Currently, timelines are defined using a vocabulary that is the union of several source terminologies. The resulting code tokens are opaque, which makes debugging and manual inspection difficult. For future research, we'd also like to project codes into latent spaces based on language rather than terminologies.

Proposed Change

Add functionality that takes the UMLS or other ontology sources and provides a mapping from each code to its string description (code -> description).
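
As a rough sketch of what this could look like (not an existing ehr_ml API), the snippet below builds a code -> description dictionary from rows of the UMLS `MRCONSO.RRF` file, which is pipe-delimited. The function name `build_code_descriptions` and the `SAB/CODE` key format are assumptions made for illustration; the actual token format used in the timeline vocabulary may differ and the key would need to be adjusted to match it.

```python
from typing import Dict, Iterable, Set

# Column positions in UMLS MRCONSO.RRF (pipe-delimited, one atom per row).
LAT, TS, ISPREF, SAB, CODE, STR = 1, 2, 6, 11, 13, 14


def build_code_descriptions(rows: Iterable[str], language: str = "ENG") -> Dict[str, str]:
    """Map "SAB/CODE" (hypothetical key format) to one description string.

    A row flagged as preferred (TS == "P" and ISPREF == "Y") wins over a
    non-preferred row; otherwise the first description seen is kept.
    """
    descriptions: Dict[str, str] = {}
    has_preferred: Set[str] = set()
    for row in rows:
        fields = row.rstrip("\n").split("|")
        # Skip malformed rows and rows in other languages.
        if len(fields) <= STR or fields[LAT] != language:
            continue
        key = f"{fields[SAB]}/{fields[CODE]}"
        is_preferred = fields[TS] == "P" and fields[ISPREF] == "Y"
        if key not in descriptions or (is_preferred and key not in has_preferred):
            descriptions[key] = fields[STR]
            if is_preferred:
                has_preferred.add(key)
    return descriptions
```

Because the function accepts any iterable of lines, it can be fed an open file handle directly, e.g. `with open("MRCONSO.RRF", encoding="utf-8") as f: descriptions = build_code_descriptions(f)` (the path is a placeholder). Codes that come from terminologies outside the UMLS would need a separate source merged into the same dictionary.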