Open mlbernauer opened 3 weeks ago
In total there appear to be 5732 distinct non-null NDCs in the hosp.prescriptions
table, I was able to map each NDC to at least one of the class vocabs, however some vocabs have more coverage than others. Here's a breakdown of how the NDCs from the prescription table map across the different vocabularies available within RxClass
vocab | mapped_ndcs | total_ndcs | fraction_mapped |
---|---|---|---|
MEDRT | 5152 | 5732 | 0.898813677599442 |
ATCPROD | 5119 | 5732 | 0.893056524773203 |
VA | 5061 | 5732 | 0.882937892533147 |
ATC | 4902 | 5732 | 0.85519888346127 |
SNOMEDCT | 4838 | 5732 | 0.844033496161898 |
FDASPL | 4353 | 5732 | 0.759420795533845 |
DAILYMED | 4331 | 5732 | 0.755582693649686 |
FMTSME | 1164 | 5732 | 0.203070481507327 |
NA | 579 | 5732 | 0.101011863224006 |
RXNORM | 394 | 5732 | 0.0687369155617586 |
As an example, one can use the ATC classification to query all NDCs for Antiinfectives for Systemic Use which has a top-level ATC classification code of J
select * from [mappings.drug_class] where relation_source like 'ATC%' and class_id like 'J%';
which should return 2109 results (785 distinct NDCs) across both the ATC and ATCPROD vocabularies.
One limitation of this mapping file is that it only contains mappings between drugs and their immediate parent class. In other words, if a drug maps to class A
and class A
maps to class B
there will no be relation between the drug and class B
. However, this limitation can be circumvented when using the ATC vocabs since their class_id
codes are "semantic identifiers" i.e. they contain information about the class hierarchy which can be exploited as in the example above
This PR adds a mapping file (
drug_mappings.csv
) which allows drugs inprescriptions
table to be organized into different classes (e.g. therapeutic, pharmacologic, mechanism of action, pharmacokinetic, among others).