cancerDHC / umls-rrf-scala

A very basic library for parsing files in the UMLS RRF format
MIT License

Carry out mapping using the NCI Metathesaurus #1

Closed gaurav closed 4 years ago

gaurav commented 4 years ago

This PR adds a simple term mapper built around the NCI Metathesaurus. It can produce the entire known mapping or map only a list of input terms. Most RRF files are loaded directly into memory, but two critical files (MRCONSO and MRHIER) are particularly large, so we load those into a local SQLite3 database and query them from there. It also supports matching via the parent term, i.e. when term X in source A has no mapping in source B but X's parent term does.
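For readers unfamiliar with the parent-term fallback, here is a minimal, self-contained sketch of the idea. It is not the code in this PR: the `Atom` and `Mapping` types, the `mapTerm` function, the source names `SRC_A`/`SRC_B`, and the toy data are all hypothetical, and plain in-memory collections stand in for the MRCONSO and MRHIER tables that the PR queries through SQLite3.

```scala
object ParentFallbackMapper {
  // Simplified stand-in for an MRCONSO row: one concept (CUI) as named in one source (SAB).
  final case class Atom(cui: String, sab: String, code: String, label: String)

  // One output row; viaParent records whether the match came from an ancestor term.
  final case class Mapping(fromCode: String, toCode: String, cui: String, viaParent: Boolean)

  // Hypothetical toy data, not taken from the real Metathesaurus.
  val atoms: Seq[Atom] = Seq(
    Atom("C0000001", "SRC_A", "A1", "Neoplasm"),
    Atom("C0000001", "SRC_B", "B1", "Neoplasm"),
    Atom("C0000002", "SRC_A", "A2", "Rare neoplasm subtype")
  )

  // Stand-in for MRHIER: (source, child code) -> parent code.
  val parents: Map[(String, String), String] = Map(("SRC_A", "A2") -> "A1")

  // Codes in `to` that share a CUI with `code` in `from`, paired with that CUI.
  def directMappings(code: String, from: String, to: String): Seq[(String, String)] =
    for {
      a <- atoms if a.sab == from && a.code == code
      b <- atoms if b.sab == to && b.cui == a.cui
    } yield (b.code, a.cui)

  // Try a direct mapping first; if there is none, walk up the hierarchy in `from`.
  def mapTerm(code: String, from: String, to: String): Seq[Mapping] = {
    @annotation.tailrec
    def loop(current: String, visited: Set[String]): Seq[Mapping] = {
      val direct = directMappings(current, from, to)
      if (direct.nonEmpty)
        direct.map { case (toCode, cui) =>
          Mapping(code, toCode, cui, viaParent = current != code)
        }
      else
        parents.get((from, current)) match {
          // The visited set guards against cycles in the hierarchy data.
          case Some(parent) if !visited.contains(parent) => loop(parent, visited + parent)
          case _ => Seq.empty
        }
    }
    loop(code, Set(code))
  }

  def main(args: Array[String]): Unit =
    Seq("A1", "A2").flatMap(mapTerm(_, "SRC_A", "SRC_B")).foreach { m =>
      println(Seq(m.fromCode, m.toCode, m.cui, m.viaParent.toString).mkString(","))
    }
}
```

In this sketch `A1` maps directly to `B1`, while `A2` has no counterpart of its own in `SRC_B` and so falls back to its parent `A1`, with `viaParent` set so that the less precise match can be told apart in the output.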

gaurav commented 4 years ago

Sorry for the changes after requesting your review, Jim: I fixed a couple of minor issues and added CUIs to the CSV output. This PR should be stable enough for you to review now!