cancerDHC / umls-rrf-scala

A very basic library for parsing files in the UMLS RRF format
MIT License

Add support for NCBO BioPortal mappings #15

Open gaurav opened 3 years ago

gaurav commented 3 years ago

NCBO BioPortal has used lexical searches to map SNOMED terms to other ontologies, which we can access via an API (e.g. http://data.bioontology.org/ontologies/SNOMEDCT/classes/http%3A%2F%2Fpurl.bioontology.org%2Fontology%2FSNOMEDCT%2F128965002/mappings).

The missing step: UMLS-RRF-Scala currently emits SNOMED IDs as CURIEs (e.g. SNOMEDCT_US:128965002). We will need to convert these into URIs so that we can use them in BioPortal queries. As far as I can tell, NCBO BioPortal mappings are of two kinds: "CUI" (i.e. extracted from the NCI Metathesaurus) and "LOOM" (NCBO lexical matching).
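A minimal sketch of what the CURIE-to-URI conversion might look like. The `BioPortalUrls` object and its method names are hypothetical and not part of UMLS-RRF-Scala; the two URL prefixes are simply read off the example URL above, so they are assumptions about how BioPortal names SNOMED classes rather than something confirmed against the API documentation.

```scala
import java.net.URLEncoder
import java.nio.charset.StandardCharsets

// Hypothetical helper; not an existing part of UMLS-RRF-Scala.
object BioPortalUrls {
  // Both prefixes are assumptions taken from the example URL in this issue.
  val SnomedClassPrefix = "http://purl.bioontology.org/ontology/SNOMEDCT/"
  val MappingsBase = "http://data.bioontology.org/ontologies/SNOMEDCT/classes/"

  // Convert a CURIE such as "SNOMEDCT_US:128965002" into a BioPortal class URI.
  // Returns None for any other prefix or for a non-numeric local ID.
  def curieToUri(curie: String): Option[String] = curie.split(":", 2) match {
    case Array("SNOMEDCT_US", id) if id.nonEmpty && id.forall(_.isDigit) =>
      Some(SnomedClassPrefix + id)
    case _ => None
  }

  // Build the mappings endpoint URL by percent-encoding the class URI
  // and placing it in the path, as in the example URL.
  def mappingsUrl(curie: String): Option[String] =
    curieToUri(curie).map { uri =>
      MappingsBase + URLEncoder.encode(uri, StandardCharsets.UTF_8.name()) + "/mappings"
    }
}
```

With this, `BioPortalUrls.mappingsUrl("SNOMEDCT_US:128965002")` should reproduce the example URL above, and unrecognised CURIEs come back as `None` instead of producing a malformed query.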

The mapping API doesn't appear to return information on the closeness of the match (skos:relatedMatch vs skos:closeMatch vs skos:exactMatch), even though the documentation claims that this is available. So that's a bit confusing.