There are a bunch of entries in UMLS, such as UMLS:C1847200 "Alzheimer Disease 4", that are noted as having a broader concept (here, UMLS:C0002395 "Alzheimer's Disease"). Yaphet has been running into issues where MedMentions uses the more specific ID while NodeNorm can only normalize the broader ID. So it would be useful if NodeNorm had some way to connect these IDs, using either the direct broader/narrower relationships from UMLS, all broader/narrower relationships from UMLS, or some sort of threshold.
Some options, arranged from easiest to hardest:
Leave it as-is, and let downstream users use MRREL from UMLS to figure out these broader/narrower relationships.
Make the leftover UMLS generator much more sophisticated, so that for every UMLS ID it tries to add, it first walks up the hierarchy and tries to find a broader UMLS ID that has already been normalized as part of Babel. If it finds one, it adds the more specific ID to that existing clique, perhaps with a flag to indicate that this should be treated as an imperfect match or something.
Downside: we've currently implemented the leftover UMLS output as dependent on the compendia files, so we would need to reprocess those files after they've already been generated, which would be at a minimum inelegant and probably also quite hairy.
If there is some kind of threshold we could use (i.e. some sort of indicator from UMLS that a particular ID is a good place to stop -- for example, UMLS:C1847200 only has a single broader concept, while UMLS:C0002395 has a ton of broader concepts), then we could turn this into a conflation and make it optional.
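To make the hierarchy-walking and threshold options a bit more concrete, here is a rough sketch, not a proposal for the actual Babel implementation. It assumes the broader relationships have already been pulled out of MRREL into a plain dictionary (`broader`, mapping each ID to the set of its direct broader IDs) and that we have the set of IDs Babel has already normalized (`normalized`). The function name, the `max_depth` and `max_parents` parameters, and the `EXAMPLE:` IDs are all made up for illustration; `max_parents` is just one possible reading of the "good place to stop" threshold.

```python
def find_normalized_ancestors(cui, broader, normalized, max_depth=3, max_parents=1):
    """Walk up the broader-concept hierarchy one level at a time.

    Returns (hits, depth): the already-normalized broader IDs found at the
    nearest level, and how many steps up they were. Returns ([], None) if
    nothing is found within max_depth.

    Hypothetical threshold: we do not walk up from any concept that has more
    than max_parents direct broader concepts, on the theory that such a
    concept is already general enough to be a stopping point.
    """
    seen = {cui}
    frontier = [cui]
    for depth in range(1, max_depth + 1):
        next_frontier = []
        for current in frontier:
            parents = broader.get(current, set())
            if len(parents) > max_parents:
                continue  # threshold reached: stop climbing from here
            for parent in sorted(parents):
                if parent not in seen:
                    seen.add(parent)
                    next_frontier.append(parent)
        hits = [p for p in next_frontier if p in normalized]
        if hits:
            return hits, depth
        if not next_frontier:
            break
        frontier = next_frontier
    return [], None


# Toy data: only the first edge comes from this issue; the EXAMPLE: IDs are
# placeholders standing in for the many broader concepts of UMLS:C0002395.
broader = {
    "UMLS:C1847200": {"UMLS:C0002395"},
    "UMLS:C0002395": {"EXAMPLE:B1", "EXAMPLE:B2"},
}
normalized = {"UMLS:C0002395"}

hits, depth = find_normalized_ancestors("UMLS:C1847200", broader, normalized)
print(hits, depth)
```

In this toy case the more specific ID resolves to UMLS:C0002395 one step up, and the depth could be what drives an "imperfect match" flag on the clique. If the only normalized ID were further up, the default `max_parents=1` would refuse to climb past UMLS:C0002395, which is the behavior the threshold option is after; whether counting broader concepts is actually a good indicator would need checking against real MRREL data.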