Open cmungall opened 5 years ago
Also confusing: https://www.ncbi.nlm.nih.gov/medgen/CN043596 Early-Onset Familial Alzheimer Disease
Absolutely agreed we need to work on these. Just winding down 2 major efforts, so I may have a little time.
Perhaps related to this (or does it need a new ticket?): can MONDO require that every term be unique? NCBI pulls the terms from MONDO into relational tables, and the August release had 831 terms that were synonyms of 2 MONDO identifiers, e.g. MONDO_0011906 3-beta-hydroxy-delta-5-C27-steroid oxidoreductase deficiency and MONDO_0018841 3-beta-hydroxy-delta-5-C27-steroid oxidoreductase deficiency
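For anyone wanting to reproduce this kind of check, here is a minimal sketch of how terms shared by more than one identifier could be flagged. It assumes the labels and synonyms have already been exported as (identifier, term) pairs. The function name `find_shared_terms`, the input shape, and the case/whitespace normalization are all illustrative assumptions, not how NCBI or MONDO actually run this check.

```python
from collections import defaultdict


def find_shared_terms(rows):
    """Return {normalized term: sorted IDs} for terms attached to more than one ID.

    `rows` is an iterable of (identifier, term) pairs; this shape is an
    assumption made for the sketch, not an actual export format.
    """
    ids_by_term = defaultdict(set)
    for identifier, term in rows:
        # Assumed normalization: lowercase and collapse runs of whitespace.
        key = " ".join(term.lower().split())
        ids_by_term[key].add(identifier)
    return {term: sorted(ids) for term, ids in ids_by_term.items() if len(ids) > 1}


# Example rows built from the identifiers and labels quoted in this thread.
rows = [
    ("MONDO_0011906", "3-beta-hydroxy-delta-5-C27-steroid oxidoreductase deficiency"),
    ("MONDO_0018841", "3-beta-hydroxy-delta-5-C27-steroid oxidoreductase deficiency"),
    ("MONDO_0018841", "congenital bile acid synthesis defect"),
    ("MONDO_0011906", "congenital bile acid synthesis defect 1"),
]

shared = find_shared_terms(rows)
for term, ids in shared.items():
    print(term, "->", ", ".join(ids))
```

Only the term attached to both identifiers is reported; the two distinct primary labels are left alone.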
Hm, yes, I see that.
It seems that the synonym comes from DO for MONDO_0018841 'congenital bile acid synthesis defect' and from Orphanet for MONDO_0011906 'congenital bile acid synthesis defect 1'.
I'll revise the synonym on the type 1 term to read '3-beta-hydroxy-delta-5-C27-steroid oxidoreductase deficiency type 1'.
@Orphanet @annieolry @ana-rath-orphanet - would you want to revise this synonym in Orphanet too?
@maglott this synonym has been fixed, you should see it in the next release
Hi, if I have understood the issue correctly, Orphanet won't update the synonyms, as congenital bile acid synthesis defect can be due to several enzyme deficiencies. 3-beta-hydroxy-delta-5-C27-steroid oxidoreductase deficiency is the one responsible for congenital bile acid synthesis defect type 1. Accordingly, "3-beta-hydroxy-delta-5-C27-steroid oxidoreductase deficiency type 1" seems redundant to me. On the other hand, '3-beta-hydroxy-delta-5-C27-steroid oxidoreductase deficiency' is not an exact synonym of congenital bile acid synthesis defect, just a part of it.
@nicolevasilevsky Not sure if this is a good one for me, unless there's some way I can help automate this. I'll await clarification here or in qc-call
I would like to keep this ticket mostly in my view for now, and I am not ready to discuss it! We should raise this again in 2022.
We should have better documentation on sets of mappings (ideally derived from metadata in the ontology). And for UMLS/MedGen specifically.
In contrast to OMIM/ORDO/NCIT, we have not prioritized making equivalence interpretations for UMLS/MedGen mappings; it should be documented that these are incomplete.
Historically we have been conservative in declaring UMLS mappings to be equivalent, as the semantics were not clear for many OMIM IDs that grouped a collection of susceptibilities, e.g. AD, Schizophrenia.
For example with AD:
There are a couple of cases you could make here. One is that these are the "same" disease, since the 2nd one points to an OMIM ID that seems to be a grouping for familial AD. The disease is also explicitly marked as being autosomal dominant.
But then the common-sense argument is that C0002395 is named generically, so it should just be treated as mode-of-inheritance agnostic and indeed etiology-agnostic Alzheimer disease.
This gets a bit trickier as we have entries like https://www.ncbi.nlm.nih.gov/medgen/C2931257 Alzheimer disease 1, which corresponds to a "disease" of the same name in MONDO, but it's not clear if these should even be there (separate ongoing work on susceptibilities).
Our approach is inherently conservative: we shy away from making an equivalence axiom unless we are sure of the semantics of the thing we are linking to. That is inherently harder with UMLS, which aggregates with implicit definitions (and we don't necessarily always agree with their groupings). So we just provide xrefs/mappings with no interpretation on them.
I think there is a strong case to be made here that we should be less conservative. Here we would just have a "trust the label" policy in the case of ambiguities (but this can get us into trouble).
I think we will just have to do the work here. If @maglott is willing, we could work together on the remaining ambiguous MedGen mappings.
(this issue was prompted by a helpdesk question, hopefully this isn't too confusing to anyone reading it)