Closed cmungall closed 2 years ago
@cmungall, is this going to be implemented globally? I just encountered https://monarchinitiative.org/disease/MONDO:0008947, where bilateral striopallidodentate calcinosis has a synonym of "basal ganglia calcification, idiopathic, type 1", which is gene-specific. "basal ganglia calcification, idiopathic, 1" (MONDO:0024538) differs only by the term 'type'.
And "Trichohepatoenteric syndrome type 1" is a synonym of tricho-hepato-enteric syndrome (MONDO:0009105) when trichohepatoenteric syndrome 1 (MONDO:0024541) also exists (that pesky 'type').
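A rough way to catch these 'type' near-duplicates automatically is to normalize labels and synonyms before comparing them. The sketch below is only an illustration: the `normalize` and `find_synonym_collisions` helpers and the `TERMS` dictionary layout are assumptions made for this example, not part of any Mondo tooling. The IDs and strings are the ones quoted in this thread.

```python
import re

# Hypothetical in-memory layout: term ID -> label and synonyms.
# The values are the examples quoted in this thread.
TERMS = {
    "MONDO:0008947": {
        "label": "bilateral striopallidodentate calcinosis",
        "synonyms": ["basal ganglia calcification, idiopathic, type 1"],
    },
    "MONDO:0024538": {
        "label": "basal ganglia calcification, idiopathic, 1",
        "synonyms": [],
    },
    "MONDO:0009105": {
        "label": "tricho-hepato-enteric syndrome",
        "synonyms": ["Trichohepatoenteric syndrome type 1"],
    },
    "MONDO:0024541": {
        "label": "trichohepatoenteric syndrome 1",
        "synonyms": [],
    },
}


def normalize(text):
    """Lowercase, drop the standalone word 'type', and strip non-alphanumerics."""
    lowered = text.lower()
    without_type = re.sub(r"\btype\b", " ", lowered)
    return re.sub(r"[^a-z0-9]+", "", without_type)


def find_synonym_collisions(terms):
    """Return (term_id, synonym, other_id) where a synonym of one term
    normalizes to the same string as the label of a different term."""
    label_index = {normalize(t["label"]): tid for tid, t in terms.items()}
    hits = []
    for tid, term in terms.items():
        for synonym in term["synonyms"]:
            other = label_index.get(normalize(synonym))
            if other is not None and other != tid:
                hits.append((tid, synonym, other))
    return sorted(hits)


for hit in find_synonym_collisions(TERMS):
    print(hit)
```

The normalization is deliberately aggressive (it also removes hyphens and commas), so it will over-report on a real ontology and the output should be treated as a list to review, not a list of errors.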
@nicolevasilevsky Here is an up-to-date list: https://docs.google.com/spreadsheets/d/1CJPJ8VasB-kGaUqY0jBBvFf2IwW5xbd3ZiZazx67eHo/edit?usp=sharing
Let me know how you want to go about this.
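FYI, this is the check:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX oboInOwl: <http://www.geneontology.org/formats/oboInOwl#>

# Find MONDO classes that have an OMIM equivalent xref and whose
# asserted MONDO parent also has an OMIM equivalent xref.
SELECT ?term ?term_label ?p ?pn ?xref ?xref_p {
  ?term rdfs:subClassOf ?p .
  # Axiom annotation on the child's xref
  ?exp owl:annotatedSource ?term ;
       owl:annotatedProperty oboInOwl:hasDbXref ;
       owl:annotatedTarget ?xref ;
       oboInOwl:source ?source .
  # Axiom annotation on the parent's xref
  ?exp_p owl:annotatedSource ?p ;
         owl:annotatedProperty oboInOwl:hasDbXref ;
         owl:annotatedTarget ?xref_p ;
         oboInOwl:source ?source_p .
  ?term rdfs:label ?term_label .
  ?p rdfs:label ?pn .
  FILTER (isIRI(?term) && regex(str(?term), "^http://purl.obolibrary.org/obo/MONDO_"))
  FILTER (isIRI(?p) && regex(str(?p), "^http://purl.obolibrary.org/obo/MONDO_"))
  FILTER (regex(str(?xref), "^OMIM:"))
  FILTER (regex(str(?xref_p), "^OMIM:"))
  FILTER (str(?source) = "MONDO:equivalentTo")
  FILTER (str(?source_p) = "MONDO:equivalentTo")
}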
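For anyone who wants to reason about the check without a triplestore, here is a minimal pure-Python sketch of the same logic. It assumes the ontology has already been flattened into two hypothetical structures, `SUBCLASS_OF` (asserted child/parent pairs) and `EQUIVALENT_XREFS` (xrefs whose source is MONDO:equivalentTo). The IDs are made-up placeholders, not real MONDO or OMIM entries.

```python
# Hypothetical toy data; none of these IDs are real.
SUBCLASS_OF = [
    ("MONDO:C1", "MONDO:P1"),  # both sides have an OMIM equivalent: flagged
    ("MONDO:C2", "MONDO:P2"),  # parent is only equivalent to Orphanet: not flagged
]

# Term ID -> xrefs annotated with source "MONDO:equivalentTo".
EQUIVALENT_XREFS = {
    "MONDO:C1": ["OMIM:C1"],
    "MONDO:P1": ["OMIM:P1", "Orphanet:P1"],
    "MONDO:C2": ["OMIM:C2"],
    "MONDO:P2": ["Orphanet:P2"],
}


def omim_equivalents(term, equivalent_xrefs):
    """Return the OMIM xrefs that a term is marked equivalent to."""
    return [x for x in equivalent_xrefs.get(term, []) if x.startswith("OMIM:")]


def flag_omim_subclass_pairs(subclass_of, equivalent_xrefs):
    """Return (child, parent, child_xref, parent_xref) rows where both the
    child and its asserted parent carry an OMIM equivalent xref."""
    rows = []
    for child, parent in subclass_of:
        # Mirror the MONDO-only filters of the query.
        if not (child.startswith("MONDO:") and parent.startswith("MONDO:")):
            continue
        for child_xref in omim_equivalents(child, equivalent_xrefs):
            for parent_xref in omim_equivalents(parent, equivalent_xrefs):
                rows.append((child, parent, child_xref, parent_xref))
    return rows


for row in flag_omim_subclass_pairs(SUBCLASS_OF, EQUIVALENT_XREFS):
    print(row)
```

Like the query, this only looks at asserted subclass pairs; inferred relationships would need a reasoned version of the ontology as input.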
I eyeballed a few, and some of these equivalencies come from Orphanet.
(The important thing here is to identify a systematic mistake we can just break in one go; otherwise we just keep going through the table until it's done.)
I'll take a look at this and think about it.
Some of these are being addressed in this ticket: https://github.com/monarch-initiative/mondo/issues/962
There are some cases where terms like this (https://omim.org/entry/109100) are used as grouping classes, which I don't necessarily think is wrong, but it is going to flag errors for this check.
And "Trichohepatoenteric syndrome type 1" is a synonym of tricho-hepato-enteric syndrome (MONDO:0009105) when trichohepatoenteric syndrome 1 (MONDO:0024541) also exists (that pesky 'type').
The other example above from @maglott has been resolved. (Thanks for your comments @maglott!)
I think this is done. Please reopen if further action is needed.
See #730
In general, OMIM IDs should be equivalenced only to "gene-level" disease classes. This needs to be written up in the docs and a check added. We should also consistently use subset tags to denote levels in the ontology.
The gene-level metaclass consists of classes that are defined according to a single gene.
If we see a case where we infer one OMIM to be subclass of another this is likely a mistake, see docs on the "prototype" problem (in which a disease "Foo" is later split into "Foo 1" and "Foo 2" with the label "Foo" being ambiguous wrt whether it denotes the "classic" form "Foo 1" or a superclass).
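To make the "prototype" pattern concrete, here is a small sketch that looks for a bare label "Foo" coexisting with numbered labels such as "Foo 1" and "Foo 2". The function name, the regular expression, and the toy labels are assumptions for illustration only; real labels have many more numbering styles than this handles.

```python
import re
from collections import defaultdict

# Matches labels that end in a whitespace-separated number, e.g. "Foo 2".
NUMBERED = re.compile(r"^(.+?)\s+(\d+)$")


def find_prototype_candidates(labels):
    """Return {base_label: [numbered labels]} for every bare label that
    also appears with numbered variants (compared case-insensitively)."""
    bare = {label.lower() for label in labels}
    numbered = defaultdict(list)
    for label in labels:
        match = NUMBERED.match(label)
        if match:
            numbered[match.group(1).lower()].append(label)
    return {
        base: sorted(variants)
        for base, variants in numbered.items()
        if base in bare
    }


# Toy labels following the "Foo" example above.
print(find_prototype_candidates(["Foo", "Foo 1", "Foo 2", "Bar 1"]))
```

A hit only says the label "Foo" is potentially ambiguous; deciding whether it means the classic form or a grouping class still needs a curator.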
There are cases where some OMIM IDs may still represent something above the gene level, and this is also tied in with susceptibility.