monarch-initiative / mondo

Mondo Disease Ontology
http://obofoundry.org/ontology/mondo
Creative Commons Attribution 4.0 International

Inappropriate exactMatch and/or subClassOf relations identified by cross-reference with genes associated in OMIM #962

Closed: tnavatar closed this issue 3 years ago

tnavatar commented 4 years ago

Mondo term (ID Label)

["Ehlers-Danlos syndrome, classic type" "http://purl.obolibrary.org/obo/MONDO_0007522"]
["Usher syndrome type 1" "http://purl.obolibrary.org/obo/MONDO_0010168"]
["hypogonadotropic hypogonadism 14 with or without anosmia" "http://purl.obolibrary.org/obo/MONDO_0013926"]
["hypogonadotropic hypogonadism 19 with or without anosmia" "http://purl.obolibrary.org/obo/MONDO_0014105"]
["hypogonadotropic hypogonadism 20 with or without anosmia" "http://purl.obolibrary.org/obo/MONDO_0014106"]
["hypogonadotropic hypogonadism 21 with or without anosmia" "http://purl.obolibrary.org/obo/MONDO_0014107"]
["hypogonadotropic hypogonadism 18 with or without anosmia" "http://purl.obolibrary.org/obo/MONDO_0014103"]
["hypogonadotropic hypogonadism 22 with or without anosmia" "http://purl.obolibrary.org/obo/MONDO_0014461"]
["hypogonadotropic hypogonadism 17 with or without anosmia" "http://purl.obolibrary.org/obo/MONDO_0014102"]
["familial temporal lobe epilepsy 1" "http://purl.obolibrary.org/obo/MONDO_0010898"]
["benign paroxysmal positional nystagmus" "http://purl.obolibrary.org/obo/MONDO_0008656"]
["Wiskott-Aldrich syndrome" "http://purl.obolibrary.org/obo/MONDO_0010518"]
["46,XX testicular disorder of sex development" "http://purl.obolibrary.org/obo/MONDO_0010766"]
["Landau-Kleffner syndrome" "http://purl.obolibrary.org/obo/MONDO_0009509"]
["microcephaly, short stature, and impaired glucose metabolism 1" "http://purl.obolibrary.org/obo/MONDO_0000208"]
["familial multinodular goiter" "http://purl.obolibrary.org/obo/MONDO_0007681"]
["familial temporal lobe epilepsy 2" "http://purl.obolibrary.org/obo/MONDO_0011965"]
["classic progressive supranuclear palsy syndrome" "http://purl.obolibrary.org/obo/MONDO_0010997"]
["autosomal systemic lupus erythematosus" "http://purl.obolibrary.org/obo/MONDO_0013743"]
["familial pseudohyperkalemia" "http://purl.obolibrary.org/obo/MONDO_0012204"]
["insomnia (disease)" "http://purl.obolibrary.org/obo/MONDO_0013600"]
["adult Refsum disease" "http://purl.obolibrary.org/obo/MONDO_0009958"]
["alopecia universalis" "http://purl.obolibrary.org/obo/MONDO_0008757"]
["Ehlers-Danlos syndrome, arthrochalasis type" "http://purl.obolibrary.org/obo/MONDO_0007525"]
["Bruton-type agammaglobulinemia" "http://purl.obolibrary.org/obo/MONDO_0010421"]
["autosomal recessive cutis laxa type 2, classic type" "http://purl.obolibrary.org/obo/MONDO_0009054"]
["paroxysmal nonkinesigenic dyskinesia 1" "http://purl.obolibrary.org/obo/MONDO_0007326"]
["juvenile myoclonic epilepsy" "http://purl.obolibrary.org/obo/MONDO_0009696"]
["hyperphosphatemic familial tumoral calcinosis" "http://purl.obolibrary.org/obo/MONDO_0008897"]
["spondyloepiphyseal dysplasia with congenital joint dislocations" "http://purl.obolibrary.org/obo/MONDO_0007738"]
["familial calcium pyrophosphate deposition" "http://purl.obolibrary.org/obo/MONDO_0007319"]
["pseudo-TORCH syndrome" "http://purl.obolibrary.org/obo/MONDO_0009626"]
["PEHO syndrome" "http://purl.obolibrary.org/obo/MONDO_0009841"]
["isolated polycystic liver disease" "http://purl.obolibrary.org/obo/MONDO_0008265"]
["classic Hodgkin lymphoma" "http://purl.obolibrary.org/obo/MONDO_0009348"]
["Silver-Russell syndrome" "http://purl.obolibrary.org/obo/MONDO_0008394"]
["horizontal gaze palsy with progressive scoliosis" "http://purl.obolibrary.org/obo/MONDO_0011810"]
["aggressive periodontitis" "http://purl.obolibrary.org/obo/MONDO_0008226"]

Bug/Typo/Error description

Each of the above terms has an exactMatch relation to an OMIM entry associated with a single gene, and also a direct subClassOf relation to a term whose exactMatch in OMIM is associated with a single, different gene. Essentially, this creates a structure in MONDO suggesting that a condition can be a subclass of a different condition with a totally different genetic context. In some cases (generally obvious from the label on the condition) this is due to an inaccurate exactMatch with an OMIM term. In other cases (again, typically fairly obvious) this appears to be due to an inappropriate subClassOf relation.

This came up for us in the context of reviewing ClinGen curations on Ehlers-Danlos syndrome. The issue we ran into was that curations linked to EDS type 2 were showing up as a subclass of curations linked to EDS type 1, even though the two are associated with different genes (COL5A1 and COL5A2). This led us to run the query that generated the above list, to see if there were additional, related inconsistencies.
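The query itself is not included in this thread. As a rough illustration of the pattern it looks for, here is a minimal Python sketch. The function name `find_gene_conflicts` and the three input mappings are hypothetical; the identifiers and gene symbols are the EDS ones quoted later in this thread, and a real run would have to load the mappings from a Mondo release and from OMIM's gene associations.

```python
# Toy inputs based on the EDS example discussed in this thread.
# All names here are illustrative assumptions, not part of any Mondo tooling.
EQUIVALENT_OMIM = {
    "MONDO:0007522": "OMIM:130000",
    "MONDO:0019568": "OMIM:130010",
}
SUBCLASS_OF = [("MONDO:0019568", "MONDO:0007522")]  # (child, parent) pairs
OMIM_GENE = {"OMIM:130000": "COL5A1", "OMIM:130010": "COL5A2"}


def find_gene_conflicts(equivalent_omim, subclass_of, omim_gene):
    """Return (child, child_gene, parent, parent_gene) for every direct
    subClassOf pair whose two terms map to OMIM entries with different genes."""
    conflicts = []
    for child, parent in subclass_of:
        # A term without an OMIM mapping, or an OMIM entry without a
        # single associated gene, yields None and is skipped.
        child_gene = omim_gene.get(equivalent_omim.get(child))
        parent_gene = omim_gene.get(equivalent_omim.get(parent))
        if child_gene and parent_gene and child_gene != parent_gene:
            conflicts.append((child, child_gene, parent, parent_gene))
    return conflicts


print(find_gene_conflicts(EQUIVALENT_OMIM, SUBCLASS_OF, OMIM_GENE))
# [('MONDO:0019568', 'COL5A2', 'MONDO:0007522', 'COL5A1')]
```

The sketch only inspects direct subClassOf pairs, matching the description above; it does not walk the full hierarchy.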

Of these, some are clear issues that ought to be fixed (the EDS example). Others might be a bit more complicated because OMIM uses the first term in a series as representative of the overall phenotype (MONDO:0010168, Usher syndrome type 1, for example).

Please let me know if you would like additional information or if I can be of further assistance.

nicolevasilevsky commented 4 years ago

@tnavatar what do you mean by exactMatch? Do you mean has_exact_synonym? Or do you mean MONDO:equivalentTo?

Could you provide a more specific example of what the issue is? MONDO_0007522 Ehlers-Danlos syndrome, classic type has 5 parent classes. Is the issue with the parent MONDO_0020066 'Ehlers-Danlos syndrome', which has the dbxref OMIMPS:130000?

Thanks for clarifying. I'm happy to have a call with you, to get further clarification, if that would be helpful.

Thanks!

tnavatar commented 4 years ago

@nicolevasilevsky in the Ehlers-Danlos case, MONDO currently has the following relationships: [MONDO:0007522 owl:equivalentClass OMIM:130000 . MONDO:0019568 rdfs:subClassOf MONDO:0007522 . MONDO:0019568 owl:equivalentClass OMIM:130010 .] According to OMIM, OMIM:130000 is EDS exclusively in the context of variants in COL5A1 (https://www.omim.org/entry/130000), whereas OMIM:130010 refers to EDS exclusively in the context of variants in COL5A2. Judging from the matches MONDO has with the other vocabularies, it looks like the relationship should be OMIM:130000 owl:equivalentClass MONDO:0019567 (EDS type 1) instead of owl:equivalentClass MONDO:0007522 (EDS 'classic type').
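To make the effect of the proposed remapping concrete, here is a small Python sketch that holds the three relationships above as plain tuples and moves the OMIM:130000 equivalence to MONDO:0019567. The helper names (`remap_equivalence`, `cross_gene_subclasses`) are hypothetical, and the sketch covers only the triples quoted here, not the rest of the Mondo hierarchy.

```python
# Gene associations as described for the two OMIM entries above.
OMIM_GENE = {"OMIM:130000": "COL5A1", "OMIM:130010": "COL5A2"}

# The three relationships quoted above, as (subject, predicate, object).
CURRENT = {
    ("MONDO:0007522", "owl:equivalentClass", "OMIM:130000"),
    ("MONDO:0019568", "rdfs:subClassOf", "MONDO:0007522"),
    ("MONDO:0019568", "owl:equivalentClass", "OMIM:130010"),
}


def remap_equivalence(triples, omim_id, old_term, new_term):
    """Return a copy of triples with the equivalence to omim_id moved
    from old_term to new_term."""
    updated = set(triples)
    updated.discard((old_term, "owl:equivalentClass", omim_id))
    updated.add((new_term, "owl:equivalentClass", omim_id))
    return updated


def cross_gene_subclasses(triples, omim_gene):
    """List (child, parent) subClassOf pairs whose OMIM-derived genes differ."""
    gene_of = {
        s: omim_gene[o]
        for s, p, o in triples
        if p == "owl:equivalentClass" and o in omim_gene
    }
    return sorted(
        (s, o)
        for s, p, o in triples
        if p == "rdfs:subClassOf"
        and s in gene_of
        and o in gene_of
        and gene_of[s] != gene_of[o]
    )


PROPOSED = remap_equivalence(
    CURRENT, "OMIM:130000", "MONDO:0007522", "MONDO:0019567"
)

print(cross_gene_subclasses(CURRENT, OMIM_GENE))
# [('MONDO:0019568', 'MONDO:0007522')]
print(cross_gene_subclasses(PROPOSED, OMIM_GENE))
# []
```

With the equivalence moved, MONDO:0007522 no longer carries a single-gene OMIM mapping in this toy graph, so the subClassOf link from MONDO:0019568 stops implying a cross-gene subclass.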

After observing this, I looked for other ways some of the relationships in MONDO could be an issue for our curations. While that was the only one I found that touched on a term we had curated directly, there were several others (the list above) that had the same characteristic of a term associated with a single gene (according to the OMIM relationship) having a subclass term associated with a totally different gene. I figured it might be useful to report them as well, hence the ticket. Happy to hop on a quick call to explain the process and findings if it helps.

maglott commented 4 years ago

This would also be an opportunity to review OMIM's phenotypic series. Cleaning up the cases where the MIM number represents a gene-specific concept instead of the general concept (often a phenotypic series) would help us all.

tnavatar commented 4 years ago

That seems to come into play here, @maglott. The phenotypic series for this case has the same numeric identifier as the EDS type 1 MIM: PS130000.

mellybelly commented 4 years ago

@cmungall should we have a QC check for leaf nodes that are subclasses of nodes with genes associated? This comes back to the need for a general model that we can all adhere to. In ontological contexts, a parent node should include all genes associated with its leaf nodes, or none at all. We can start this discussion on the list perhaps.

@cboerkoel may also want to weigh in on the EDS case specifically; he has also been reviewing the Mondo hierarchy there.
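The "all genes or none" rule suggested above could be checked along the following lines. This is only a sketch under stated assumptions: the function name `check_parent_gene_coverage` and its inputs are hypothetical, and how gene sets would actually be attached to Mondo classes is left open in this discussion.

```python
def check_parent_gene_coverage(parent_genes, child_genes_by_term):
    """Return {child_term: missing_genes} for children whose genes are not
    all listed on a gene-annotated parent.

    A parent with no genes at all satisfies the rule trivially.
    """
    parent_genes = set(parent_genes)
    if not parent_genes:
        return {}
    violations = {}
    for term, genes in child_genes_by_term.items():
        missing = set(genes) - parent_genes
        if missing:
            violations[term] = missing
    return violations


# EDS example from this thread: the parent is tied to COL5A1 only,
# while its subclass is tied to COL5A2.
print(check_parent_gene_coverage({"COL5A1"}, {"MONDO:0019568": {"COL5A2"}}))
# {'MONDO:0019568': {'COL5A2'}}
```

An empty result means the parent either has no gene annotations or covers every gene found on the children passed in.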

cmungall commented 4 years ago

Thanks @tnavatar

We will make a report of these that we can work through, creating individual tickets for difficult cases (e.g. EDS). Then we will make a validation step in the pipeline that stops these creeping in further.

cmungall commented 4 years ago

Having this implemented will help: https://github.com/ontodev/robot/issues/588

In general, we want to know when Mondo induces unasserted subclasses in external ontologies and ontology-like artefacts.

maglott commented 4 years ago

Is there going to be a general fix for these? I just ran across https://monarchinitiative.org/disease/MONDO:0008897, which should have https://omim.org/entry/211900 (tumoral calcinosis, HYPERPHOSPHATEMIC, familial, 1; HFTC1) split out.

nicolevasilevsky commented 3 years ago

Sorry for the long delay. I think I know what I need to do now: I will review the terms below and ensure they are not children of classes that are equivalent to other OMIM terms.

@maglott I'll address your comment above too

split terms:

done:

These are children of an OMIM phenotypic series, I didn't make any edits to these.