monarch-initiative / mondo

Mondo Disease Ontology
http://obofoundry.org/ontology/mondo
Creative Commons Attribution 4.0 International

OrphaNet IDs and one (of multiple) OMIM IDs lumped into same term #2693

Closed kanems closed 2 years ago

kanems commented 3 years ago

I've noticed this a few times; I'll give one example below. (OK, this is a problematic example because the disease name from OMIM is rather generic, but I'll comment with additional examples later as I run into them.)

Mondo term (ID Label)
https://www.ebi.ac.uk/ols/ontologies/mondo/terms?iri=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FMONDO_0008345
MONDO:0008345 "idiopathic pulmonary fibrosis"

Bug/Typo/Error description
The OrphaNet ID on this term, Orphanet:2032, has multiple OMIM terms as its equivalents on OrphaNet's site:
https://www.orpha.net/consor/cgi-bin/Disease_Search.php?lng=EN&data_id=7029&Disease_Disease_Search_diseaseGroup=2032&Disease_Disease_Search_diseaseType=ORPHA&Disease(s)/group%20of%20diseases=Idiopathic-pulmonary-fibrosis&title=Idiopathic%20pulmonary%20fibrosis&search=Disease_Search_Simple
The entry on MONDO uses only ONE OMIM ID: OMIM:178500

It seems like when an OrphaNet ID references multiple specific subtypes from OMIM or the Orphanet page has multiple genetic causes, then a gene-specific type should be set as a child of the term with the OrphaNet ID.
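The pattern described above can be checked mechanically. The following is a minimal sketch in Python, not Mondo's actual tooling: the function name `find_lumped_terms` and the two dictionary inputs are hypothetical, and the example data is limited to the idiopathic pulmonary fibrosis IDs quoted in this thread. It flags a Mondo term when its Orphanet xref maps to more than one OMIM ID and at least one of those OMIM IDs is not on the term.

```python
from typing import Dict, List, Set


def find_lumped_terms(
    mondo_xrefs: Dict[str, Set[str]],
    orphanet_to_omim: Dict[str, Set[str]],
) -> List[dict]:
    """Flag Mondo terms whose Orphanet xref maps to several OMIM IDs
    while the term itself carries only some of them."""
    findings = []
    for mondo_id in sorted(mondo_xrefs):
        xrefs = mondo_xrefs[mondo_id]
        omim_on_term = {x for x in xrefs if x.startswith("OMIM:")}
        orpha_on_term = sorted(x for x in xrefs if x.startswith("Orphanet:"))
        for orpha_id in orpha_on_term:
            omim_from_orpha = orphanet_to_omim.get(orpha_id, set())
            missing = omim_from_orpha - omim_on_term
            # Candidate for review: Orphanet groups several OMIM entries,
            # but not all of them are xrefs on this Mondo term.
            if len(omim_from_orpha) > 1 and missing:
                findings.append({
                    "mondo_id": mondo_id,
                    "orphanet_id": orpha_id,
                    "omim_on_term": sorted(omim_on_term),
                    "missing_omim": sorted(missing),
                })
    return findings


# Example data: IDs for idiopathic pulmonary fibrosis as quoted in this thread.
# The dictionary layout itself is an assumption made for illustration.
mondo_xrefs = {"MONDO:0008345": {"Orphanet:2032", "OMIM:178500"}}
orphanet_to_omim = {
    "Orphanet:2032": {"OMIM:178500", "OMIM:616371", "OMIM:616373"},
}

for finding in find_lumped_terms(mondo_xrefs, orphanet_to_omim):
    print(finding)
```

A flagged term is only a candidate for curator review. As the later discussion of OMIM 178500 shows, whether the Orphanet ID belongs on a new parent term is a judgment call.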


nicolevasilevsky commented 3 years ago

Thanks @kanems!

In this case, Orphanet xrefs 3 OMIM terms, as you noted. The two more specific terms (OMIM:616371 PFBMFT4 and OMIM:616373 PFBMFT3) are part of a phenotypic series (see here), which corresponds to MONDO_0000148 'pulmonary fibrosis and/or bone marrow failure, telomere-related'.

These gene-specific classes are also sub-classified under MONDO_0008345 'idiopathic pulmonary fibrosis', because of the Orphanet xrefs.

I drew you a picture: image

I think the gene-specific type is already a child of the term with the OrphaNet ID? Please correct me if I am wrong. :)

kanems commented 3 years ago

image

It looks to me that, the way the X-refs are displayed in the EBI browser, the MIM number and the OrphaNet ID are coming in together (and cite one another as the source), but the OrphaNet page has 3 different OMIM numbers and only one is attached to this term, which is the recurring pattern/problem I wanted to document. This is not a great example because it's not so straightforward. I'll post another example of this pattern in another comment.

As I am looking at this OMIM term (178500), it is a bit of a mix of things (IPF, Hamman-Rich disease, susceptibility to IPF...) but it points to two genes: SFTPA2 and MUC5B. (And the MUC5B one is a susceptibility phenotype.) But OrphaNet cites 14 different genes as associated with their IPF term. So on that basis alone, I would guess the OMIM IPF is a child of the OrphaNet IPF term. If we only care about the disease names, then IPF is IPF, regardless of the source, and the MONDO terms are fine.

So whatever the MONDO, OrphaNet (@ana-rath-orphanet) and OMIM (@ahamosh) teams decide, we will try to reflect in the MedGen database.

kanems commented 3 years ago

Another instance of an OrphaNet disease class and ONE OMIM ref coming in on the same term: MONDO:0007926 Waldenstrom macroglobulinemia. The Orphanet term points to two OMIM numbers:
https://www.orpha.net/consor/cgi-bin/Disease_Search.php?lng=EN&data_id=10313&Disease_Disease_Search_diseaseGroup=33226&Disease_Disease_Search_diseaseType=ORPHA&Disease(s)/group%20of%20diseases=Waldenstrom-macroglobulinemia&title=Waldenstr%F6m%20macroglobulinemia&search=Disease_Search_Simple
But the two OMIM numbers are specific to WM1 and WM2. Right now WM1 is rolled in with the parent concept, with the MIM # coming in with/citing the OrphaNet ID (see below).

(Also, the WM2 term MONDO:0012491 needs the full name on the MONDO entry; right now it's only the abbreviation "WM2".) (I keep forgetting this was already mentioned on another ticket and you have acknowledged this issue @nicolevasilevsky. Sorry to keep harping on it; I've been working on cleaning a lot of terms in MedGen and it's all running together!)

image

ahamosh commented 3 years ago

This is a tough example. Idiopathic pulmonary fibrosis as an isolated entity is 178500 in OMIM. IPF is a prominent feature (and sometimes the only one) of telomere disorders that can include liver disease and bone marrow failure. Because of telomere shortening, these diseases also show anticipation, with more and earlier manifestations in subsequent generations. PS614742 is the phenotypic series for the telomere disorders. Pulmonary fibrosis is also a feature of many other conditions. 178500 is the only OMIM entry with isolated pulmonary fibrosis. I will look into splitting the two entries based upon the gene. Many Orpha terms will be a parent to OMIM diseases separated by molecular basis (e.g. Orpha:648, with all the different Noonan syndromes within; this Orpha code should be equivalent to OMIM PS163950).

ahamosh commented 3 years ago

The Waldenstrom example arises because OMIM hasn't made a phenotypic series. Discussion of genetic heterogeneity always resides in the lowest number of the series.

nicolevasilevsky commented 3 years ago

related ticket: https://github.com/monarch-initiative/mondo/issues/2562#issuecomment-783460282

kanems commented 3 years ago

Another example: MONDO:0007326 paroxysmal nonkinesigenic dyskinesia 1 X-refs Orpha and one OMIM: OMIM:118800 (Orphanet:98810) ... Orphanet:98810 (OMIM:118800). But the OrphaNet term has two OMIM X-refs (PNKD1 and PNKD2; OMIM: 118800, 611147).

Suggested clean up: Keep MIM118800 here. Move Orphanet:98810 (and GARD 8722) to a new parent term of "Paroxysmal non-kinesigenic dyskinesia" (with subtypes 1 and 2 from OMIM)

ahamosh commented 3 years ago

Yes. This is how to do this.
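The clean-up agreed on above (keep the OMIM xref on the subtype term, move the Orphanet and GARD xrefs to a new parent) can be sketched in Python as follows. This is only an illustration of the restructuring, not how Mondo edits are actually made: the `Term` class, the function `move_group_xrefs_to_parent`, the `GARD:8722` CURIE spelling, and the placeholder ID `MONDO:PLACEHOLDER` for the not-yet-created parent are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Term:
    """Hypothetical, minimal stand-in for an ontology term."""
    term_id: str
    label: str
    xrefs: Set[str] = field(default_factory=set)
    parents: List[str] = field(default_factory=list)


def move_group_xrefs_to_parent(
    child: Term, group_xrefs: Set[str], parent_id: str, parent_label: str
) -> Term:
    """Create a parent term that takes over the group-level xrefs
    and make the existing term a child of it."""
    moved = child.xrefs & group_xrefs
    child.xrefs -= moved  # the subtype keeps only its own xrefs
    parent = Term(parent_id, parent_label, xrefs=set(moved))
    if parent_id not in child.parents:
        child.parents.append(parent_id)
    return parent


# IDs as quoted in this thread; the parent ID is a placeholder, not a real Mondo ID.
pnkd1 = Term(
    "MONDO:0007326",
    "paroxysmal nonkinesigenic dyskinesia 1",
    xrefs={"OMIM:118800", "Orphanet:98810", "GARD:8722"},
)
parent = move_group_xrefs_to_parent(
    pnkd1,
    group_xrefs={"Orphanet:98810", "GARD:8722"},
    parent_id="MONDO:PLACEHOLDER",
    parent_label="paroxysmal non-kinesigenic dyskinesia",
)

print(sorted(pnkd1.xrefs), pnkd1.parents)
print(parent.label, sorted(parent.xrefs))
```

The second subtype (OMIM:611147) would be attached to the same parent in the same way once its Mondo term is identified.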

nicolevasilevsky commented 3 years ago

> Suggested clean up: Keep MIM118800 here. Move Orphanet:98810 (and GARD 8722) to a new parent term of "Paroxysmal non-kinesigenic dyskinesia" (with subtypes 1 and 2 from OMIM)

Got it! This makes sense, I will do this.

My action items/note to self:

nicolevasilevsky commented 3 years ago

@kanems

nicolevasilevsky commented 3 years ago

to do:

kanems commented 3 years ago

Another example: ORPHA:540 "Familial hemophagocytic lymphohistiocytosis" is currently mapped to MONDO:0009974 "familial hemophagocytic lymphohistiocytosis type 1". The Orpha:540 term maps to OMIM: 267700, 603552, 603553, 608898, 613101. On MONDO:0009974, there is only 267700. Suggest making a new term for ORPHA:540 "Familial hemophagocytic lymphohistiocytosis", then setting the MIM subtypes as children of the OrphaNet term.

ahamosh commented 3 years ago

There is a phenotypic series: PS267700 which should map to the Orpha and Mondo parents. The five you note above are the children split by molecular basis (gene).

kanems commented 3 years ago

Another example: Orphanet:2924 is currently mapped to MONDO:0008265 polycystic liver disease 1, but I think it should probably go on MONDO:0000447 autosomal dominant polycystic liver disease; based on multiple OMIM mappings and same pref name on Mondo term with one of the Orphanet alt names.

lcdaugherty commented 3 years ago

@nicolevasilevsky I would also agree with @kanems's suggestion, and this would help resolve some internal DB issues we are having with PLD.

> Another example: Orphanet:2924 is currently mapped to MONDO:0008265 polycystic liver disease 1, but I think it should probably go on MONDO:0000447 autosomal dominant polycystic liver disease; based on multiple OMIM mappings and same pref name on Mondo term with one of the Orphanet alt names.

nicolevasilevsky commented 3 years ago

@sabrinatoro we can work on this together

sabrinatoro commented 3 years ago

@kanems I'm Sabrina, a new curator who recently joined the Mondo team. We will address your issue as soon as possible. Thank you for your patience. If this is a high priority, please let me know and we can prioritize it. Thank you!

lcdaugherty commented 3 years ago

@sabrinatoro I am very interested in getting MONDO:0008265 polycystic liver disease 1 resolved (as described above by @kanems). Thank you.

sabrinatoro commented 3 years ago

Individual tickets were created for each issue reported here:

Issues will be addressed in their own tickets, and therefore this 'parent ticket' can be closed. @nicolevasilevsky

sabrinatoro commented 3 years ago

reopening so we remember the sub-issues remaining

nicolevasilevsky commented 3 years ago

Looks like all the sub issues are done. Can this be closed?

kanems commented 2 years ago

Yes, I think this can be closed. If I see any other instances, I'll open a separate ticket for it.

nicolevasilevsky commented 2 years ago

great, thanks!