monarch-initiative / mondo

Mondo Disease Ontology
http://obofoundry.org/ontology/mondo
Creative Commons Attribution 4.0 International

Large drop in monogenic diseases after DO-Mondo mapping #1495

Closed ValWood closed 3 years ago

ValWood commented 4 years ago

The following 210 genes used to be annotated to "monogenic disease" https://www.pombase.org/results/from/id/707b3611-0b71-4ebd-9ee1-9326e44d76c6

Possibly not all are monogenic, but from spot checking most appear to be.

For example:

SPAC31A2.05c | mis4 | cohesin loading factor (adherin) Mis4/Scc2
SPBC776.13 | cnd1 | condensin complex non-SMC subunit Cnd1
SPCC306.03c | cnd2 | condensin complex non-SMC subunit Cnd2
all have causative mutations for Cornelia de Lange syndrome

ark1 | aurora-B kinase Ark1
spermatogenic failure 5 (DOID:0070183): has_material_basis_in mutation in the AURKC gene on chromosome 19q13

SPAC5D6.06c | alg14 | UDP-GlcNAc transferase associated protein Alg14
congenital myasthenic syndrome 15: compound heterozygous mutation in the ALG14 gene on chromosome 1p21.

SPAC18G6.10 | lem2 | LEM domain nuclear inner membrane protein Heh1/Lem2
cataract 46 juvenile-onset (DOID:0110243): A cataract that has_material_basis_in homozygous mutation in the LEMD2 gene on chromosome 6p21.

cmungall commented 4 years ago

See also #722

While subclassing into mono / poly is perfectly logical, it is not clear we can commit to annotating at the desired level.

@ValWood do you specifically want monogenic or is a more general mendelian concept encompassing digenic fine?

ValWood commented 4 years ago

> general mendelian concept encompassing digenic fine?

Well, I think this is OK and we can include both, but if the definition already clearly states that a single gene variant is disease-causing (as in the above), why not instantiate monogenic?

Monogenic is clearly more useful for people working on model yeast, because it helps to argue that yeast can be a good model for the mechanism if it isn't only contributing to the disease phenotype. I can really promote the monogenic ones. (But I expected most (or at least of ~850) of the ones we map to be monogenic: everything except the cancer ones, and even a subset of those.)

ValWood commented 4 years ago

It should be largely inferrable from your end based on definitions?

nicolevasilevsky commented 4 years ago

I'm not clear what the action item is here, @cmungall

ValWood commented 4 years ago

I'm a numpty. I thought I had saved the list of DO monogenic diseases, but I saved the MONDO list.

Anyway, a lot of diseases should be easy to classify as monogenic, because a single disease-causing gene is listed in the definition?

All or most of the descendants of inborn disorder of purine metabolism (MONDO:0019236) for example.

All of these: https://www.pombase.org/term/MONDO:0037940

Most "deficiencies" are monogenic. CGL syndrome, anything that has "X-linked", "dominant", or "recessive" in its disease name?

I think we should have around 700 more associations to this term.....
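A minimal sketch of the kind of inference suggested above, assuming disease labels and definitions are available as plain strings. The regular expression, the keyword list, and the function name `monogenic_candidate` are illustrative assumptions, not part of Mondo's or DO's tooling, and a hit only marks a candidate for curator review rather than a confirmed monogenic disease.

```python
import re

# Assumption: definitions name the causative gene with phrasing like
# "mutation in the AURKC gene", as in the DO definitions quoted earlier.
GENE_IN_DEFINITION = re.compile(r"mutation in the (\w+) gene")

# Assumption: these label keywords hint at a Mendelian inheritance pattern.
LABEL_HINTS = ("x-linked", "dominant", "recessive")


def monogenic_candidate(label: str, definition: str) -> bool:
    """Return True if the label or definition hints at a single causative gene."""
    genes = set(GENE_IN_DEFINITION.findall(definition))
    has_label_hint = any(hint in label.lower() for hint in LABEL_HINTS)
    return len(genes) == 1 or has_label_hint


# Definitions taken from the examples earlier in this ticket.
examples = [
    ("spermatogenic failure 5",
     "has_material_basis_in mutation in the AURKC gene on chromosome 19q13"),
    ("cataract 46 juvenile-onset",
     "A cataract that has_material_basis_in homozygous mutation in the "
     "LEMD2 gene on chromosome 6p21."),
]

for label, definition in examples:
    print(label, monogenic_candidate(label, definition))
```

A text heuristic like this would miss diseases whose definitions are worded differently, so it could only ever seed a list for manual checking.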

nicolevasilevsky commented 4 years ago

@cmungall and the Mondo team discussed this and we decided to obsolete monogenic disease, and the terms that were previously under monogenic will be under 'inherited genetic disease'.

action item to me:

ValWood commented 4 years ago

That's a shame. What's the reason? It seems really useful to differentiate between single-gene and polygenic. I guess it's just too woolly in some cases if you don't know that the disease is only present in a certain genetic background?

So there will be no way to distinguish genes which have a high propensity to cause a disease in multi-factorial diseases from monogenic? I would imagine a lot of users would find this useful, but maybe there is an alternative route.

nicolevasilevsky commented 4 years ago

The justification is that we don't have sufficient information to properly classify diseases under monogenic or digenic.

Related to #722 and https://github.com/monarch-initiative/mondo/pull/972

ValWood commented 4 years ago

Looking at the 2 previous tickets. One describes it as "redundant" with inherited, but it isn't really.

Couldn't it be a parent of autosomal dominant, autosomal recessive and maternally inherited (you don't currently have maternally inherited)?

Would this be logically correct? Not everything would be classified initially, but diseases would become classified over time as the recessivity and dominance were fleshed out.

ValWood commented 4 years ago

Ah, OK. Do you ever classify multifactorial diseases as inherited disorders? If not (i.e. if they are classified as susceptibilities), then we can use "inherited genetic disorder" synonymously with 'inherited genetic disease'.

ValWood commented 4 years ago

But OK, I guess the majority of 'monogenic diseases' will have modifiers, which might make the classification difficult in some cases. At least we can get at the mendelian disease forms....

mellybelly commented 4 years ago

The problem is that without some auto-classification, this will never be kept up to date. We need a model where we can classify diseases as monogenic, etc., based on the number of genes annotated and/or a known inheritance pattern. It sounds like we need a plan for how our handling of this should evolve. Please see the above linked ticket for some discussion. @pnrobinson

mellybelly commented 4 years ago

probably we should have a comment on 'monogenic disease' that it should not be used for annotation directly?

pnrobinson commented 4 years ago

@mellybelly A disease does not cease to be monogenic if some new associated gene is discovered. This makes the disease genetically heterogeneous. I am not convinced that there is a major maintenance problem that goes beyond the general issue of correcting annotations if some mistake is corrected in the literature or in an upstream database -- this problem exists just as much for an inference based system!

mellybelly commented 4 years ago

I agree with you. I am not sure of the right solution, but I think the way things are now is not ideal ;-). Can we come up with a shorter and longer term plan? Maybe this is more on the reporting side than the modeling or annotation/inference side. That might be a safer plan.

nicolevasilevsky commented 4 years ago

I obsoleted 'monogenic disease' on a Pull Request which is currently pending review.

The obsoleted class says to consider 'inherited genetic disease'.

pnrobinson commented 4 years ago

'inherited genetic disease' is not an exact synonym of 'monogenic disease'

nicolevasilevsky commented 4 years ago

I didn't add it as a synonym, just annotated 'consider' 'inherited genetic disease' on the obsoleted 'monogenic disease' class.

ValWood commented 4 years ago

From my limited experience, I wasn't convinced there is a major maintenance problem. The vast majority of diseases that currently refer to a single disease gene in the definition will continue to be monogenic because they will still be "attributable to genetic variants with large effects on disease status". New genes involved in the disease in these cases will presumably only be 'disease modifiers'? So, once a disease is classified as monogenic following these criteria, it should very rarely change this designation? So this classification should be very robust? (Because we already know "large effects on disease status" is true, and this will remain the same.)

i.e. new truly disease-causing genes will not prevent a currently monogenic disorder being monogenic.

pnrobinson commented 4 years ago

@mellybelly @cmungall What is "attributable to genetic variants with large effects on disease status"? Can we discuss the attributes before we implement? I understand the intent but it would be good to discuss things -- this definition is not optimal.

maglott commented 4 years ago

I would also like clarification of the policy for labels with 'inherited' in the title vs. those without.

ValWood commented 4 years ago

Note that this isn't in an existing definition. I quoted it because that's my understanding of the scope of "monogenic" if it was retained. I don't know what an ideal definition would be, but this is a phrase I see when monogenic is discussed. It doesn't sound very precise...

cmungall commented 4 years ago

Let's try and separate these issues. @maglott that's an excellent point about inherited + de-novos, can you make a separate ticket for that @nicolevasilevsky?

The central questions in this ticket are:

  1. does it make sense to differentiate between monogenic and digenic via bucket classes in the ontology?
  2. if so, how do we ensure the ontology stays up to date with current findings?

On 2: I don't think this is easy to tell from existing gene-to-disease resources. It can be hard to tell computationally whether cardinality>1 is to be interpreted as can-be-caused-by-either-of or as digenic.
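To illustrate the ambiguity, here is a minimal sketch that assumes gene-to-disease associations arrive as simple (disease, gene) pairs. The identifiers `EX:1`, `EX:2`, `GENE_A`, etc. and the function names are hypothetical placeholders, not real Mondo terms or an existing pipeline.

```python
from collections import defaultdict


def genes_per_disease(associations):
    """Group (disease_id, gene) pairs into a dict of disease_id -> set of genes."""
    grouped = defaultdict(set)
    for disease_id, gene in associations:
        grouped[disease_id].add(gene)
    return dict(grouped)


def triage(genes):
    """Label a disease by gene cardinality alone; this cannot settle the question."""
    if len(genes) == 1:
        return "single gene listed"
    # Cardinality > 1 does not say whether any one gene suffices
    # (either-of) or whether variants in several genes are required (digenic).
    return "needs curation: either-of or digenic"


# Hypothetical placeholder data, not real annotations.
associations = [
    ("EX:1", "GENE_A"),
    ("EX:2", "GENE_B"),
    ("EX:2", "GENE_C"),
]

report = {d: triage(g) for d, g in genes_per_disease(associations).items()}
for disease_id, status in sorted(report.items()):
    print(disease_id, status)
```

Counting genes can only separate the unambiguous cases from the ones that need a curator. Even "single gene listed" reflects only what the source resource happens to record.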

I am convinced that as a group we can solve this problem, but I don't think this will happen overnight. I am concerned that in the short term we don't put out misleading incomplete information, or incorrect information. If you put a term in an ontology, you are committed to doing a good job of fully populating it, within reason, otherwise users will be confused.

It seems safer to roll up to whatever we call the grouping class of mono/di for now, and return to this later. But I may be overestimating the curation effort

pnrobinson commented 4 years ago

@cmungall I am not sure but it seems there is a misunderstanding on how to define a disease as monogenic or digenic -- the approach sketched above seems incorrect to me. It would be easier to clear this up in a zoom.

nicolevasilevsky commented 4 years ago

> @maglott that's an excellent point about inherited + de-novos, can you make a separate ticket for that @nicolevasilevsky?

done: https://github.com/monarch-initiative/mondo/issues/1643

nicolevasilevsky commented 4 years ago

@pnrobinson we have weekly Mondo meetings on Fridays at 9am PT/12pm ET. Would you be able to join next Friday? Or we could discuss on a Monarch huddle?

cc @monicacecilia @jmcmurry

nicolevasilevsky commented 4 years ago

@ValWood you are welcome to join the Mondo calls anytime too!

pnrobinson commented 4 years ago

@nicolevasilevsky timing has been challenging recently. Hopefully yes in the future!

ValWood commented 4 years ago

> @ValWood you are welcome to join the Mondo calls anytime too!

I think I would be surplus, but if ever there is a set of my tickets that needs clarification I would be happy to join.