ValWood closed this issue 3 years ago
See also #722
While subclassing into mono/poly is perfectly logical, it's not clear we can commit to annotating at the desired level.
@ValWood do you specifically want monogenic, or is a more general Mendelian concept encompassing digenic fine?
> general mendelian concept encompassing digenic fine?
Well, I think this is OK and we can include both, but if the definition already clearly states that a single gene variant is disease-causing (as in the above), why not instantiate monogenic?
Monogenic is clearly more useful for people working on model yeast because it helps to argue that yeast can be a good model for the mechanism if the gene isn't only contributing to the disease phenotype. I can really promote the monogenic ones. (But I expected most of the ~850 that we map to be monogenic: everything except the cancer ones, and even a subset of those.)
It should be largely inferrable from your end based on definitions?
I'm not clear what the action item is here, @cmungall
I'm a numpty. I thought I had saved the list of DO monogenic diseases, but I saved the MONDO list.
Anyway, a lot of diseases should be easy to classify as monogenic, because a single disease-causing gene is listed in the definition?
All or most of the descendants of inborn disorder of purine metabolism (MONDO:0019236) for example.
All of these: https://www.pombase.org/term/MONDO:0037940
Most "deficiencies" are monogenic. CGL syndrome, anything that has "X-linked", "dominant", or "recessive" in its disease name?
I think we should have around 700 more associations to this term.....
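A minimal sketch of the heuristic suggested above, assuming we only have each disease's label and text definition to work with. The function names and regular expressions are hypothetical illustrations, not anything Mondo or PomBase actually uses, and a match should only be read as "candidate for curator review", since "most" deficiencies is not "all".

```python
import re

# Hypothetical hints taken from the comment above: label keywords that
# usually signal Mendelian inheritance. Not an actual Mondo rule set.
LABEL_HINT = re.compile(r"\b(x-linked|dominant|recessive|deficiency)\b", re.IGNORECASE)

# Matches phrasing such as "mutation in the LEMD2 gene" in a text definition.
GENE_IN_DEFINITION = re.compile(r"mutations? in the (\w+) gene", re.IGNORECASE)


def genes_in_definition(definition):
    """Return the set of gene symbols named as mutated in a definition."""
    return {symbol.upper() for symbol in GENE_IN_DEFINITION.findall(definition)}


def monogenic_candidate(label, definition=""):
    """Flag a disease as a *candidate* monogenic disease.

    True if the label carries an inheritance hint, or if the definition
    names exactly one mutated gene. This is a triage aid, not a classifier.
    """
    if LABEL_HINT.search(label):
        return True
    return len(genes_in_definition(definition)) == 1
```

Run over the labels and definitions of the mapped diseases, this would give a first-pass list to spot check rather than a final classification.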
@cmungall and the Mondo team discussed this and we decided to obsolete monogenic disease, and the terms that were previously under monogenic will be under 'inherited genetic disease'.
action item to me:
That's a shame. What's the reason? It seems really useful to differentiate between single-gene and polygenic. I guess it's just too woolly in some cases if you don't know that the disease is only present in a certain genetic background?
So there will be no way to distinguish genes which have a high propensity to cause a disease in multi-factorial diseases from monogenic? I would imagine a lot of users would find this useful, but maybe there is an alternative route.
The justification is that we don't have sufficient information to properly classify diseases under monogenic or digenic.
Related to #722 and https://github.com/monarch-initiative/mondo/pull/972
Looking at the 2 previous tickets. One describes it as "redundant" with inherited, but it isn't really.
Couldn't it be a parent of autosomal dominant, autosomal recessive and maternally inherited (you don't currently have maternally inherited)?
Would this be logically correct? Not everything would be classified initially, but would become classified over time as the recessivity and dominance was fleshed out.
Ah, OK. Do you ever classify multifactorial diseases as inherited disorders? If not (i.e. if they are classified as susceptibilities), then we can use "inherited genetic disorder" synonymously with 'inherited genetic disease'.
But OK, I guess the majority of 'monogenic diseases' will have modifiers, which might make the classification difficult in some cases. At least we can get at the Mendelian disease forms....
The problem is that without some auto-classification, this will never be kept up to date. We need a model where we can classify as monogenic etc. based on the number of genes annotated and/or a known inheritance pattern. It sounds like we need a plan for evolution of how to handle this. Please see the above linked ticket for some discussion. @pnrobinson
Probably we should have a comment on 'monogenic disease' that it should not be used for annotation directly?
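A rough sketch of what such an auto-classification rule could look like, assuming each disease record carries a set of annotated genes and an optional inheritance pattern. `DiseaseRecord`, `auto_classify`, and the returned category strings are all hypothetical; nothing here reflects an existing Mondo pipeline.

```python
from dataclasses import dataclass, field
from typing import Optional

# Assumed vocabulary of inheritance patterns treated as Mendelian.
MENDELIAN_PATTERNS = {"autosomal dominant", "autosomal recessive", "X-linked"}


@dataclass
class DiseaseRecord:
    label: str
    genes: set = field(default_factory=set)   # annotated gene symbols
    inheritance: Optional[str] = None         # known inheritance pattern, if any


def auto_classify(record):
    """Assign a provisional category from gene count and inheritance pattern."""
    if record.inheritance in MENDELIAN_PATTERNS:
        return "mendelian"
    if len(record.genes) == 1:
        return "monogenic candidate"
    if len(record.genes) > 1:
        # Gene count alone cannot say whether this is heterogeneity or digenic.
        return "needs curator review"
    return "unclassified"
```

The point of the "needs curator review" branch is that a count above one is deliberately not mapped to any class automatically.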
@mellybelly A disease does not cease to be monogenic if some new associated gene is discovered. This makes the disease genetically heterogeneous. I am not convinced that there is a major maintenance problem that goes beyond the general issue of correcting annotations if some mistake is corrected in the literature or in an upstream database -- this problem exists just as much for an inference based system!
I agree with you. I am not sure of the right solution, but I think the way things are now is not ideal ;-). Can we come up with a shorter- and longer-term plan? Maybe this is more on the reporting side than the modeling or annotation/inference side. That might be a safer plan.
I obsoleted 'monogenic disease' on a Pull Request which is currently pending review.
The obsoleted class says to consider 'inherited genetic disease'.
'inherited genetic disease' is not an exact synonym of 'monogenic disease'.
I didn't add it as a synonym, just annotated "consider 'inherited genetic disease'" on the obsoleted 'monogenic disease' class.
From my limited experience, I wasn't convinced there is a major maintenance problem. The vast majority of diseases that currently refer to a single disease gene in the definition will continue to be monogenic because they will still be "attributable to genetic variants with large effects on disease status". New genes involved in the disease in these cases will presumably only be 'disease modifiers'? So, once a disease is classified as monogenic following these criteria, it should very rarely change this designation, and the classification should be very robust (because we already know "large effects on disease status" is true, and this will remain the same).
i.e. new truly disease-causing genes will not prevent a currently monogenic disorder from being monogenic.
@mellybelly @cmungall What is "attributable to genetic variants with large effects on disease status"? Can we discuss the attributes before we implement? I understand the intent but it would be good to discuss things -- this definition is not optimal.
I would also like clarification of the policy for labels with 'inherited' in the title vs. those without.
Note that this isn't in an existing definition. I quoted it because that's my understanding of the scope of "monogenic" if it was retained. I don't know what an ideal definition would be, but this is a phrase I see when monogenic is discussed. It doesn't sound very precise...
Let's try and separate these issues. @maglott that's an excellent point about inherited + de-novos, can you make a separate ticket for that @nicolevasilevsky?
The central question in this ticket is:
On 2: I don't think this is easy to tell from existing gene-to-disease resources. It can be hard to tell computationally whether cardinality > 1 should be interpreted as "can be caused by either of" or as digenic.
I am convinced that as a group we can solve this problem, but I don't think this will happen overnight. I am concerned that in the short term we don't put out misleading incomplete information, or incorrect information. If you put a term in an ontology, you are committed to doing a good job of fully populating it, within reason, otherwise users will be confused.
It seems safer to roll up to whatever we call the grouping class of mono/di for now, and return to this later. But I may be overestimating the curation effort.
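To make the cardinality > 1 ambiguity concrete, here is a small sketch assuming a gene-to-disease association could carry an explicit flag saying how multiple genes combine. `GeneLogic`, `GeneSet`, and the category strings are hypothetical names for illustration; existing resources generally do not expose such a flag, which is exactly the problem described above.

```python
from dataclasses import dataclass
from enum import Enum


class GeneLogic(Enum):
    ANY_OF = "any_of"    # variants in any one gene suffice (genetic heterogeneity)
    ALL_OF = "all_of"    # variants in all listed genes are required (digenic etc.)
    UNKNOWN = "unknown"  # the resource does not say


@dataclass(frozen=True)
class GeneSet:
    genes: frozenset
    logic: GeneLogic = GeneLogic.UNKNOWN


def genetic_architecture(gene_set):
    """Interpret a gene set; roll up to a grouping class when ambiguous."""
    count = len(gene_set.genes)
    if count == 0:
        return "unknown"
    if count == 1:
        return "monogenic"
    if gene_set.logic is GeneLogic.ANY_OF:
        return "monogenic (genetically heterogeneous)"
    if gene_set.logic is GeneLogic.ALL_OF:
        return "digenic" if count == 2 else "oligogenic"
    # Without the flag, two genes could mean either of the cases above.
    return "mono/di grouping class"
```

With only gene counts available, every multi-gene disease falls through to the last branch, which matches the "roll up for now" suggestion.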
@cmungall I am not sure but it seems there is a misunderstanding on how to define a disease as monogenic or digenic -- the approach sketched above seems incorrect to me. It would be easier to clear this up in a zoom.
> @maglott that's an excellent point about inherited + de-novos, can you make a separate ticket for that @nicolevasilevsky?
done: https://github.com/monarch-initiative/mondo/issues/1643
@pnrobinson we have weekly Mondo meetings on Fridays at 9am PT/12pm ET. Would you be able to join next Friday? Or we could discuss on a Monarch huddle?
cc @monicacecilia @jmcmurry
@ValWood you are welcome to join the Mondo calls anytime too!
@nicolevasilevsky timing has been challenging recently. Hopefully yes in the future!
> @ValWood you are welcome to join the Mondo calls anytime too!
I think I would be surplus, but if ever there is a set of my tickets that needs clarification I would be happy to join.
The following 210 genes used to be annotated to "monogenic disease" https://www.pombase.org/results/from/id/707b3611-0b71-4ebd-9ee1-9326e44d76c6
Possibly not all are monogenic, but from spot checking most do appear to be.
For example:
SPAC31A2.05c | mis4 | cohesin loading factor (adherin) Mis4/Scc2
SPBC776.13 | cnd1 | condensin complex non-SMC subunit Cnd1
SPCC306.03c | cnd2 | condensin complex non-SMC subunit Cnd2
All have causative mutations for Cornelia de Lange syndrome.

ark1 | aurora-B kinase Ark1
spermatogenic failure 5 (DOID:0070183): has_material_basis_in mutation in the AURKC gene on chromosome 19q13.

SPAC5D6.06c | alg14 | UDP-GlcNAc transferase associated protein Alg14
congenital myasthenic syndrome 15: compound heterozygous mutation in the ALG14 gene on chromosome 1p21.

SPAC18G6.10 | lem2 | LEM domain nuclear inner membrane protein Heh1/Lem2
cataract 46 juvenile-onset (DOID:0110243): A cataract that has_material_basis_in homozygous mutation in the LEMD2 gene on chromosome 6p21.