DiseaseOntology / HumanDiseaseOntology

Repository for the Human Disease Ontology.
Creative Commons Zero v1.0 Universal

review handling of OMIM susceptibility terms #711

Closed sbello closed 4 years ago

sbello commented 5 years ago

Following from our conversation with Ada and Joanna on 6/10/19, we need to review our handling of the OMIM susceptibility terms to determine which ones should be converted to 'disease #'.

We may want to consider how to properly handle the gene to disease relationship to make it clear that these are cases where a mutation in gene X does not always lead to development of the disease.

sbello commented 5 years ago

The example we discussed with Ada and Joanna was BRCA1 and BRCA2 and breast-ovarian cancer. The OMIM record 604370 is labeled 'BREAST-OVARIAN CANCER, FAMILIAL, SUSCEPTIBILITY TO, 1'; the phenotype to gene relationship is '{Breast-ovarian cancer, familial, 1}'. Ada thought that the way the DO currently handles this, by creating a 'contributes to condition' relationship to OMIM 604370, is incorrect. The OMIM record represents the disease developed by patients; it is the connection to the gene that is a susceptibility factor, and this is represented by the {} in the phenotype to gene relationship.

I think what we would want to do is create a DO term for 'breast-ovarian cancer 1' (synonym BROVCA1) that represents the disease that is developed by some patients with a mutation in BRCA1. We would want to use a different relationship between the gene and the disease in DO to capture that the mutation in BRCA1 makes it more likely that the patient may develop breast cancer, but does not mean that the BRCA1 mutation is the only factor underlying the development of the cancer in these patients; additional factors are required.

Ada did not like the term 'contributes to condition' (http://purl.obolibrary.org/obo/RO_0003304), but I'm not certain if that was an issue of semantics rather than what the term means in the RO.

There is also the child term 'contributes to frequency of condition' (http://purl.obolibrary.org/obo/RO_0003306) that might be appropriate. Definition: A relationship between an entity (e.g. a genotype, genetic variation, chemical, or environmental exposure) and a condition (a phenotype or disease), where the entity influences the frequency of the condition in a population.

@lschriml Do you have an RO term you plan to use for the environmental risk factors?

sbello commented 5 years ago

Another example is OMIM:606874 'HIRSCHSPRUNG DISEASE, SUSCEPTIBILITY TO, 6; HSCR6'. The gene-phenotype relationship in this case is '{Hirschsprung disease, susceptibility to, 6}'.

This is currently associated to a locus not a gene.

Unlike for BRCA1, or cancer in general, where having an in-born first hit increases the probability that a patient will develop cancer, for Hirschsprung disease I think OMIM is using susceptibility to represent genes or loci associated with the development of a complex disease. See the section:

'Isolated HSCR appears to be of complex nonmendelian inheritance with low sex-dependent penetrance and variable expression according to the length of the aganglionic segment, suggestive of the involvement of one or more genes with low penetrance (Amiel et al., 2008).' in the OMIM entry.

See also https://www.omim.org/entry/142623 for more details. The reference in both OMIM records is https://www.ncbi.nlm.nih.gov/pubmed/17965226. In this case we may not want to create children of Hirschsprung disease for each of the 'susceptibility' loci. It may be better to create relations between the genes/loci and the generic Hirschsprung disease entry in DO. One complication is that many of the susceptibility entries in OMIM are loci, not genes, and do not have an independent entry in OMIM.

Possibly a way forward is to continue the current approach of linking an OMIM susceptibility entry to a DO term using 'contributes to condition' for complex diseases, but switch to creating independent disease entries when the gene-phenotype relationship in the OMIM susceptibility entry represents an increase in the probability that a patient will develop a disease (mostly cancers)?

sbello commented 5 years ago

Thinking about this another way we have (so far) 2 types of susceptibility entries:

  1. Disease commonly has, at least in part, a non-genetic factor, and the genetic factor in the OMIM record increases the sensitivity of the patient to the non-genetic factor

  2. Disease does not commonly have a non-genetic factor but is the result of multiple genetic factors

I suspect we will also have a third class that is genes/loci of unclear significance.

lschriml commented 5 years ago

Reading through the various RO options, I think we should stick with 'contributes to condition' for both the gene/genetic and environmental risk factors.

We want to model:

environmental exposure 'contributes to condition' some 'DO disease'
or
genetic risk factor 'contributes to condition' some 'DO disease'

We can use property restrictions: some, only & min, exactly, max.

To specify the genetic risk factor(s): We can follow the model we put in place for complex diseases: 'loss_of_function_variant'

'contributes to condition' exactly 1 (loss_of_function_variant and 'located in' exactly 1 gene)

--> I had not planned on adding in the more specific gene names, such as:

'contributes to condition' exactly 1 (loss_of_function_variant and 'located in' exactly 1 CFTR gene)

For complex diseases, we have been modeling the genetic risk factors: (see Prader Willi) --> when the causal relationship has been determined:

has material basis in min 1 (loss_of_function_variant and maternal_uniparental_disomy or chromosomal_deletion and loss_of_function_variant and paternal_variant or chromosomal_translocation and loss_of_function_variant or loss_of_function_variant and located in min 1 gene)

Or if only one genetic variant is involved: 'has material basis in' exactly 1 (loss_of_function_variant and 'located in' exactly 1 gene)
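
As a rough illustration of the single-variant pattern above, the restriction can be assembled as a Manchester-syntax string. This is only a minimal Python sketch: the function name `variant_restriction` and its parameters are hypothetical and are not part of the DO or ROBOT tooling.

```python
def variant_restriction(relation: str,
                        variant: str = "loss_of_function_variant",
                        gene: str = "gene") -> str:
    """Build the 'exactly 1 variant located in exactly 1 gene' expression.

    relation: the object property label, e.g. 'contributes to condition'
              for a risk factor or 'has material basis in' for a cause.
    gene:     generic 'gene' by default; a specific gene class is optional.
    """
    return (f"'{relation}' exactly 1 "
            f"({variant} and 'located in' exactly 1 {gene})")


# Risk-factor form and causal form of the same pattern.
print(variant_restriction("contributes to condition"))
print(variant_restriction("has material basis in"))
```

Only the relation label changes between the risk-factor and causal forms, which is why it is the one required argument in this sketch.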

We have this term in the DO: DOID:5683 hereditary breast ovarian cancer syndrome. See NCI: https://ncit.nci.nih.gov/ncitbrowser/ConceptReport.jsp?dictionary=NCI_Thesaurus&ns=ncit&code=C8493

and "Hereditary breast and ovarian cancer syndrome (HBOC) is an adult-onset, cancer predisposition syndrome." from: https://www.jax.org/education-and-learning/clinical-and-continuing-education/cancer-resources/hereditary-breast-and-ovarian-cancer-syndrome-factsheet

We can add the OMIM IDs as xrefs to this DOID:

OMIM:604370 and synonym: 'familial breast-ovarian cancer 1'
OMIM:613399 and synonym: 'familial breast-ovarian cancer 2'
OMIM:612555 and synonym: 'familial breast-ovarian cancer 3'
OMIM:614291 and synonym: 'familial breast-ovarian cancer 4'

--> I would like to avoid adding in the OMIM acronyms, as this causes parsing/overlapping content errors during releases.

Cheers, Lynn

lschriml commented 5 years ago

For the susceptibility terms, in general, we can treat the different kinds as 'defined phenotype sets' and create xrefs to the DO term. Thus we would not encode the types of susceptibility entries.

For Hirschsprung disease --> based on Genetics Home Reference, https://ghr.nlm.nih.gov/condition/hirschsprung-disease --> we can create two subtypes: short-segment Hirschsprung disease and long-segment Hirschsprung disease.

There are two main types of Hirschsprung disease, known as short-segment disease and long-segment disease, which are defined by the region of the intestine lacking nerve cells. In short-segment disease, nerve cells are missing from only the last segment of the large intestine. This type is most common, occurring in approximately 80 percent of people with Hirschsprung disease.

Cheers, Lynn

sbello commented 5 years ago

The problem with not entering acronyms is that users often use these to refer to the disease and will want to search by these.

sbello commented 5 years ago

The 2008 reference (https://www.ncbi.nlm.nih.gov/pubmed/17965226) mentions the long and short form classification but also mentions what I think is a second classification scheme, described as "Four HSCR variants have been reported: (1) total colonic aganglionosis (TCA, 3–8% of cases) [17]; (2) total intestinal HSCR when the whole bowel is involved [17]; (3) ultra-short segment HSCR involving the distal rectum below the pelvic floor and the anus [18]; (4) suspended HSCR, a controversial condition, where a portion of the colon is aganglionic above a normal distal segment." However, I don't see much followup for the second scheme in other papers. The long/short split is very commonly mentioned.

See https://www.ncbi.nlm.nih.gov/pubmed/22174542 for a more recent update on the genetics of HSCR. There is also an interesting 2011 paper on gene-environment interactions in HSCR: https://www.ncbi.nlm.nih.gov/pubmed/26997034

sbello commented 5 years ago

@lschriml I've changed the glioma susceptibility series to the new pattern. Please take a look and make sure that you are happy with how this is working in practice. The change means that the DO term malignant glioma (DOID:3070) now has multiple subclass axioms for each of the susceptibility terms.

lschriml commented 5 years ago

For malignant glioma, --> I think the plan is to move the OMIM Phenotype records, with 'susceptibility names' out of the omim_susceptibility import. --> to make these new DO disease terms, as subtypes -- removing the 'susceptibility' part of the disease name --> we would be defining new DO terms, rather than adding 'SubClassOf axioms'

so: malignant glioma

glioma 1.
A malignant glioma that has_material_basis_in a mutation in the TP53 gene on chromosome 17p13. dbxref: OMIM:137800

glioma 2.
A malignant glioma that has_material_basis_in a mutation in the PTEN gene on chromosome 10q23. dbxref: OMIM:613028
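
The two definitions above follow one template, which could be sketched roughly as below. This is an illustrative Python helper only; `subtype_definition` and its argument names are hypothetical, and the actual definitions would be curated by hand or via the ROBOT template.

```python
def subtype_definition(parent: str, gene: str, locus: str) -> str:
    """Fill the 'parent that has_material_basis_in a mutation' template."""
    return (f"A {parent} that has_material_basis_in a mutation in the "
            f"{gene} gene on chromosome {locus}.")


# Values taken from the two glioma examples in this comment.
for gene, locus in [("TP53", "17p13"), ("PTEN", "10q23")]:
    print(subtype_definition("malignant glioma", gene, locus))
```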

Cheers, Lynn ...

sbello commented 5 years ago

Ah, okay. That was not what I thought we had decided to do, glad I checked :) It makes sense to me for the glioma ones, but then I have a question about the definition. Do we plan to modify the definition format for these cases? I don't know that we would want to use 'has material basis in' for these. For example, OMIM:137800 initially talks about gliomas in the context of Li-Fraumeni syndrome-1, and OMIM:613028 talks about glioma in the context of a tumor predisposition syndrome caused by mutation in PTEN.

Looking at the notes from our 6/27/19 call we had said:

Genetic mutations: 1 hit (OMIM record for phenotype)

  → treat as a risk factor 

BRCA1 - example

hereditary breast ovarian cancer syndrome DOID:5683

Logical definitions: SubClassOf: ‘Contributes to condition’ (risk factor) ‘OMIM term - name’

OMIM Import: ID and name

ROBOT: can update import when name changes for ID, check for name changes

4 phenotype clusters: OMIM IDs as SubClassOf logical defs to this DOID:

OMIM:604370 and synonym: 'familial breast-ovarian cancer 1'
OMIM:613399 and synonym: 'familial breast-ovarian cancer 2'
OMIM:612555 and synonym: 'familial breast-ovarian cancer 3'
OMIM:614291 and synonym: 'familial breast-ovarian cancer 4'
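
The "check for name changes" step in the notes above could look something like the following. This is a hedged Python sketch, not ROBOT functionality: `changed_labels` is a hypothetical helper, and the two dictionaries stand in for ID-to-name tables exported from the old and new versions of the OMIM import.

```python
def changed_labels(old: dict, new: dict) -> dict:
    """Return {id: (old_name, new_name)} for IDs whose name differs.

    Only IDs present in both versions are compared; added or removed
    IDs would need a separate report.
    """
    return {
        omim_id: (old[omim_id], new[omim_id])
        for omim_id in sorted(old.keys() & new.keys())
        if old[omim_id] != new[omim_id]
    }


# Illustrative data only: the renamed label is invented for the example.
previous = {"OMIM:604370": "BREAST-OVARIAN CANCER, FAMILIAL, SUSCEPTIBILITY TO, 1"}
current = {"OMIM:604370": "BREAST-OVARIAN CANCER, FAMILIAL, 1"}
print(changed_labels(previous, current))
```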

Would we want to create DO subtype terms for this example as well, instead of using the subclass axiom plan?

sbello commented 5 years ago

This would also apply to my Alzheimer's ticket #736

lschriml commented 5 years ago

Good we are discussing this. Quite complicated. I see your point about using 'has material basis in' for the definition.

When we want to make the connection to a genetic risk factor, I think we should use: 'contributes to condition' OMIM term, where the OMIM is a gene record

However, for OMIM phenotypes, I think we can proceed with the usual plan for OMIM phenotypes, using 'has material basis in' .... mutation

--> and remove the 'susceptibility' part of the name.

This would then result in the OMIM IDs being refs directly on the DO record.
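
Removing the 'susceptibility' part of an OMIM title could be sketched as below. This is a minimal Python illustration under the assumption that titles follow the patterns quoted in this thread; the function `strip_susceptibility` is hypothetical, and its output would still need curator review.

```python
import re

# Matches "SUSCEPTIBILITY" or "SUSCEPTIBILITY TO" plus adjacent commas.
_SUSCEPTIBILITY = re.compile(r",?\s*susceptibility(?:\s+to)?,?", re.IGNORECASE)


def strip_susceptibility(omim_title: str) -> str:
    """Derive a candidate DO name from an OMIM susceptibility title."""
    # Drop the trailing OMIM symbol, e.g. "; HSCR6".
    name = omim_title.split(";")[0]
    # Remove the susceptibility wording wherever it sits in the title.
    name = _SUSCEPTIBILITY.sub("", name)
    # Lowercase and collapse any leftover runs of whitespace.
    return " ".join(name.lower().split())


print(strip_susceptibility("HIRSCHSPRUNG DISEASE, SUSCEPTIBILITY TO, 6; HSCR6"))
print(strip_susceptibility("BREAST-OVARIAN CANCER, FAMILIAL, SUSCEPTIBILITY TO, 1"))
```

The second title comes out as 'breast-ovarian cancer, familial 1', so inverted OMIM names would still have to be reordered by hand.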

Cheers, Lynn

-- Lynn M. Schriml, Ph.D. Associate Professor

Institute for Genome Sciences University of Maryland School of Medicine Department of Epidemiology and Public Health 670 W. Baltimore St., HSFIII, Room 3061 Baltimore, MD 21201 P: 410-706-6776 | F: 410-706-6756 lschriml@som.umaryland.edu

sbello commented 5 years ago

I'm going to work through adding new glioma terms in a new worksheet OMIM_susceptibility_update on the MGIROBOTtemplate_2019 workbook. I would like to discuss these and maybe another example during our next call just to make sure I really understand. Thanks, Sue

lschriml commented 5 years ago

Great idea !!

sbello commented 5 years ago

I've worked through glioma and psoriasis as 2 examples of different types of susceptibility records. Note the glioma ones are all 'glioma susceptibility #' while the psoriasis ones are 'psoriasis #, susceptibility to'. We may want to think about handling these groups differently.

Glioma terms:

  1. These are split between glioma in the context of a tumor susceptibility syndrome and records for low risk loci that increase risk for glioma when co-inherited (highlighted in yellow in the spreadsheet).
  2. The OMIM records are currently associated with 'malignant glioma', but the records themselves are not specific to malignant gliomas, so I'm not sure we want to create 'malignant glioma #' DO terms for these. I'm inclined to say we don't want to create DO terms based off these OMIM records. We may even want to remove the 'contributes to condition' relationships for these. It would be helpful to have some cancer experts weigh in on this.

Psoriasis terms:

  1. With these I'm more comfortable creating DO subtype terms
  2. Most of these are regions and a few have a primary candidate gene that I tried to work into the definition. Please take a look at the definitions in the spreadsheet to let me know what you think.

Looking at this more, I'm wondering if we should try to draw a distinction between major/common loci and more minor loci. The review I posted below talks more about the multifactorial nature of psoriasis and how some susceptibility loci are only found in conjunction with other loci. I don't know that we really want to break these out into separate diseases.

I'm thinking that we may want to try to draw a line between 1) susceptibility loci that can either cause the disease on their own or in conjunction with environmental factors and 2) susceptibility loci that act primarily in conjunction with other genetic factors. Things that fall in group 1 would get a new DO term, things that fall in group 2 would not.
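
The proposed line between the two groups can be written down as a simple rule. This is only a sketch in Python of the idea in the paragraph above: the class, field names, and placeholder IDs are all hypothetical, and deciding which flags apply to a given locus remains a curation judgment.

```python
from dataclasses import dataclass


@dataclass
class SusceptibilityLocus:
    omim_id: str
    causes_alone: bool            # can cause the disease on its own
    acts_with_environment: bool   # acts in conjunction with environmental factors


def needs_new_do_term(locus: SusceptibilityLocus) -> bool:
    """Group 1 gets a new DO term; group 2 (other genetic factors only) does not."""
    return locus.causes_alone or locus.acts_with_environment


# Placeholder IDs, not real OMIM records.
group1 = SusceptibilityLocus("OMIM:0000001", causes_alone=False, acts_with_environment=True)
group2 = SusceptibilityLocus("OMIM:0000002", causes_alone=False, acts_with_environment=False)
print(needs_new_do_term(group1), needs_new_do_term(group2))
```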

lschriml commented 5 years ago

Let's plan a call for next week, to discuss in person. When are you available ?

Cheers, Lynn

sbello commented 5 years ago

Useful review for psoriasis https://www.ncbi.nlm.nih.gov/pubmed/31130981

sbello commented 5 years ago

It appears that many of the epilepsy-related susceptibility terms have been changed to standard disease terms in OMIM.

lschriml commented 5 years ago

Discussed. We determined that the malignant gliomas are not separate diseases; they will remain as susceptibility terms.

lschriml commented 4 years ago

Looks like the activity on this ticket is complete. @sbello Are there open issues ?