monarch-initiative / dipper

Data Ingestion Pipeline for Monarch
https://dipper.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Decide on modelling mim2gene association with RO #504

Open drseb opened 7 years ago

drseb commented 7 years ago

Hi,

this ticket is to discuss how to model gene-phenotype information provided by the HPO project using RO. The goal is to have this consistent with other monarch data.

Background

I would like to add this to the gene-phenotype data, which is inferred through the gene-disease data (see "What is your procedure for associating genes with HPO phenotypes?"). Currently, each line of the file only lists which gene is associated with which HPO term:

8195 MKKS Postaxial hand polydactyly HP:0001162

It is essential for me and others to have information on the strength of the association. So instead of putting only the strings (e.g. 'susceptibility'), I will put the string plus the RO ID. Something like:

8195 MKKS Postaxial hand polydactyly HP:0001162 susceptibility (RO:1234567);qtl (RO:2345678)
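For illustration, a row in the proposed format could be parsed roughly as follows. This is only a sketch: it assumes the columns are tab-separated (the thread does not state the delimiter), the names `GenePhenotypeRow` and `parse_row` are hypothetical, and `RO:1234567` / `RO:2345678` are the placeholder IDs from the example above, not real RO terms.

```python
import re
from typing import List, NamedTuple, Tuple

# One association entry looks like "susceptibility (RO:1234567)".
_ASSOC = re.compile(r"^\s*(?P<label>.+?)\s*\((?P<ro>RO:\d{7})\)\s*$")


class GenePhenotypeRow(NamedTuple):
    """Hypothetical container for one line of the gene-phenotype file."""
    gene_id: str
    gene_symbol: str
    hpo_label: str
    hpo_id: str
    associations: List[Tuple[str, str]]  # (label, RO CURIE) pairs


def parse_row(line: str) -> GenePhenotypeRow:
    """Parse one row; assumes tab-separated columns (an assumption)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 5:
        raise ValueError(f"expected 5 columns, got {len(fields)}")
    gene_id, gene_symbol, hpo_label, hpo_id, assoc_field = fields

    associations = []
    # Multiple association types are separated by ';' in the proposal.
    for entry in assoc_field.split(";"):
        match = _ASSOC.match(entry)
        if match is None:
            raise ValueError(f"cannot parse association entry: {entry!r}")
        associations.append((match.group("label"), match.group("ro")))

    return GenePhenotypeRow(gene_id, gene_symbol, hpo_label, hpo_id, associations)


# Placeholder RO IDs, copied from the example in this ticket.
row = parse_row(
    "8195\tMKKS\tPostaxial hand polydactyly\tHP:0001162\t"
    "susceptibility (RO:1234567);qtl (RO:2345678)"
)
```

Keeping the label next to the RO ID means the file stays readable for people while remaining unambiguous for software.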

drseb commented 7 years ago

The information that is available in this column is described in the file ftp://ftp.ncbi.nlm.nih.gov/gene/DATA/README

drseb commented 7 years ago
mbrush commented 7 years ago

Does 'modifier' mean that it affects the severity/presentation of the condition, but does not directly cause it? If so, consider http://purl.obolibrary.org/obo/RO_0003305 (contributes to severity of condition)

Def = a relationship between an entity (e.g. a genotype, genetic variation, chemical, or environmental exposure) and a condition (a phenotype or disease), where the entity influences the severity with which a condition manifests in an individual.

If the definition of this term is not quite right, we can tweak it (if the adjustment is minor), or create a new term (e.g. 'modifies condition'). But I suspect that http://purl.obolibrary.org/obo/RO_0003308 ('correlated with condition') is too general a term for what you want here.

Def = a relationship between an entity and a condition (phenotype or disease) with which it exhibits a statistical dependence relationship.

mbrush commented 7 years ago

For susceptibility, I think http://purl.obolibrary.org/obo/RO_0003306 ("contributes to frequency of condition") is the closest existing relation.

Def = a relationship between an entity (e.g. a genotype, genetic variation, chemical, or environmental exposure) and a condition (a phenotype or disease), where the entity influences the frequency of the condition in a population.

This seems close, in that increased susceptibility would lead to an increased frequency of the disease. But this term is defined for a population, not in terms of the likelihood of a single individual getting the condition. Perhaps we need a new term (e.g. 'influences/increases/decreases condition susceptibility') more precisely defined to fit the requirement here? I am happy to implement it once a definition is agreed upon.
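To make the candidates discussed so far concrete, here is a minimal sketch of a lookup from mim2gene-style labels to the RO relations suggested in this thread. The mapping is provisional and not an agreed decision; the names `CANDIDATE_RELATIONS`, `curie_to_purl`, and `candidate_relation_iri` are hypothetical. Labels with no suggested relation yet (e.g. 'qtl') return `None` rather than a guess.

```python
from typing import Optional

# Candidate relations proposed in this thread; provisional, not agreed upon.
CANDIDATE_RELATIONS = {
    "modifier": "RO:0003305",        # contributes to severity of condition
    "susceptibility": "RO:0003306",  # contributes to frequency of condition
}

OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"


def curie_to_purl(curie: str) -> str:
    """Expand an OBO CURIE such as 'RO:0003305' to its PURL form."""
    prefix, local_id = curie.split(":", 1)
    return f"{OBO_PURL_BASE}{prefix}_{local_id}"


def candidate_relation_iri(label: str) -> Optional[str]:
    """Return the candidate relation IRI for a label, or None if undecided."""
    curie = CANDIDATE_RELATIONS.get(label.strip().lower())
    return curie_to_purl(curie) if curie is not None else None
```

If a new, more specific susceptibility relation is created, only the dictionary entry would need to change.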

cmungall commented 6 years ago

Apologies this ticket has been sitting here so long. Perhaps the 3 of us could do a quick call about this?

As an overall strategy, I would tend towards: rather than trying to reuse an existing relation that doesn't quite fit, define a new one that is as specific and unambiguous as it needs to be.

This is what we did to map some of the orphanet relations: http://purl.obolibrary.org/obo/RO_0004010

pnrobinson commented 6 years ago

A call would be good. Note that mutations in Mendelian disease genes also "increase susceptibility" (it is a very large increase), but we probably do not want to structure the hierarchy this way. There is an important spectrum ranging from BRCA mutations to truly polygenic diseases where a variant can have an odds ratio of 1.0001 or whatever. We do not want to model the precise numbers, but we might want to consider having a few categories.

mbrush commented 6 years ago

Hi all. Had a quick chat with Chris today and we'd like to find a block of 2-3 hours for interested parties to tackle various issues related to G2P/D modeling across Monarch, HPO, and Translator in a comprehensive and coordinated way. These efforts are all dealing with some of the same things right now, and we want to be sure we all have a shared understanding of key issues and requirements, so we can create the right solution and move forward together.

Several of us are travelling over the next month, so this session likely won't happen until the second half of October. In the meantime, I'll work with Kent to start cataloging some of the issues in the Monarch data, so we can present on these, and to document the various approaches to G2P relations taken in different knowledge sources and ontologies (RO, ORDO, BLM, Wikidata, etc). I can add these to Chris's Google doc if that makes sense. Then we can find a time for a longer session to address these issues next month. Sound good? @cmungall @kshefchek @pnrobinson @drseb

pnrobinson commented 6 years ago

Hi Matt -- I agree this is an extremely important topic.

I would add that it is related to the strategies for MONDO.

Is our current strategy reflected in the (ca. 10) relations that MONDO defines? If so, I would also suggest that we provide detailed explanations and examples for this prior to the MONDO meeting. This will partially overlap with the G2P/D issues!

-Peter


