monarch-initiative / mckb

Monarch Cancer Knowledge Base

Improve modelling of drug, disease, variant relationship #9

Open kshefchek opened 9 years ago

kshefchek commented 9 years ago

The modelling discussed in #1 could be improved. As suggested by @nlwashington, here is another way to model these relationships:

    Monarch:DiseaseID is an OWL Class
    Monarch:DiseaseID rdfs:label "Adenocarcinoma"
    Monarch:DiseaseInstance is an individual of Monarch:DiseaseID
    Monarch:DiseaseInstance rdfs:label "Adenocarcinoma caused by variant MLH1 any mutation"

    Monarch:DiseaseInstance RO:caused_by Monarch:GenotypeID

    Monarch:DrugID Monarch:detrimental_effect Monarch:AssociationID

    Monarch:AssociationID dc:evidence Traceable Author Statement (ECO:0000033)
    Monarch:AssociationID dc:source PMID:20498393
    Monarch:AssociationID :hasSubject A Monarch:DiseaseInstance
    Monarch:AssociationID :hasPredicate RO:caused_by
    Monarch:AssociationID :hasObject Monarch:GenotypeID

Will update the code to generate the above given a row of data from CGD.
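A minimal sketch of what generating the above from one CGD row might look like. The row keys (`disease_id`, `disease_instance_id`, and so on), the `make_association_id` helper, and the use of plain tuples instead of an RDF library are all assumptions for illustration, not the actual mckb code.

```python
import hashlib

ECO_TAS = "ECO:0000033"  # Traceable Author Statement


def make_association_id(subject, predicate, obj):
    """Hypothetical helper: derive a stable association ID from the triple."""
    key = "|".join((subject, predicate, obj)).encode("utf-8")
    return "Monarch:assoc_" + hashlib.sha1(key).hexdigest()[:12]


def row_to_triples(row):
    """Turn one (assumed) CGD row into the triples listed above."""
    disease_instance = row["disease_instance_id"]
    genotype = row["genotype_id"]
    assoc = make_association_id(disease_instance, "RO:caused_by", genotype)
    return [
        # The disease instance is an individual of the disease class.
        (disease_instance, "rdf:type", row["disease_id"]),
        (disease_instance, "rdfs:label", row["disease_instance_label"]),
        (disease_instance, "RO:caused_by", genotype),
        # The drug points at the reified disease-genotype association.
        (row["drug_id"], "Monarch:detrimental_effect", assoc),
        (assoc, "dc:evidence", ECO_TAS),
        (assoc, "dc:source", row["pmid"]),
        (assoc, ":hasSubject", disease_instance),
        (assoc, ":hasPredicate", "RO:caused_by"),
        (assoc, ":hasObject", genotype),
    ]
```

Deriving the association ID from the triple itself means the same row always yields the same ID, which may make re-running the ingest idempotent; that is a design choice of this sketch only.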

kshefchek commented 9 years ago

Change to:

    Monarch:VariantID has_phenotype(RO:0002200) Monarch:DiseaseInstance

    A Monarch:DrugID has_relationship_to Monarch:DiseaseInstance

    A Monarch:AssociationID dc:evidence Traceable Author Statement (ECO:0000033)
    A Monarch:AssociationID dc:source PMID:20498393
    A Monarch:AssociationID :hasSubject A Monarch:DrugID
    A Monarch:AssociationID :hasPredicate has_relationship_to
    A Monarch:AssociationID :hasObject Monarch:DiseaseInstance

kshefchek commented 9 years ago

Reopening after discussion with @mellybelly. We also want to add linkages between variants and drugs. To do this, I have changed the above to:

    CGD:DiseaseInstance has_relationship_to CGD:Drug
    CGD:Variant has_relationship_to CGD:Drug

I then make two associations and hang the source and evidence off each.
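As a rough illustration of "two associations with source and evidence hung off each", here is a sketch using plain tuples. The `reify` function and the association IDs `CGD:Association1` and `CGD:Association2` are made up for this example; the evidence code and PMID are the ones used earlier in this thread.

```python
def reify(assoc_id, subject, predicate, obj, evidence, source):
    """Return the direct triple plus a reified association carrying provenance."""
    return [
        (subject, predicate, obj),
        (assoc_id, ":hasSubject", subject),
        (assoc_id, ":hasPredicate", predicate),
        (assoc_id, ":hasObject", obj),
        (assoc_id, "dc:evidence", evidence),
        (assoc_id, "dc:source", source),
    ]


EVIDENCE = "ECO:0000033"  # Traceable Author Statement
SOURCE = "PMID:20498393"

# One association per binary relationship, each with its own provenance.
triples = (
    reify("CGD:Association1", "CGD:DiseaseInstance", "has_relationship_to",
          "CGD:Drug", EVIDENCE, SOURCE)
    + reify("CGD:Association2", "CGD:Variant", "has_relationship_to",
            "CGD:Drug", EVIDENCE, SOURCE)
)

for triple in triples:
    print(" ".join(triple))
```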

Going to leave this ticket open for now for discussion.

micheldumontier commented 9 years ago

N-ary models naturally lend themselves to arbitrary extension. You might be interested in our paper on PGx modeling: http://bib.oxfordjournals.org/content/10/2/153.long

In the current model, where you reify a triple, why don't you annotate the association with the variant and/or the effect?

kshefchek commented 9 years ago

@micheldumontier are you suggesting linking the variant and disease, and then reifying this relationship and hanging the drug interaction off of this association? If you look at the first post in this thread that is how we originally designed it. @nlwashington is there a reason we added the binary relationship between a disease instance and a drug?

As a side note, directly linking a variant to a drug still feels a bit awkward; I may remove this unless we see a need to keep it.

micheldumontier commented 9 years ago

@kshefchek I think it comes down to what the main assertion is, and its context. So if a variant modulates the response to a drug used to treat a disease, then we might simply consider annotating the drug-disease association with the variant and functional effect (rather than annotating the drug or the disease directly).

nlwashington commented 9 years ago

@micheldumontier and @kshefchek yes this is what was going on with the diagramming on our call the other day. @cmungall was drafting up some figures and guidelines on this.

kshefchek commented 9 years ago

I could be lost, but is this what we're proposing?

    CGD:VariantID has_phenotype(RO:0002200) CGD:DiseaseInstance

    A CGD:AssociationID dc:evidence Traceable Author Statement (ECO:0000033)
    A CGD:AssociationID dc:source PMID:20498393
    A CGD:AssociationID has_response CGD:DrugID
    A CGD:AssociationID :hasSubject A CGD:VariantID
    A CGD:AssociationID :hasPredicate has_phenotype
    A CGD:AssociationID :hasObject CGD:DiseaseInstance

micheldumontier commented 9 years ago

I don't know this syntax, but I was thinking something like

    CGD:DrugID treats CGD:DiseaseInstance

    A CGD:AssociationID :hasSubject A CGD:DrugID
    A CGD:AssociationID :hasPredicate treats
    A CGD:AssociationID :hasObject CGD:DiseaseInstance

    A CGD:AssociationID dc:evidence Traceable Author Statement (ECO:0000033)
    A CGD:AssociationID dc:source PMID:20498393
    A CGD:AssociationID involves/has_context CGD:VariantID

    A CGD:AssociationID results_in/has_phenotype CGD:Effect
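One way to picture this n-ary style is as a record whose core is the drug-disease triple, with the variant and effect attached as annotations on the association itself. The `Association` class below is a hypothetical sketch, not code from mckb; it arbitrarily picks `has_context` and `has_phenotype` from the two pairs of alternative predicate names given above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class Association:
    """A reified subject-predicate-object statement with annotations."""
    assoc_id: str
    subject: str
    predicate: str
    obj: str
    evidence: str
    source: str
    context: List[str] = field(default_factory=list)  # e.g. variants
    effects: List[str] = field(default_factory=list)  # e.g. drug response

    def to_triples(self) -> List[Triple]:
        triples = [
            (self.subject, self.predicate, self.obj),
            (self.assoc_id, ":hasSubject", self.subject),
            (self.assoc_id, ":hasPredicate", self.predicate),
            (self.assoc_id, ":hasObject", self.obj),
            (self.assoc_id, "dc:evidence", self.evidence),
            (self.assoc_id, "dc:source", self.source),
        ]
        # Extending the model only adds annotations to the association;
        # the core drug-disease triple is untouched.
        triples += [(self.assoc_id, "has_context", v) for v in self.context]
        triples += [(self.assoc_id, "has_phenotype", e) for e in self.effects]
        return triples


assoc = Association(
    assoc_id="CGD:AssociationID",
    subject="CGD:DrugID",
    predicate="treats",
    obj="CGD:DiseaseInstance",
    evidence="ECO:0000033",
    source="PMID:20498393",
    context=["CGD:VariantID"],
    effects=["CGD:Effect"],
)
```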

pnrobinson commented 9 years ago

Sorry, what is CGD? http://en.wikipedia.org/wiki/CGD

Also, be careful with the bit about

    A CGD:AssociationID results_in/has_phenotype CGD:Effect

If we are talking about an association study, there are two problems: (i) there is obviously no deterministic effect of a variant on a disease, and (ii) some SNPs are associated with multiple diseases, and the relation "results_in" sounds to me like it should be functional.
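To make point (ii) concrete: a functional relation allows at most one object per subject, so a SNP associated with two diseases would violate it. The check below is only an illustrative sketch over plain tuples; the `SNP:1`, `Disease:A`, etc. identifiers are placeholders, not real data.

```python
from collections import defaultdict


def functional_violations(triples, predicate):
    """Return subjects with more than one distinct object for `predicate`."""
    objects = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            objects[s].add(o)
    return {s: sorted(objs) for s, objs in objects.items() if len(objs) > 1}


# Placeholder data: SNP:1 is associated with two diseases.
example = [
    ("SNP:1", "results_in", "Disease:A"),
    ("SNP:1", "results_in", "Disease:B"),
    ("SNP:2", "results_in", "Disease:A"),
]

violations = functional_violations(example, "results_in")
```

Here `violations` contains only `SNP:1`, since it is the one subject with two different objects for the relation.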

Another issue. Many people use the word "phenotype" to mean "disease entity", but I think it would be good for us as a group to agree to use "disease" or "disorder" for this and to reserve the word "phenotype" for "phenotypic feature", i.e., something that is described by an HP or MP term.

I would be interested in participating in this discussion, and maybe we should skype at some point? -Peter

kshefchek commented 9 years ago

@pnrobinson completely agree that phenotype != disease. A bit of context here: this is a cancer dataset we are using to test generating GA4GH-compliant JSON with the genotype-to-phenotype schema, which itself is still in development (pending pull request). Calling a disease a phenotype here is a modelling "hack" so our server will work out of the box, but this is just a prototype/proof of concept and will not be used in any production environment.