Closed by nlwashington 8 years ago
see related NIF ticket https://support.crbs.ucsd.edu/browse/NIF-11883
Latest commit should have most of this wrapped up, but requires review.
@cmungall what exactly would a "disease pathway" be, ontologically? is the disease pathway equivalent to the disease? or something else?
I would say that the disease ontology should be focused on causal models and that the two are equivalent. This is an ongoing ontological debate; I am in favor of the river flow model http://content.iospress.com/articles/applied-ontology/ao147
but we should be guided by practicalities. Here I would tend towards equivalence between DO classes, KEGG pathway classes, PW classes...
interesting paper, but what do you mean, equivalence of DO and KEGG classes?
KEGG IDs for e.g. Parkinson Disease
just to be clear, there are kegg disease ids, and kegg pathway ids for those diseases. are you saying that those should be equivalent?
ah, I see, I forgot they were distinct entities in KEGG, even when the pathway is named for the disease. Yes, in the interest of practicality, equivalence should only be disease<->disease.
For the relationship between the pathway and the disease, doing this the "correct" way will require looking at the different kinds of implicit relationships (note: my knowledge of KEGG may be out of date or incorrect), which I think are
This may be OTT for now; some kind of broad has-phenotype type relationship may be most practical
when you say "has phenotype relationship", you mean to say like:
KEGG:disease_id RO:has_phenotype KEGG:disease_pathway_id
that doesn't make sense to me. or did you mean
KEGG:disease_pathway_id RO:has_phenotype KEGG:disease_id
?
that makes slightly more sense, but is still odd.
also, i don't think that the pathway is a "normal" pathway; i think that it is in fact showing the progression of a disease, which itself is a deviation from normal. for example, here's the parkinson's "reference" pathway KEGG:map05012 (http://www.kegg.jp/dbget-bin/www_bget?map05012), or with the human genes highlighted KEGG:hsa05012 (http://www.kegg.jp/kegg-bin/show_pathway?hsa05012), and the disease entry KEGG-ds:H00057 (http://www.kegg.jp/dbget-bin/www_bget?ds:H00057)
I don't see these pathways being useful in this way. The pathways are "involved in the pathogenesis of" but this is extremely complicated, often unknown, and something that is crying out for a more detailed representation. What actually is the intended use case? -Peter
here's another example where there's a disease (Pyruvate carboxylase deficiency, KEGG-ds:H00073) that involves two pathways: Pyruvate metabolism (KEGG-path:hsa00620) and Citrate cycle (TCA cycle) (KEGG-path:hsa00020). in this case it's more like the "involved in the pathogenesis of" relationship that @pnrobinson suggests, but I don't think there's a relationship like this in RO. the closest i see is "causally upstream of or within" (RO:0002418), which holds between processes. i can use this for now, since it seems to be the most appropriate.
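a rough sketch of what those statements could look like for the example above, using plain tuples rather than any particular RDF library. the function name is made up for illustration, and the direction (pathway as subject, disease as object) is an assumption; the thread does not settle it.

```python
# Hypothetical helper, not dipper's actual code: link each pathway to a
# disease with the RO relation discussed above.
CAUSALLY_UPSTREAM_OF_OR_WITHIN = "RO:0002418"


def pathway_disease_triples(disease_id, pathway_ids,
                            predicate=CAUSALLY_UPSTREAM_OF_OR_WITHIN):
    """Return (subject, predicate, object) tuples, one per pathway.

    Assumption: the pathway is the subject and the disease is the object.
    """
    return [(pathway_id, predicate, disease_id) for pathway_id in pathway_ids]


# Identifiers taken from the Pyruvate carboxylase deficiency example.
triples = pathway_disease_triples(
    "KEGG-ds:H00073",
    ["KEGG-path:hsa00620", "KEGG-path:hsa00020"],
)

for subject, predicate, obj in triples:
    print(subject, predicate, obj)
```

swapping the predicate later (if RO gains something closer to "involved in the pathogenesis of") would only mean passing a different `predicate` value.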
this is now implemented. if it is wrong, please reopen.
need to import the kegg pathways and gene members. kegg maps its pathways to an uber-gene (kegg ortholog), which we can link to a species-specific gene via their orthology maps.
here is the relevant stuff: lists of pathways can be found here (this will get us identifiers and labels): http://rest.genome.jp/list/pathway
ortholog classes: http://rest.genome.jp/list/orthology
orthology mapping to gene ids: http://rest.genome.jp/link/orthology/mmu (where the last part is the species prefix... we should get at minimum: hsa, mmu, rno, dme, dre, cel. and more in the future)
gene ids (human): http://rest.genome.jp/list/hsa unfortunately, all genomes have a different prefix, and i don't think the complete list can be obtained together
"reference" pathways, which are annotated with ortholog classes (ko). http://rest.genome.jp/link/pathway/ko
or human genes to pathway map (similar things could be obtained for each species): http://rest.genome.jp/link/pathway/hsa
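a minimal sketch of reading one of these link files, assuming the service returns two tab-separated identifiers per line (that format is an assumption here, and worth checking against a real response). the sample rows and the gene ids in them are made-up placeholders, not data fetched from KEGG; `parse_link_table` is a hypothetical helper name.

```python
from collections import defaultdict


def parse_link_table(text):
    """Parse two-column, tab-delimited text into {left_id: set_of_right_ids}.

    Blank lines and lines without a tab are skipped.
    """
    mapping = defaultdict(set)
    for line in text.splitlines():
        line = line.strip()
        if not line or "\t" not in line:
            continue
        left, right = line.split("\t", 1)
        mapping[left.strip()].add(right.strip())
    return dict(mapping)


# Placeholder rows shaped like a gene-to-pathway link file (ids are invented).
SAMPLE = (
    "hsa:GENE_A\tpath:hsa00620\n"
    "hsa:GENE_A\tpath:hsa00020\n"
    "hsa:GENE_B\tpath:hsa00020\n"
)

gene_to_pathways = parse_link_table(SAMPLE)
for gene in sorted(gene_to_pathways):
    print(gene, sorted(gene_to_pathways[gene]))
```

the same parser should work for the orthology and pathway/ko files if they share the two-column layout, with one call per species prefix.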
@cmungall shall we take in the uber-genes (ortholog classes) here? or shall we just figure out the species-specific mapping at ingest time to map a pathway to species-specific genes (as in, create some kind of monarch-association that has as evidence the inference graph of pathway -> kegg ortholog -> human gene, and do this for any of the species)? happy to keep this as the pathways having the grouping "kegg ortholog" nodes for abstraction.
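to make the second option concrete, here is a sketch of the ingest-time join, keeping the pathway -> kegg ortholog -> gene chain as the evidence for each inferred association. everything here is hypothetical: the function name, the dict layout of an association, and the placeholder ids are for illustration only and are not dipper's data model.

```python
def infer_pathway_genes(pathway_to_kos, ko_to_genes):
    """Join pathway->KO and KO->gene maps into pathway->gene associations.

    Each association records the chain it was inferred through, so the
    ortholog class stays available as evidence.
    """
    associations = []
    for pathway in sorted(pathway_to_kos):
        for ko in sorted(pathway_to_kos[pathway]):
            # A KO with no gene in the chosen species yields no association.
            for gene in sorted(ko_to_genes.get(ko, ())):
                associations.append({
                    "pathway": pathway,
                    "gene": gene,
                    "evidence": [pathway, ko, gene],
                })
    return associations


# Placeholder inputs (invented ids), shaped like parsed link tables.
pathway_to_kos = {"path:map00620": {"ko:KO_X", "ko:KO_Y"}}
ko_to_genes = {"ko:KO_X": {"hsa:GENE_A"}}  # KO_Y has no human gene here

associations = infer_pathway_genes(pathway_to_kos, ko_to_genes)
for assoc in associations:
    print(assoc["pathway"], "->", assoc["gene"], "via", assoc["evidence"][1])
```

keeping the ortholog nodes in the graph instead would skip this join entirely; the sketch only shows what the inferred, species-specific alternative might involve.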