monarch-initiative / monarch-disease-ontology-RETIRED

THIS IS THE OLD REPO: Use this one instead: https://github.com/monarch-initiative/mondo-build

genes are getting into the disease file #14

Closed cmungall closed 9 years ago

cmungall commented 9 years ago

From @mellybelly on March 14, 2015 21:53

For example: OMIM_107730 Apolipoprotein B; APOB subclass of DO:disease and Orphanet_121386 (no label in file, but is a gene from Orphanet apolipoprotein B)

These are coming from the MGI file, I think; genes need to be pruned out upstream.

Copied from original issue: monarch-initiative/human-disease-ontology#14

cmungall commented 9 years ago

From @sbello on March 16, 2015 15:22

OMIM:107730 is a gene + phenotype record, so it includes diseases even though it is named for a gene. This is why it is in the MGI disease cluster file. OMIM is working on breaking these apart into separate records. Sue

cmungall commented 9 years ago

Reopening. May have been premature to filter these from mondo

cmungall commented 9 years ago

OK, we have entries such as OMIM:107730, which are apparently combined G+P entries that may be split in the future (we also have HPO annotations for OMIM:107730 in Monarch).

Then we also have cases like this one: http://monarchinitiative.org/disease/OMIM:516000 'Complex I, Subunit Nd1', which seems much more in the gene camp and which MGI (sensibly) excludes from omimclusters. But, as noted above, we do have phenotype data for it.

How should we treat these in mondo? I think we need to at least provide a label. But even a classification under 'disease' is potentially confusing.

cmungall commented 9 years ago

From @mellybelly on March 18, 2015 14:1

@pnrobinson @drseb should these be migrated to a disease class? for OMIM:516000 possibly LEBER OPTIC ATROPHY?

for OMIM:107730 perhaps HYPOBETALIPOPROTEINEMIA? Even if OMIM has genes and diseases mixed up, that doesn't mean we have to annotate to them, especially as many of these terms could come from DO, Orphanet, or DC. We should not annotate phenotypes to genes methinks.

cmungall commented 9 years ago

Or if we really want to say 'mutations in this gene cause this phenotype' then we should use NCBIGene


cmungall commented 9 years ago

From @drseb on March 27, 2015 14:9

Are we sure that for every gene G for which there is some associated phenotype (HPX), there is a corresponding OMIM phenotype entry (OMIMX)? Is OMIMX annotated with HPX? Is OMIMX linked to G?

If the answer to any of these questions is no, we are losing information.

Just my two cents…

Seb
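
The three questions above amount to a set-coverage check. A minimal sketch, assuming the gene-to-HPO, gene-to-OMIM-phenotype, and OMIM-to-HPO links have already been loaded into dictionaries (the function name, dictionary names, and toy identifiers below are all hypothetical, not part of any Monarch or HPO pipeline):

```python
from typing import Dict, Set


def find_lost_annotations(
    gene_to_hpo: Dict[str, Set[str]],
    gene_to_omim: Dict[str, Set[str]],
    omim_to_hpo: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Return, per gene, the HPO terms that no linked OMIM phenotype entry carries."""
    lost: Dict[str, Set[str]] = {}
    for gene, hpo_terms in gene_to_hpo.items():
        recoverable: Set[str] = set()
        # Collect every HPO term reachable through a linked OMIM phenotype entry.
        for omim_id in gene_to_omim.get(gene, set()):
            recoverable |= omim_to_hpo.get(omim_id, set())
        missing = hpo_terms - recoverable
        if missing:
            lost[gene] = missing
    return lost


# Toy data with placeholder identifiers (not real genes, OMIM entries, or HPO terms).
gene_to_hpo = {"GENE:A": {"HP:1", "HP:2"}, "GENE:B": {"HP:3"}}
gene_to_omim = {"GENE:A": {"OMIM:X1"}}  # GENE:B has no phenotype entry at all
omim_to_hpo = {"OMIM:X1": {"HP:1"}}     # OMIM:X1 lacks HP:2

lost = find_lost_annotations(gene_to_hpo, gene_to_omim, omim_to_hpo)
print(lost)
```

Any gene that shows up in the result would lose gene-to-phenotype links if the annotations on the gene entry were simply dropped.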


cmungall commented 9 years ago

From @sbello on March 27, 2015 14:46

The entry 535000 is a gene (* prefix) record in OMIM, all of the related phenotypes have separate OMIM records. At MGI we exclude all * records from our disease load. Sue
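
As a rough illustration of that exclusion rule (not MGI's actual load code), the sketch below filters records by OMIM's title prefix symbol: `*` for gene records and `+` for combined gene and phenotype records. The `OmimRecord` type, the `split_disease_load` function, and the `#` prefix assigned to 144010 are assumptions made for the example.

```python
from typing import Iterable, List, NamedTuple, Tuple


class OmimRecord(NamedTuple):
    prefix: str      # OMIM title prefix symbol, e.g. "*", "+", "#", "%" or ""
    mim_number: str
    title: str


def split_disease_load(
    records: Iterable[OmimRecord],
) -> Tuple[List[OmimRecord], List[OmimRecord]]:
    """Drop pure gene ("*") records; also report the combined ("+") records kept."""
    load = [r for r in records if r.prefix != "*"]
    combined = [r for r in load if r.prefix == "+"]
    return load, combined


records = [
    OmimRecord("*", "516000", "Complex I, Subunit Nd1"),
    OmimRecord("+", "107730", "Apolipoprotein B; APOB"),
    OmimRecord("#", "144010", "(phenotype title omitted)"),  # prefix assumed
]

load, combined = split_disease_load(records)
print([r.mim_number for r in load])      # records that stay in the disease load
print([r.mim_number for r in combined])  # gene + phenotype records needing review
```

Combined records stay in the load because they describe diseases as well, which is why an entry like 107730 still reaches the disease cluster file.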

cmungall commented 9 years ago

From @drseb on March 27, 2015 15:26

The entry 535000 is a gene (* prefix) record in OMIM, all of the related phenotypes have separate OMIM records. At MGI we exclude all * records from our disease load.

I assume you mean 516000.

This OMIM entry has HPO annotations and is linked to ND1 (or MT-ND1, Entrez 4535). I usually use OMIM genemap, mim2gene, and Orphanet to link between diseases and genes, or to map between an OMIM gene entry and an Entrez gene.

I do not see any other OMIM entry linked to ND1 in genemap. I don’t see any Orphanet-entry WITH phenotype-data linked to that gene.

So the links between that gene and the currently annotated HP-terms would be lost IMHO.

Seb

cmungall commented 9 years ago

From @sbello on March 27, 2015 15:46

Yes, sorry about that: 516000.

cmungall commented 9 years ago

From @nlwashington on March 27, 2015 23:30

Note that the following OMIM entries (just 25) have annotations from the HPO group. They are the "combined" gene and phenotype entries, but what that really means is a genomic location that happens to have some phenotypes associated with it: 100650, 107680, 107730, 107741, 109270, 114835, 116790, 124060, 132810, 138300, 141800, 141900, 147892, 151430, 152200, 152780, 159555, 168820, 173470, 177400, 182870, 211100, 222745, 309850, 314200. Perhaps these should just be migrated to the relevant disease.

some are easy to map (1:1); others are not.

100650 --> 610251
107680 --> 105200 or 604091 (but there are also other diseases/phenotypes here that don't have OMIM IDs: "ApoA-I and apoC-III deficiency, combined" and "Corneal clouding, autosomal recessive")
107730 --> 144010, 615558
107741 --> 104310, 611771, 269600, 603075 (plus Hyperlipoproteinemia, type III and {Myocardial infarction susceptibility})
109270 --> A LOT OF THINGS.
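
To separate the easy 1:1 cases from the ones that need a curator, the candidate mappings listed above can be held in a small table and partitioned. This is only a sketch: `CANDIDATE_TARGETS` and `partition_mappings` are hypothetical names, the targets without OMIM IDs are left out, and 109270 is marked as not enumerated because the list above does not spell out its targets.

```python
from typing import Dict, List, Optional, Tuple

# Combined gene/phenotype entry -> candidate disease entries, as listed above.
CANDIDATE_TARGETS: Dict[str, Optional[List[str]]] = {
    "100650": ["610251"],
    "107680": ["105200", "604091"],
    "107730": ["144010", "615558"],
    "107741": ["104310", "611771", "269600", "603075"],
    "109270": None,  # "A LOT OF THINGS": targets not enumerated
}


def partition_mappings(
    candidates: Dict[str, Optional[List[str]]],
) -> Tuple[Dict[str, str], List[str]]:
    """Split entries into unambiguous 1:1 migrations and ones needing curation."""
    one_to_one: Dict[str, str] = {}
    needs_curation: List[str] = []
    for source, targets in candidates.items():
        if targets is not None and len(targets) == 1:
            one_to_one[source] = targets[0]
        else:
            needs_curation.append(source)
    return one_to_one, sorted(needs_curation)


one_to_one, needs_curation = partition_mappings(CANDIDATE_TARGETS)
print(one_to_one)
print(needs_curation)
```

Only the 1:1 entries can have their HPO annotations moved automatically; for the rest, each annotation has to be assigned to one of the candidate diseases by hand.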

cmungall commented 9 years ago

From @nlwashington on March 27, 2015 23:35

i have also opened an issue in the hpo tracker here: https://sourceforge.net/p/obo/human-phenotype-requests/438/

cmungall commented 9 years ago

From @nlwashington on June 29, 2015 19:17

OMIM:124060 is also a gene causing all kinds of trouble.

cmungall commented 9 years ago

From @nlwashington on June 29, 2015 22:0

and then there are things like OMIM:601894 which are very clearly just diseases, but end up getting typed as genes somewhere in the pipeline. i've checked our code and output ttl, and this typing is not coming from the data, but must be from the ontologies.

cmungall commented 9 years ago

I will:

  1. use Monarch's omim.ttl to generate a 'blacklist' of genes (can be done by the SO class)
  2. subtract these from any downstream application of omimclusters.obo

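
A minimal sketch of these two steps, under several assumptions: the (entry, class) pairs have already been extracted from omim.ttl by some RDF parser, SO:0000704 is the SO class used to type genes, and omimclusters.obo uses blank-line-separated stanzas. The function names and the toy input are hypothetical.

```python
import re
from typing import Iterable, Set, Tuple

GENE_CLASS = "SO:0000704"  # SO "gene"; assumed to be the class used for typing


def build_gene_blacklist(typed_entries: Iterable[Tuple[str, str]]) -> Set[str]:
    """Collect the IDs of all entries typed as genes."""
    return {entry for entry, cls in typed_entries if cls == GENE_CLASS}


def subtract_from_obo(obo_text: str, blacklist: Set[str]) -> str:
    """Drop every stanza whose id is blacklisted; keep the header and the rest."""
    kept = []
    for stanza in re.split(r"\n\s*\n", obo_text.strip()):
        ids = [ln[4:].strip() for ln in stanza.splitlines() if ln.startswith("id: ")]
        if ids and ids[0] in blacklist:
            continue
        kept.append(stanza)
    return "\n\n".join(kept) + "\n"


# Toy input: pairs as they might be pulled from omim.ttl (illustrative only).
typed_entries = [("OMIM:516000", "SO:0000704"), ("OMIM:601894", "DOID:4")]
obo_text = """format-version: 1.2

[Term]
id: OMIM:516000
name: Complex I, Subunit Nd1

[Term]
id: OMIM:601894
"""

blacklist = build_gene_blacklist(typed_entries)
filtered = subtract_from_obo(obo_text, blacklist)
print(filtered)
```

This removes only the blacklisted stanzas themselves; any `is_a` or xref lines in other stanzas that point at a removed ID would still need separate handling.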
pnrobinson commented 9 years ago

516000 is a gene that is associated with a bunch of diseases. The webpage for this entry does not list the diseases at the top (inconsistent with most OMIM entries); instead we only see the ALLELIC VARIANTS at the bottom. I understand that OMIM has historically treated mitochondrial genes differently. We might in fact want to look at the mitochondrion in a separate project; I think there are a lot of problems with the databases. -Peter

Dr. med. Peter N. Robinson, MSc. Professor of Medical Genomics Professor in the Bioinformatics Division of the Department of Mathematics and Computer Science of the Freie Universität Berlin Institut für Medizinische Genetik und Humangenetik Charité - Universitätsmedizin Berlin Augustenburger Platz 1 13353 Berlin Germany +4930 450566006 Mobile: 0160 93769872 peter.robinson@charite.de http://compbio.charite.de http://www.human-phenotype-ontology.org Introduction to Bio-Ontologies: http://www.crcpress.com/product/isbn/9781439836651 I have learned from my mistakes, and I am sure I can repeat them exactly ORCID ID:http://orcid.org/0000-0002-0736-9199 Scopus Author ID 7403719646 Appointment request: http://doodle.com/pnrobinson



pnrobinson commented 9 years ago

I have removed OMIM:516000; it is a gene entry, and we have the corresponding phenotype entries. I think this was historical, i.e., some years back there was just this entry, and now OMIM has created the corresponding phenotype entries. -Peter
