monarch-initiative / omim

Data ingest pipeline for OMIM.

`morbidmap.txt`: Cases where same MIM# is mapped to (i) a phenotype MIM# and (ii) a plain label #86

Open joeflack4 opened 1 year ago

joeflack4 commented 1 year ago

Overview

Hey @sabrinatoro, could you take a look at this? While coding, I remembered something you said and wanted to check for an edge case. I believe you mentioned at yesterday's meeting that, in rows where the Phenotype column contains no MIM#, the MIM# in the MIM Number column is not a gene MIM (as it usually is) but a phenotype/disorder MIM. I hope I got that right.

Specific case

Here's an example of such a row (OMIM:100650):

| Phenotype | Gene Symbols | MIM Number | Cyto Location |
| --- | --- | --- | --- |
| {Esophageal cancer, alcohol-related, susceptibility to} (3) | ALDH2 | 100650 | 12q24.12 |

However, for OMIM:100650 there are also 2 rows whose Phenotype field does contain a MIM# (610251). Here are all the rows where OMIM:100650 appears in the MIM Number column:

| Phenotype | Gene Symbols | MIM Number | Cyto Location |
| --- | --- | --- | --- |
| Alcohol sensitivity, acute, 610251 (3) | ALDH2 | 100650 | 12q24.12 |
| {Hangover, susceptibility to}, 610251 (3) | ALDH2 | 100650 | 12q24.12 |
| {Esophageal cancer, alcohol-related, susceptibility to} (3) | ALDH2 | 100650 | 12q24.12 |
| {Sublingual nitroglycerin, susceptibility to poor response to} (3) | ALDH2 | 100650 | 12q24.12 |
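
To tell the two kinds of rows apart programmatically, something like the following could work. It's a minimal sketch, assuming the Phenotype field always has the shape `<label>[, <6-digit MIM#>] (<mapping key>)` as in the rows above. `parse_phenotype` and `ParsedPhenotype` are names made up for illustration, not existing functions in this repo.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout: "<label>[, <6-digit phenotype MIM#>] (<single-digit mapping key>)"
PHENOTYPE_RE = re.compile(
    r"^(?P<label>.*?)(?:,\s*(?P<mim>\d{6}))?\s*\((?P<key>\d)\)\s*$"
)


class ParsedPhenotype(NamedTuple):
    label: str
    phenotype_mim: Optional[str]  # None when the field is a plain label
    mapping_key: Optional[str]


def parse_phenotype(field: str) -> ParsedPhenotype:
    """Split a morbidmap Phenotype field into label, optional MIM#, and mapping key."""
    match = PHENOTYPE_RE.match(field.strip())
    if match is None:
        # Field does not fit the assumed layout; return it untouched as the label.
        return ParsedPhenotype(field.strip(), None, None)
    return ParsedPhenotype(
        match.group("label").strip(), match.group("mim"), match.group("key")
    )


with_mim = parse_phenotype("{Hangover, susceptibility to}, 610251 (3)")
plain = parse_phenotype(
    "{Esophageal cancer, alcohol-related, susceptibility to} (3)"
)
print(with_mim)
print(plain)
```

Rows where `phenotype_mim` is `None` are the "plain label" rows this issue is about.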

Question

I think this conflicts with the hypothesis I described in "Overview". So which of these is true? (a) When there is no MIM# in the Phenotype field, the MIM# in the MIM Number field is always a phenotype/disorder, not a gene; (b) the same as (a), but 'sometimes' rather than 'always'; (c) the MIM# in the MIM Number field is always a gene.

Data

I found all such cases and put them in a TSV: morbidmap cases of mim mapped to both another mim and a plain label.tsv.zip
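
For anyone who wants to reproduce a list like this, here is a rough sketch of one way to find such cases. It is not necessarily how the attached TSV was generated. It assumes `morbidmap.txt` is tab-separated with the columns Phenotype, Gene Symbols, MIM Number, Cyto Location, and that comment/header lines start with `#`. `find_mixed_mims` is a hypothetical helper name, and the sample rows are just the four ALDH2 rows quoted above.

```python
import csv
import re
from collections import defaultdict

# A Phenotype field "has a MIM#" if it ends with ", <6 digits> (<mapping key>)".
HAS_PHENOTYPE_MIM = re.compile(r",\s*\d{6}\s*\(\d\)\s*$")


def find_mixed_mims(lines):
    """Return {MIM Number: [(phenotype, has_mim), ...]} for every MIM Number that
    appears both with a phenotype MIM# and with a plain label."""
    rows_by_mim = defaultdict(list)
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    reader = csv.reader(data_lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 3:
            continue  # skip malformed rows
        phenotype, mim_number = row[0], row[2]
        has_mim = bool(HAS_PHENOTYPE_MIM.search(phenotype))
        rows_by_mim[mim_number].append((phenotype, has_mim))
    # Keep only MIM Numbers seen with both kinds of Phenotype field.
    return {
        mim: rows
        for mim, rows in rows_by_mim.items()
        if {has_mim for _, has_mim in rows} == {True, False}
    }


SAMPLE = [
    "# Phenotype\tGene Symbols\tMIM Number\tCyto Location",
    "Alcohol sensitivity, acute, 610251 (3)\tALDH2\t100650\t12q24.12",
    "{Hangover, susceptibility to}, 610251 (3)\tALDH2\t100650\t12q24.12",
    "{Esophageal cancer, alcohol-related, susceptibility to} (3)\tALDH2\t100650\t12q24.12",
    "{Sublingual nitroglycerin, susceptibility to poor response to} (3)\tALDH2\t100650\t12q24.12",
]

mixed = find_mixed_mims(SAMPLE)
for mim, rows in mixed.items():
    for phenotype, has_mim in rows:
        print(mim, "MIM#" if has_mim else "plain label", phenotype, sep="\t")
```

To run it on the real file, pass an open file handle instead of `SAMPLE`, e.g. `find_mixed_mims(open("morbidmap.txt"))`.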