Closed joeflack4 closed 1 year ago
@sabrinatoro Here's the full report of all such cases: noMimNumsInPhenoLabels.tsv.zip
I think for this issue, I have a solution to work on (b in 'possible solutions'). For your role, I think it is just analyzing this list and reporting back at the meeting?
And here are some example rows:
| Phenotype | Gene Symbols | MIM Number | Cyto Location |
| --- | --- | --- | --- |
| 3p- syndrome (4) | DEL3pterp25, C3DELpterp25 | 613792 | 3pter-p25 |
| 46XX sex reversal 2 (4) | SRXX2, DUP17q24.3 | 278850 | 17q24.3-q25.1 |
| ?Amelogenesis imperfecta, type IE, X-linked 2 (2) | AI1E2, AIH3 | 301201 | Xq22-q28 |
| ?Antiphospholipid syndrome, familial (2) | ATPLS | 107320 | 6p21.3 |
| ?Craniofacioskeletal syndrome (2) | CFSS | 300712 | Xq26-q27 |
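A minimal sketch of how a report like this could be generated; it is not the script actually used. It assumes a tab-separated input with the four column names shown above, and it assumes that a phenotype MIM#, when present, appears as a standalone six-digit number inside the `Phenotype` label. The function name and the last sample row are hypothetical, and any comment lines in the real file would need to be skipped first.

```python
import csv
import io
import re

# Assumption: a phenotype MIM#, when present, is a standalone six-digit number.
MIM_RE = re.compile(r"\b\d{6}\b")


def rows_without_phenotype_mim(tsv_text):
    """Return the rows whose Phenotype label contains no six-digit number."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if not MIM_RE.search(row["Phenotype"])]


# Two rows taken from the examples above, plus one made-up row that does
# carry a MIM# in its label (hypothetical, for contrast only).
SAMPLE = (
    "Phenotype\tGene Symbols\tMIM Number\tCyto Location\n"
    "3p- syndrome (4)\tDEL3pterp25, C3DELpterp25\t613792\t3pter-p25\n"
    "?Craniofacioskeletal syndrome (2)\tCFSS\t300712\tXq26-q27\n"
    "Hypothetical disorder, 123456 (3)\tHYPO1\t654321\t1p36\n"
)

missing = rows_without_phenotype_mim(SAMPLE)
for row in missing:
    print(row["Phenotype"], row["MIM Number"], sep="\t")
```

Only the `Phenotype` column is searched, so the gene MIM# in the `MIM Number` column never masks a missing phenotype MIM#.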
I did an investigation for #76 to look at how Exomiser handles this. It looks like if the MIM number is missing, they don't add a relationship: https://github.com/monarch-initiative/omim/issues/76#issuecomment-1319284653
> Here's the full report of all such cases: noMimNumsInPhenoLabels.tsv.zip
I am a bit confused: all the phenotypes in the list above have an OMIM ID in a separate column. Could you show other lines of the morbidmap? Maybe we are missing the OMIM number for the gene (and not the phenotype)?
Just updating this issue, given that Sabrina already figured this out. I explained in the related thread: https://github.com/monarch-initiative/omim/issues/76#issuecomment-1320616745
Overview
We are mapping gene::disease associations from `morbidmap.txt`. All of the genes have MIM#s (see the `MIM Number` field). The MIM#s for the 'diseases' are found within the string label of the `Phenotype` field. However, in some cases there is no MIM# there.
Possible solutions
(a) ... `morbidmap.txt` (so that the associations themselves have IDs), and an alternative set of IDs for all of the entries in `Phenotype`. This would be much more complex, though, and would deviate further from OMIM's data model.
(b) ... `Phenotype`.
At the meeting on 2022/11/11, we opted for (b). (a) is just something I thought up and not a fully formed idea.
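To make the parsing problem from the Overview concrete, here is a hedged sketch of splitting a `Phenotype` label into a name, an optional MIM#, and the trailing parenthesized number. It assumes that labels carrying a MIM# look like `Name, 123456 (3)`. That layout is not shown in this thread, so treat it as an assumption, along with the function name `parse_phenotype_label` and the made-up "Hypothetical disorder" label.

```python
import re

# Assumed label layout: "<name>[, <six-digit MIM#>] (<single digit>)"
LABEL_RE = re.compile(
    r"^(?P<name>.*?)"            # phenotype name, matched lazily
    r"(?:,\s*(?P<mim>\d{6}))?"   # optional ", 123456"
    r"\s*\((?P<key>\d)\)\s*$"    # trailing "(N)"
)


def parse_phenotype_label(label):
    """Split a label into name, MIM# (or None), and trailing number (or None)."""
    text = label.strip()
    match = LABEL_RE.match(text)
    if match is None:
        # The label does not follow the assumed layout; return it unchanged.
        return {"name": text, "mim": None, "key": None}
    return {
        "name": match.group("name").strip(),
        "mim": match.group("mim"),
        "key": int(match.group("key")),
    }


no_mim = parse_phenotype_label(
    "?Thrombophilia 9 due to decreased release of tissue plasminogen (1)"
)
with_mim = parse_phenotype_label("Hypothetical disorder, 123456 (3)")
print(no_mim)
print(with_mim)
```

A `mim` of `None` marks exactly the kind of row this issue is about, and `name` is the label with the trailing `(N)` removed.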
Questions
... `?Thrombophilia 9 due to decreased release of tissue plasminogen (1)` --> `?Thrombophilia 9 due to decreased release of tissue plasminogen`. I am assuming yes for now.
Examples
All but the last two rows here are examples where there is no MIM# for `Phenotype`. I included the last two rows just in case it helps to compare to the thyroid carcinoma rows that have no MIM#.
Related
#77
#76