Open sabrinatoro opened 9 months ago
This PR, https://github.com/monarch-initiative/omim/pull/107, will add a new release artefact to the OMIM ingest which contains all the MONDO->HGNC gene associations, derived via
MONDO:Disease-exactMatch->OMIM:Disease--['has basis in germline mutation of']-->OMIM:Gene-->HGNC:Gene.
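To make the chain above concrete, here is a minimal sketch of the join in Python. It is not the code from the PR: the function name, the input shapes, and every identifier in the toy data are placeholders invented for illustration.

```python
from collections import defaultdict


def chain_mondo_to_hgnc(mondo_exact_matches, disease_gene_edges, omim_gene_to_hgnc):
    """Follow MONDO -> OMIM disease -> OMIM gene -> HGNC and return sorted pairs.

    mondo_exact_matches: iterable of (mondo_id, omim_disease_id) exactMatch pairs
    disease_gene_edges:  iterable of (omim_disease_id, omim_gene_id) pairs, i.e. the
                         'has basis in germline mutation of' edges
    omim_gene_to_hgnc:   dict mapping omim_gene_id -> hgnc_id
    """
    genes_by_disease = defaultdict(set)
    for omim_disease, omim_gene in disease_gene_edges:
        genes_by_disease[omim_disease].add(omim_gene)

    associations = set()
    for mondo_id, omim_disease in mondo_exact_matches:
        for omim_gene in genes_by_disease.get(omim_disease, ()):
            hgnc_id = omim_gene_to_hgnc.get(omim_gene)
            if hgnc_id is not None:  # drop genes that have no HGNC mapping
                associations.add((mondo_id, hgnc_id))
    return sorted(associations)


# Placeholder identifiers only; none of these are real MONDO/OMIM/HGNC IDs.
EXACT = [("MONDO:EX1", "OMIM:D1"), ("MONDO:EX2", "OMIM:D2")]
EDGES = [("OMIM:D1", "OMIM:G1"), ("OMIM:D1", "OMIM:G2"), ("OMIM:D2", "OMIM:G3")]
GENE_MAP = {"OMIM:G1": "HGNC:EX1", "OMIM:G2": "HGNC:EX2"}

associations = chain_mondo_to_hgnc(EXACT, EDGES, GENE_MAP)
```

In this toy input the second disease falls out of the result because its only gene has no HGNC mapping, which is one of the cases a curator may want to look at.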
@twhetzel should maybe spend some time reviewing my choice of only including "evidence code 3" from morbidmap (I don't know exactly what that means; ask @joeflack4). The evidence string is:
"Evidence: (3) The molecular basis for the disorder is known; a mutation has been found in the gene."
To review cases like this, @twhetzel and I are deploying omim.owl from the Mondo ingest on Monarch OLS. This way we can see a bit better what is going on.
Next steps:
BTW, we deployed the Mondo version of OMIM now here: https://ols.monarchinitiative.org/ontologies/omim/terms?iri=https%3A%2F%2Fomim.org%2Fentry%2F100100
@matentzn @twhetzel I don't know why Nico only included "evidence code 3", and I can't think of anything else I might know other than what comes from the comments section in morbidmap.txt
provided by OMIM:
1 - The disorder is placed on the map based on its association with a gene, but the underlying defect is not known.
2 - The disorder has been placed on the map by linkage or other statistical method; no mutation has been found.
3 - The molecular basis for the disorder is known; a mutation has been found in the gene.
4 - A contiguous gene deletion or duplication syndrome, multiple genes are deleted or duplicated causing the phenotype.
It seemed to me that only case 3 fulfills @sabrinatoro's conditions above (definition of this ticket). Maybe I am wrong.
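For reference, the "only case 3" filter can be sketched as below. This is an illustration, not the ingest's actual code. It assumes that each morbidmap phenotype field ends with the code in parentheses, e.g. `... (3)`; that layout is an assumption to check against the current file. The rows and gene symbols are made up.

```python
import re

# Assumed layout: the phenotype field ends with the code in parentheses, e.g. "... (3)".
_TRAILING_CODE = re.compile(r"\((\d)\)\s*$")


def evidence_code(phenotype_field):
    """Return the trailing evidence code as an int, or None if the field has none."""
    match = _TRAILING_CODE.search(phenotype_field)
    return int(match.group(1)) if match else None


def keep_code(rows, wanted=3):
    """Keep only rows whose first column (phenotype field) carries the wanted code."""
    return [row for row in rows if evidence_code(row[0]) == wanted]


# Made-up rows (phenotype field, gene symbol); not real OMIM content.
ROWS = [
    ("Example disorder A, 999001 (3)", "GENEA"),
    ("Example disorder B, 999002 (2)", "GENEB"),
    ("Example disorder C (1)", "GENEC"),
    ("Example disorder D, 999004 (4)", "GENED"),
    ("Example disorder E with no code", "GENEE"),
]

code_3_rows = keep_code(ROWS)
```

Calling `keep_code(ROWS, wanted=1)` or `wanted=2` would give the sets to hand to an expert if those codes turn out to be relevant.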
@matentzn Ah OK. I should've read the full ticket. Hmm, yes, I think only (3) meets all of @sabrinatoro's requirements.
3 - The molecular basis for the disorder is known; a mutation has been found in the gene. This one seems fine, although I am not sure if it guarantees there is a 1-1 relation between omim and gene.
Evidence codes for 1 and 2 may be relevant, but would need expert input on that. Agree that 4 is not relevant.
@matentzn where is it saying "disease-to-gene"? I saw https://github.com/monarch-initiative/omim/pull/107 and flipped mappings, but not sure if there is a file with that to look at. It does sound a bit odd, but I can see some arguments for doing it that way based on some of the existing RO relations.
@twhetzel You may find it useful to glance at this. When you sign up for OMIM data downloads, this is one of the main files (mim2gene.txt). There is a "MIM Entry Type", and I think the ones we're interested in should be "phenotype" and maybe "predominantly phenotypes" (maybe there are more). "Phenotype" is sometimes used interchangeably with "disease", especially in the OMIM (and OMIA, I assume) context.
mim2gene.tsv.zip (FYI, it's an old copy)
@matentzn do you need more information from anyone for this ticket?
Next step is: Curator review of
https://github.com/monarch-initiative/omim/releases/download/2024-03-24/mondo_genes.csv
I personally do not know exactly how to review this, but @sabrinatoro may be able to help. I would stick this in Google docs, then start looking at a few examples and taking notes.
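One possible starting point for the review is to list the diseases that end up with more than one gene, since the 1-1 question came up earlier in the thread. The sketch below is only a suggestion. The column names `mondo_id` and `hgnc_id` are guesses, so check the real header of mondo_genes.csv first, and note that the sample text is made up.

```python
import csv
import io
from collections import defaultdict


def genes_per_disease(csv_text, disease_col, gene_col):
    """Group the gene IDs in a CSV by disease ID."""
    genes = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        genes[row[disease_col]].add(row[gene_col])
    return genes


def multi_gene_diseases(csv_text, disease_col, gene_col):
    """Return {disease_id: sorted gene IDs} for diseases linked to more than one gene."""
    grouped = genes_per_disease(csv_text, disease_col, gene_col)
    return {d: sorted(g) for d, g in grouped.items() if len(g) > 1}


# Made-up sample; the column names are assumptions, not the file's real header.
SAMPLE = (
    "mondo_id,hgnc_id\n"
    "MONDO:EX1,HGNC:EX1\n"
    "MONDO:EX1,HGNC:EX2\n"
    "MONDO:EX2,HGNC:EX3\n"
)

to_review = multi_gene_diseases(SAMPLE, "mondo_id", "hgnc_id")
```

For the real file, read the downloaded CSV into a string and pass the actual column names; the resulting dictionary gives a short list of examples to paste into the review doc.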