Closed matentzn closed 2 years ago
Alright, so I looked into this, and I can obtain these mappings from one or both of two places:
i. OMIM's mim2gene.txt
(we've been using this in the ingest so far; it has a column for HGNC mappings, but we have not yet used these in the generated omim.ttl)
ii. OMIM's genemap2.txt
(haven't used in the ingest yet, but has a column for HGNC mappings)
I plan to use both of these and, if I notice any inconsistencies in these mappings between the two files, I will make an error report.
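A minimal sketch of the consistency check described above, assuming each file can be read as tab-separated rows with `#` comment lines. The function names and the column positions passed in are illustrative assumptions, not part of the ingest; they should be checked against the headers of the actual files.

```python
import csv
import io


def read_symbol_map(text, mim_col, symbol_col):
    """Return {MIM number: HGNC symbol} from tab-separated text.

    Lines starting with '#' are treated as comments, and rows with an
    empty symbol cell are skipped. Column positions come from the caller.
    """
    mapping = {}
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if not row or row[0].startswith("#"):
            continue
        if len(row) <= max(mim_col, symbol_col):
            continue
        mim, symbol = row[mim_col].strip(), row[symbol_col].strip()
        if mim and symbol:
            mapping[mim] = symbol
    return mapping


def find_inconsistencies(map_a, map_b):
    """Return {MIM: (symbol_a, symbol_b)} for MIMs present in both maps that disagree."""
    return {
        mim: (map_a[mim], map_b[mim])
        for mim in sorted(map_a.keys() & map_b.keys())
        if map_a[mim] != map_b[mim]
    }


# Toy rows with made-up values, only to show the call pattern.
file_a = "# MIM\tSymbol\n000001\tGENE1\n000002\tGENE2\n"
file_b = "# MIM\tSymbol\n000001\tGENE1\n000002\tGENE2X\n"
report = find_inconsistencies(
    read_symbol_map(file_a, 0, 1),
    read_symbol_map(file_b, 0, 1),
)
print(report)
```

Only MIM numbers present in both sources are compared; entries that appear in just one file would need a separate report.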
Questions/Issues
https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/HGNC:
(example: https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/HGNC:404), and (ii) https://www.alliancegenome.org/gene/HGNC:
(example: https://www.alliancegenome.org/gene/HGNC:404)
ALDH2 has ID HGNC:404. If I need to go with (b), I'll need to figure out how I'm going to effectively map between these symbols and IDs, using web scraping as a last resort. If you have any recommendations, let me know.
Sounds great.
https://identifiers.org/hgnc:16793
so feel free to use that for now. I may change my mind. :P
kg-hub-n-data group in the Monarch slack space?
Thanks!
(1) Ok, I'll use https://identifiers.org/hgnc: as my CURIE prefix for HGNC. I can also do a quick PR to add it to https://github.com/monarch-initiative/mondo/blob/master/src/ontology/metadata/mondo.sssom.config.yml
(2) Sure thing
(3) Fine w/ me. But if it is recommended that I use the IDs instead of the symbols, I'll need to find some way to map them.
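For points (1) and (3), a rough sketch of how symbols could be resolved to IDs and then expanded with the agreed prefix. It assumes a symbol-to-ID lookup table can be loaded from some HGNC download; the `symbol_to_id` dictionary and both function names are hypothetical, and the single entry shown is the ALDH2 example from earlier in this thread.

```python
HGNC_URI_PREFIX = "https://identifiers.org/hgnc:"


def expand_hgnc_curie(curie):
    """Expand an HGNC CURIE such as 'HGNC:404' into a full URI."""
    prefix, sep, local_id = curie.partition(":")
    if sep != ":" or prefix.upper() != "HGNC" or not local_id:
        raise ValueError(f"Not an HGNC CURIE: {curie!r}")
    return HGNC_URI_PREFIX + local_id


def symbol_to_uri(symbol, symbol_to_id):
    """Resolve a gene symbol to an HGNC URI, or None if the symbol is unknown."""
    curie = symbol_to_id.get(symbol)
    return expand_hgnc_curie(curie) if curie else None


# Hypothetical lookup table; in practice this would be loaded from a file.
symbol_to_id = {"ALDH2": "HGNC:404"}
print(symbol_to_uri("ALDH2", symbol_to_id))  # https://identifiers.org/hgnc:404
```

Returning `None` for unknown symbols keeps unmapped genes visible so they can be listed in an error report instead of being silently dropped.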
Just in case I forgot to mention, it would be great to see the omim-hgnc.monarch.sssom.tsv file attached to the release: https://github.com/monarch-initiative/omim/releases/tag/latest
You did mention that, and it's on my task list. But I did just think of a question, now that you mention this (I'll ask in today's meeting):
UMLS-OMIM && HGNC-OMIM Mapping files: which mappings?
Results in latest release. I think I still need to split the file, though: https://github.com/monarch-initiative/omim/releases/tag/latest
We need a complete mapping of all HGNC-OMIM genes, covering every gene and not just those in omim.ttl, in case those are restricted to the disease-relevant ones. We should output an SSSOM file and attach it to the release, the same way omim.ttl is attached.
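A minimal sketch of what writing such a file could look like, assuming the gene mappings have already been resolved to (OMIM ID, HGNC ID) pairs. The function name, the default predicate, the default mapping justification, and the OMIM prefix expansion in the example are placeholder assumptions, not decisions made in this thread. The four column names are the required SSSOM slots.

```python
import csv
import io


def write_sssom(pairs, curie_map, predicate_id="skos:exactMatch",
                justification="semapv:UnspecifiedMatching"):
    """Return SSSOM/TSV text for an iterable of (subject_id, object_id) pairs."""
    out = io.StringIO()
    # Metadata block: YAML lines, each commented out with '#'.
    out.write("#curie_map:\n")
    for prefix, uri in sorted(curie_map.items()):
        out.write(f'#  {prefix}: "{uri}"\n')
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["subject_id", "predicate_id", "object_id", "mapping_justification"])
    # Deduplicate and sort so repeated runs produce identical files.
    for subject_id, object_id in sorted(set(pairs)):
        writer.writerow([subject_id, predicate_id, object_id, justification])
    return out.getvalue()


# Illustrative call: the HGNC expansion is the one agreed in this thread,
# the OMIM expansion and the single pair are assumptions for the example.
example = write_sssom(
    [("OMIM:100650", "HGNC:404")],
    {"HGNC": "https://identifiers.org/hgnc:", "OMIM": "https://omim.org/entry/"},
)
print(example)
```

Building the text in memory first makes it easy to diff against the previous release before attaching the file.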