monarch-initiative / omim

Data ingest pipeline for OMIM.

genes missing in the omim.ttl #58

Closed sabrinatoro closed 1 year ago

sabrinatoro commented 2 years ago

I cannot find any gene-disease annotations in the omim.ttl file. Are they missing, or are they coming from another file? (If so, could you please point me to it?) Thanks!

sabrinatoro commented 2 years ago

@joeflack4 (I cannot assign you on the issue, so I am tagging you here)

joeflack4 commented 2 years ago

Hey @sabrinatoro. I added you as a maintainer of the repository, so you should be able to assign issues to me now.

Hmm, that seems like something I should definitely fix. Do you know if this was an issue for previous releases of omim.ttl as well, or just the latest release?

I just recently added genetic mappings for HGNC (symbols and IDs), which were not there in the previous release. But at the moment (I don't have time to look today, but I can look tomorrow or Friday), I don't recall which other specific gene or disease mappings have been included.

I know that I'm also going to be adding some more gene mappings soon, including Entrez IDs, Ensembl IDs, and maybe some more.

matentzn commented 2 years ago

Take a cursory look in the next few weeks, @joeflack4, but do not go too deep without reporting to us in a meeting. Something in the back of my mind tells me we deliberately decided not to include d2gs, but I am not sure now.

joeflack4 commented 1 year ago

@sabrinatoro Here is a link to the new release where you can download the latest omim.ttl: https://github.com/monarch-initiative/omim/releases/tag/2022-10-27

Can you also check out monarch-initiative/mondo#75? I added some example snippets where I show how I'm adding gene information and this other mapping evidence. I'm hoping I've done this correctly.

sabrinatoro commented 1 year ago

@joeflack4, please see my comment on https://github.com/monarch-initiative/omim/issues/75. I can see the genes in the omim.ttl, but there are some errors in the cases where more than one gene is associated with a phenotype/disease (these gene-disease associations should not be imported).
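To make the rule above concrete, here is a minimal sketch in Python of how the filtering could work. It is not the pipeline's actual code: the function name `single_gene_associations`, the input shape (a list of disease/gene ID pairs), and the identifiers are all assumptions made up for illustration.

```python
from collections import defaultdict


def single_gene_associations(pairs):
    """Return {disease: gene} only for diseases linked to exactly one gene.

    `pairs` is an iterable of (disease_id, gene_id) tuples. This helper is
    hypothetical and only illustrates the rule described in this comment.
    """
    genes_by_disease = defaultdict(set)
    for disease, gene in pairs:
        genes_by_disease[disease].add(gene)
    # Diseases with more than one associated gene are dropped entirely.
    return {
        disease: next(iter(genes))
        for disease, genes in genes_by_disease.items()
        if len(genes) == 1
    }


# Placeholder identifiers, not real OMIM or HGNC entries.
pairs = [
    ("OMIM:000001", "HGNC:1"),
    ("OMIM:000002", "HGNC:2"),
    ("OMIM:000002", "HGNC:3"),
]
result = single_gene_associations(pairs)
```

In this example only the first disease would be kept, since the second one has two genes attached to it.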

sabrinatoro commented 1 year ago

I spot-checked some gene-disease associations from the list in this issue: https://github.com/monarch-initiative/monarch-ingest/issues/429, and was able to find them.

joeflack4 commented 1 year ago

I see the point. The problem is that the way this was originally written was naive and did not expect multiple disease/gene associations.

I'm looking at this now and will bring up on Slack as this will be easier with some back and forth.
