Closed joeflack4 closed 1 week ago
@matentzn In the file https://github.com/monarch-initiative/mondo-ingest/blob/hgnc-template/src/ontology/external/mondo_genes.robot.tsv for MONDO_0000208, I see the source is the OMIM record. This seems a better option than what is currently in Mondo as the source, e.g. MONDO:mim2gene_medgen.
However, should we check mondo-edit.obo to make sure there are not any classes with the 'has material basis in germline mutation in' relation that have the old source annotation (MONDO:mim2gene_medgen)?
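One way to run that check is a small script over the OBO file. This is only a sketch: it assumes the relation is written with the shorthand has_material_basis_in_germline_mutation_in on relationship: lines and that the source sits in a trailing qualifier block as source="...". Neither has been verified against the current mondo-edit.obo, and the function name find_old_sources is made up for illustration.

```python
from typing import Dict, Iterable, List

OLD_SOURCE = "MONDO:mim2gene_medgen"
# Assumed OBO shorthand for 'has material basis in germline mutation in'.
RELATION = "has_material_basis_in_germline_mutation_in"


def find_old_sources(
    lines: Iterable[str],
    relation: str = RELATION,
    source: str = OLD_SOURCE,
) -> Dict[str, List[str]]:
    """Map each term ID to its relationship lines that still carry `source`."""
    hits: Dict[str, List[str]] = {}
    in_term = False
    current_id = None
    prefix = f"relationship: {relation} "
    needle = f'source="{source}"'
    for raw in lines:
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are of interest.
            in_term = line == "[Term]"
            current_id = None
        elif in_term and line.startswith("id: "):
            current_id = line[len("id: "):].strip()
        elif in_term and current_id and line.startswith(prefix) and needle in line:
            hits.setdefault(current_id, []).append(line)
    return hits
```

Passing an open file handle for mondo-edit.obo to find_old_sources should return the affected classes; an empty result would mean no class still uses the old source on that relation, provided the layout assumptions above hold.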
I do not know what this MONDO:mim2gene_medgen source refers to. It looks like a way to say that we are getting this information from the omim-gene file that somehow involves medgen? @nicolevasilevsky do you remember anything about this?
I think it is OK to remove the MONDO:mim2gene_medgen sources and replace them with the OMIM identifier. However, it would not hurt to keep something like MONDO:mim2gene as a source to indicate that this annotation was made via a specific pipeline (similar to the "MONDO:MEDGE" source on the UMLS xref).
We would therefore have the source be:
MONDO:mim2gene_medgen is documented on the Entities page as "This indicates the gene relationship came from MedGen." @joeflack4 can you remind me whether these mappings were originally from the MedGen mappings file? @sabrinatoro do you want a different annotation used for the source still given the definition and pending Joe's answer for the question above?
@sabrinatoro do you want a different annotation used for the source still given the definition and pending Joe's answer for the question above?
I feel like I don't have enough information to give a clear answer, but I will try. Where is the information coming from?
My GUESS is that we were using the gene annotation from MedGen at one point, and MedGen got this gene-to-disease annotation from OMIM (i.e., the MONDO:mim2gene_medgen source). It makes sense that we would switch to getting this information directly from OMIM now.
, then there is no point in keeping it. I am assuming (again assuming, please someone confirm)
Looking at the Monarch omim repo, I see references to the OMIM API and this "download OMIM" page, so I guess the data in this HGNC ROBOT template is coming only from OMIM. That's the first thing for @joeflack4 or @matentzn to confirm.
If the data in this HGNC ROBOT file is only from OMIM, then we can go with Sabrina's comment:
if we get the information directly from OMIM, then we can use something like "MONDO:OMIM" (or whatever source we have to say that something comes from OMIM; I don't have a strong opinion about what to name it, but I can make a name up).
(from Trish: MONDO:OMIM fits the pattern I see for GARD and NORD so +1 from me)
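To make the source options concrete, here is a rough sketch of how template rows with both an OMIM identifier and a pipeline tag could be generated. Everything specific in it is an assumption: the column names (mondo_id, hgnc_id, source), the use of oboInOwl:source as the axiom annotation property, the combination of the two sources in one cell, and the placeholder IDs. The template strings follow ROBOT template syntax as I understand it and should be checked against the ROBOT documentation and the actual mondo_genes.robot.tsv.

```python
import csv
import io

# Hypothetical human-readable header; the real file may use other names.
HEADER = ["mondo_id", "hgnc_id", "source"]
# Assumed ROBOT template strings for the second row of the file.
TEMPLATE = [
    "ID",
    "SC 'has material basis in germline mutation in' some %",
    ">A oboInOwl:source SPLIT=|",
]


def build_template(rows, pipeline_source="MONDO:OMIM"):
    """Return template text for (mondo_id, hgnc_id, omim_id) rows.

    Each source cell holds the OMIM record plus a pipeline tag, separated
    by '|' to match the SPLIT=| option in the template row.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerow(TEMPLATE)
    for mondo_id, hgnc_id, omim_id in rows:
        writer.writerow([mondo_id, hgnc_id, f"{omim_id}|{pipeline_source}"])
    return buf.getvalue()
```

If we decide the OMIM identifier alone is enough, the pipeline tag can be dropped from the cell and the SPLIT option removed.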
The last thing I do not know is where the data currently in Mondo with 'has material basis in germline mutation in' came from (see example below) and, more importantly, whether we need to do anything about it.
For example, MONDO:0000208 has 'has material basis in germline mutation in' some TRMT10A with source MONDO:mim2gene_medgen.
I don't know whether the HGNC template data in this PR is in addition to, or intended to replace, the existing data. I also don't know whether the data in this ROBOT template and the existing data are from the same source and should therefore have the same source annotation. @matentzn do you know the answer to this?
Addresses sub-tasks in:
Related:
Overview
Update mondo_genes.csv to be a proper ROBOT template, and tie it into the pipeline for externally managed content.
Pre-merge checklist
Documentation
Was the documentation added/updated under docs/?
QC
Was the full pipeline run before submitting this PR using sh run.sh make build-mondo-ingest on this branch (after docker pull obolibrary/odkfull:dev), and no errors occurred?
Build PR: 564
New Packages
Were any new Python packages added?
Were any other non-Python packages added?
PR Review and Conversations Resolved
Has the PR been sufficiently reviewed by at least 1 team member of the Mondo Technical team and all threads resolved?
CC: @souzadevinicius Thought this would be a good one for you to review