Closed sierra-moxon closed 9 months ago
added this check via PR: https://github.com/geneontology/gopreprocess/pull/7
Hi @sierra-moxon
My recent reviews lead me to believe that you have done this?
@LiNiMGI I believe this is ready for the MGI side?
if @sierra-moxon has done this, this one can be closed.
thanks @LiNiMGI!
First round of QC on the new GOC-generated human->mouse GAF file (via orthology): https://drive.google.com/drive/folders/1uICd7pxqre6hwtKNnMV5NKTPfcgR9xGy
About 96K annotations match between the new GOC file and the MGI-generated file straight away. Some annotations appear in the new GOC file but were rejected from the MGI file, because the UniProt identifier does not have a 1:1 correspondence to an MGI marker and the annotation is a biological process annotation.
For example:
The UniProt identifier is associated with 2 markers in MGI, H2-Eb1 (MGI:95901) and H2-Eb2 (MGI:95902). Since paralogs in an organism are often involved in different processes, MGI conservatively does not make biological process annotations for any gene that maps to more than one mouse gene.
We need to add this logic to the code at GOC.
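
A rough sketch of what that rule might look like, in case it helps the discussion. This is not the gopreprocess code: the function name `drop_one_to_many_process`, the dict-based annotation records, and the `uniprot_to_mgi` lookup are all hypothetical, and the UniProt identifier below is a placeholder. The only things taken from the thread and the GAF format are the two MGI marker IDs and the aspect code `P` for biological process.

```python
# Hypothetical sketch: skip biological process (aspect "P") annotations
# whose UniProt identifier maps to more than one MGI marker.

def drop_one_to_many_process(annotations, uniprot_to_mgi):
    """Return the annotations that pass the MGI-style 1:1 rule.

    annotations    -- list of dicts with "uniprot_id", "go_id", "aspect" keys
                      (an assumed shape, not the real gopreprocess model)
    uniprot_to_mgi -- dict mapping a UniProt ID to a set of MGI marker IDs
    """
    kept = []
    for ann in annotations:
        markers = uniprot_to_mgi.get(ann["uniprot_id"], set())
        # Only process annotations are rejected; function ("F") and
        # component ("C") annotations are left alone in this sketch.
        if ann["aspect"] == "P" and len(markers) > 1:
            continue
        kept.append(ann)
    return kept


# Placeholder identifier; the two markers are the ones from the example above.
uniprot_to_mgi = {
    "UniProtKB:PLACEHOLDER1": {"MGI:95901", "MGI:95902"},
    "UniProtKB:PLACEHOLDER2": {"MGI:0000000"},  # made-up 1:1 mapping
}

annotations = [
    {"uniprot_id": "UniProtKB:PLACEHOLDER1", "go_id": "GO:0008150", "aspect": "P"},
    {"uniprot_id": "UniProtKB:PLACEHOLDER1", "go_id": "GO:0003674", "aspect": "F"},
    {"uniprot_id": "UniProtKB:PLACEHOLDER2", "go_id": "GO:0008150", "aspect": "P"},
]

filtered = drop_one_to_many_process(annotations, uniprot_to_mgi)
```

With this input, only the process annotation on the identifier with two markers is dropped. Whether function and component annotations should still be kept for one-to-many mappings is an assumption here and would need confirming against the MGI rules.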