geneontology / go-releases

Tasks and notes for monthly GO releases

ZFIN file 'broken' - Pascale to investigate to see if data is restored (IEAs and IBAs) Doug "We are stuck on the GOA GAF load because the file we get from GOA is broken so we are stuck with an old file until they can get us a new one." AI: wait for one release, check again ===>> After March 15th AI: Pascale: ask Doug for unmapped identifiers #27

Closed pgaudet closed 11 months ago

pgaudet commented 1 year ago

Emailed Doug after discovering that UniProt Danio annotations that were changed in July 2022 had still not propagated through the various pipelines. Doug's reply:

With regard to our pipeline, that annotation comes to us in our GOA GAF load. We are still working on updating our UniProt load and GOA GAF load, which depend on each other. We are making progress but haven't implemented the update to production yet. Once we do implement the update, we will pick up the updated annotation, but haven't done so quite yet. Sorry this is taking so long..it's a bit complex. We do have an updated GOA GAF file that seems largely correct. We've done some testing with it. Because the UniProt data set, our UniProt load and GOA GAF load are in flux it is hard to tell how the GOA GAF load will end up looking in ZFIN, but I'm pretty optimistic. Not sure when this will finally hit production, but we are much closer to finished than not. Best, Doug

pgaudet commented 1 year ago

Email to Doug 2023-10-05

Hi Doug,

How are you?

I have a question about the ZFIN pipeline – how do you update manual annotations from UniProt? I see that the file GO Central loads (gene_association2.2_automated_only.zfin) contains non-IEA annotations (as we discussed a few months back); however, these are not up to date. For example, this annotation

            ZFIN       ZDB-GENE-030131-7949                supt6h  involved_in         GO:0061086       PMID:23503590|ZFIN:ZDB-PUB-130405-14 IMP                        P             SPT6 homolog, histone chaperone and transcription elongation factor                pan|pandora|Spt6|wu:fj42h11  protein_coding_gene      taxon:7955          20130828            UniProt

was edited back in July 2022 via P2GO but that old annotation is still in your file. Is there any way your file can be updated? Are there still alignment issues with UniProt?
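A check like the one described in this email (non-IEA annotations assigned by UniProt that still carry an old date in the ZFIN file) could be sketched roughly as follows. This is only a minimal illustration, assuming the file is tab-separated GAF 2.2 with the standard 17 columns; the function names, the column keys, and the `cutoff` date are hypothetical, and an old date alone does not prove that an annotation is stale.

```python
# Column order of a GAF 2.2 row; the dict keys are illustrative names chosen here.
GAF_COLUMNS = [
    "db", "db_object_id", "db_object_symbol", "qualifier", "go_id",
    "db_reference", "evidence_code", "with_from", "aspect",
    "db_object_name", "db_object_synonym", "db_object_type",
    "taxon", "date", "assigned_by", "annotation_extension",
    "gene_product_form_id",
]


def parse_gaf_line(line):
    """Split one tab-separated GAF line into a dict; None for comment or blank lines."""
    if not line.strip() or line.startswith("!"):
        return None
    fields = line.rstrip("\n").split("\t")
    # Pad short rows so the optional trailing columns are always present.
    fields += [""] * (len(GAF_COLUMNS) - len(fields))
    return dict(zip(GAF_COLUMNS, fields))


def find_old_manual_annotations(lines, assigned_by="UniProt", cutoff="20220701"):
    """Return non-IEA rows from `assigned_by` dated before `cutoff` (YYYYMMDD).

    The cutoff is an assumption for illustration: rows older than it are only
    candidates to compare against the current UniProt/GOA data.
    """
    hits = []
    for line in lines:
        row = parse_gaf_line(line)
        if row is None:
            continue
        if (row["evidence_code"] != "IEA"
                and row["assigned_by"] == assigned_by
                and row["date"] < cutoff):  # YYYYMMDD strings sort chronologically
            hits.append(row)
    return hits


# The annotation quoted above, rebuilt with explicit tabs (With/From is empty).
EXAMPLE = "\t".join([
    "ZFIN", "ZDB-GENE-030131-7949", "supt6h", "involved_in", "GO:0061086",
    "PMID:23503590|ZFIN:ZDB-PUB-130405-14", "IMP", "", "P",
    "SPT6 homolog, histone chaperone and transcription elongation factor",
    "pan|pandora|Spt6|wu:fj42h11", "protein_coding_gene", "taxon:7955",
    "20130828", "UniProt", "", "",
])

if __name__ == "__main__":
    for row in find_old_manual_annotations(["!gaf-version: 2.2\n", EXAMPLE + "\n"]):
        print(row["db_object_symbol"], row["go_id"], row["evidence_code"], row["date"])
```

To run this against the real file, the list of lines would be replaced by an open file handle, for example `open("gene_association2.2_automated_only.zfin")`.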


Doug:

With regard to our pipeline, that annotation comes to us in our GOA GAF load. We are still working on updating our UniProt load and GOA GAF load, which depend on each other. We are making progress but haven't implemented the update to production yet. Once we do implement the update, we will pick up the updated annotation, but haven't done so quite yet. Sorry this is taking so long..it's a bit complex. We do have an updated GOA GAF file that seems largely correct. We've done some testing with it. Because the UniProt data set, our UniProt load and GOA GAF load are in flux it is hard to tell how the GOA GAF load will end up looking in ZFIN, but I'm pretty optimistic. Not sure when this will finally hit production, but we are much closer to finished than not.


Doug 2023-05-16

We just met to discuss our load updates and it sounds like several data loads (NCBI, UniProt, and GOA GAF loads) are all actively being worked on over the next month. I expect the earliest we would release the updates to production would be about 6 weeks from now. So it's still going to be a little while but it is on top of the priority list currently.

I don't think having GO Central load the IEAs and non-ZFIN manual annotations would help us much at the moment. Actually, that will probably make work for us in the short term because we will then need to turn off our GOA GAF load and maybe make a new load or update an existing one to get those annotations from GO Central instead of GOA.

Best, Doug

pgaudet commented 1 year ago

Update Date: Thursday, July 6, 2023

Subject: Re: ZFIN pipeline

Hi Pascale,

Just a follow up. We are nearly ready to release our updated UniProt load. That means any IEA annotations we create locally will get an updated date stamp, and we will get a fresh GOA GAF load done as well so any problematic annotations we had because our GOA GAF load hadn't run in a long time will be fixed. The end is near. 😉

Doug

pgaudet commented 12 months ago

Emailed Doug to follow up.

pgaudet commented 11 months ago

The question was specifically about InterPro mappings, but now we have 63K mappings for Danio, so I suppose this data is restored.