kltm closed 5 years ago
@vanaukenk @ukemi Would either of you know the best place or person to contact for this item?
@deustp01 ?
I think I see how we did this - it looks like an error in parsing our catalyst instances to generate molecular function lines for the GAF we submit to GO that we recently fixed. I will check this and let you know.
Checking complete - this annotation was due to an error in how we generated molecular function annotations when the active entity is one subunit of a larger complex. There are probably other similar errors; all should be fixed when our patched GAF generation script runs on our next release in a few weeks.
@deustp01 Great--thank you so much for your help.
@deustp01 I still see this annotation. Shouldn't it have been removed months ago? http://amigo.geneontology.org/amigo/gene_product/UniProtKB:P08700
IL3 | Interleukin-3 | | protein tyrosine kinase activity | | Reactome | Homo sapiens | TAS | | | protein | | Reactome:R-HSA-879909 | 20100714
Indeed, should have been fixed. We are checking to see what went wrong. Peter
We think we've found the gap in our filtering that let this one through - should be fixed in time for our December 2018 release.
@deustp01 Thanks for looking into this!
@kltm Any idea why this is still in Amigo? I can't find this annotation in release or snapshot, but it's still here:
http://amigo.geneontology.org/amigo/gene_product/UniProtKB:P08700
Also, the release reactome.gaf has 27 lines for P08700 IL3_HUMAN, but there are 29 Reactome annotations for this protein in AmiGO. The other annotation in AmiGO that is not in the GAFs I just downloaded is:
Ras guanyl-nucleotide exchange factor activity ... Reactome:R-HSA-5672965 | 20150206
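The comparison described above (lines in a downloaded GAF versus annotations shown elsewhere) can be sketched as a set difference over annotation keys. This is only a minimal sketch, assuming tab-separated GAF 2.x lines; `annotation_keys` is a hypothetical helper, not part of any GO tooling, and lowercasing the reference is an assumption made here because the `Reactome:` prefix appears with different casing in different files.

```python
def annotation_keys(gaf_lines, object_id):
    """Collect (GO ID, reference, assigned_by) keys for one gene product.

    Assumes tab-separated GAF 2.x columns:
    2 = DB Object ID, 5 = GO ID, 6 = DB:Reference, 15 = Assigned By.
    """
    keys = set()
    for line in gaf_lines:
        # Skip blank lines and "!" header/comment lines.
        if not line.strip() or line.startswith("!"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 15 or cols[1] != object_id:
            continue
        # Lowercase the reference so "Reactome:..." and "REACTOME:..." match
        # (an assumption of this sketch, not a documented GO rule).
        keys.add((cols[4], cols[5].lower(), cols[14]))
    return keys


def extra_annotations(loaded_lines, release_lines, object_id):
    """Keys present in the loaded data but absent from the release GAF."""
    return annotation_keys(loaded_lines, object_id) - annotation_keys(
        release_lines, object_id
    )
```

Calling `extra_annotations(loaded, release, "P08700")` on two lists of GAF lines would return whatever keys appear only in the first source, which is one way to pin down annotations like the one above.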
Reactome is double-ingested, natively and via Reactome->GOA->GOC
$ curl -L -s http://release.geneontology.org/2019-01-01/products/annotations/goa_human-src.gaf.gz | gzip -dc | grep P08700 | grep 20150206
UniProtKB P08700 IL3 GO:0005088 Reactome:R-HSA-5672965 TAS F Interleukin-3 IL3 protein taxon:9606 20150206 Reactome
We need to decide: do we remove Reactome from the pipeline and assume we get these annotations from GOA, or do we filter them out of GOA? We can discuss this on the next managers call.
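For the second option (filtering from GOA), a rough sketch of what dropping Reactome-assigned lines from a GOA GAF might look like is below. It assumes tab-separated GAF 2.x lines with Assigned By in column 15; `drop_assigned_by` is a hypothetical name, not an existing pipeline function, and the real pipeline may well do this differently.

```python
def drop_assigned_by(gaf_lines, source):
    """Yield GAF lines, skipping annotations whose Assigned By equals `source`.

    Header/comment lines (starting with "!") are passed through unchanged.
    Assumes tab-separated GAF 2.x, where column 15 is Assigned By.
    """
    for line in gaf_lines:
        if line.startswith("!"):
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 15 and cols[14] == source:
            # Annotation came from `source`; leave it to the native ingest.
            continue
        yield line
```

With this approach the natively ingested Reactome file would be the only source of Reactome annotations, so a stale copy carried in the GOA file could not reintroduce a line that Reactome has already removed.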
Added to Manager_Call_2019-01-23 agenda.
@alexsign - can you confirm what GOA does here - does it drop and reload Reactome each release?
@cmungall - Hi Chris, yes, the GOA database drops and re-uploads annotations from external sources on a daily basis. For Reactome, we download gene_association.reactome.gz from here:
ftp://ftp.geneontology.org/pub/go/gene-associations/submission/
In the file downloaded last night:
UniProtKB P08700 IL3_HUMAN GO:0005576 REACTOME:R-HSA-5672965 TAS C protein taxon:9606 20130516 Reactome
UniProtKB P08700 IL3_HUMAN GO:0005576 REACTOME:R-HSA-879909 TAS C protein taxon:9606 20130516 Reactome
The GOA database data is released monthly on our FTP site and updated weekly in QuickGO.
There's a known scheduling feature here. The file that GOA downloads is computed from our released content and that content only changes once every three months when we issue a new release, so the content has not changed since December 13, 2018 when our last release went live and will not change again until March 20, 2019 when our next release is scheduled to go live.
However, we were supposed to fix the known error in our GAF generation script that is causing this mis-annotation in plenty of time to get a correct annotation into the December 2018 file, so I will need to find out what went wrong, again! This is getting embarrassing. Apologies.
Before I interrogate the Reactome people responsible for the GAF file to find out why the problem is continuing and whether we can finally fix it, can someone tell me specifically what the current complaint is? Are you upset that we've associated protein P08700 with cell_component GO:0005576 (extracellular region), as in Alex's comment from 20 minutes ago, or that it's associated with GO:0005088 (Ras guanyl-nucleotide exchange factor activity), as in Chris's comment from 6 days ago, or both?
The annotation to GO:0005088 (Ras guanyl-nucleotide exchange factor activity) is a known error that should have been fixed a long time ago; the annotation to cell_component GO:0005576 (extracellular region) is a new issue.
Thanks
@deustp01 I don't think this ticket has anything for Reactome to do currently. I think this is now about how the GOA/Reactome GAFs are treated by GOC. No action is needed from you at this point; it looks like the originally questioned annotation is no longer present in Reactome's GAF. We'll tag you if we need anything. Thank you!
Great! I mis-read and thought we were still generating the old wrong annotations.
I see this annotation is now gone, and the Reactome discussion has been mentioned in Managers calls. Closing.
Initially @jpinero from on #95 :
I was wondering if the annotation of:
IL3 | Interleukin-3 | | protein tyrosine kinase activity | | Reactome | Homo sapiens | TAS
is correct.