geneontology / helpdesk

The Gene Ontology Helpdesk
http://help.geneontology.org

IL3 annotation as protein tyrosine kinase activity #97

Closed kltm closed 5 years ago

kltm commented 6 years ago

Originally raised by @jpinero in #95:

I was wondering if the annotation of:

IL3 | Interleukin-3 | | protein tyrosine kinase activity | | Reactome | Homo sapiens | TAS

is correct.

kltm commented 6 years ago

@vanaukenk @ukemi Would either of you know the best place or person to contact for this item?

ukemi commented 6 years ago

@deustp01 ?

deustp01 commented 6 years ago

I think I see how we did this - it looks like an error in parsing our catalyst instances to generate molecular function lines for the GAF we submit to GO that we recently fixed. I will check this and let you know.

deustp01 commented 6 years ago

Checking complete - this annotation was due to an error in how we generated molecular function annotations when the active entity is one subunit of a larger complex. There are probably other similar errors; all should be fixed when our patched GAF generation script runs on our next release in a few weeks.

kltm commented 6 years ago

@deustp01 Great--thank you so much for your help.

suzialeksander commented 5 years ago

@deustp01 I still see this annotation. Shouldn't it have been removed months ago? http://amigo.geneontology.org/amigo/gene_product/UniProtKB:P08700

IL3 | Interleukin-3 |   | protein tyrosine kinase activity |   | Reactome | Homo sapiens | TAS |   |   | protein |   | Reactome:R-HSA-879909 | 20100714

deustp01 commented 5 years ago

Indeed, should have been fixed. We are checking to see what went wrong. Peter

deustp01 commented 5 years ago

We think we've found the gap in our filtering that let this one through - should be fixed in time for our December 2018 release.

suzialeksander commented 5 years ago

@deustp01 Thanks for looking into this!

suzialeksander commented 5 years ago

@kltm Any idea why this is still in AmiGO? I can't find this annotation in the release or the snapshot, but it's still here:

http://amigo.geneontology.org/amigo/gene_product/UniProtKB:P08700

Also, the release reactome.gaf has 27 lines for P08700 IL3_HUMAN, but there are 29 Reactome annotations for this protein in AmiGO. The other annotation that is in AmiGO but not in the GAFs I just downloaded is:

Ras guanyl-nucleotide exchange factor activity ... Reactome:R-HSA-5672965 | 20150206
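A count mismatch like 27 vs 29 is easier to chase when the two sources are diffed on a key rather than counted. The following is a minimal sketch, assuming both sides can be obtained as GAF 2.x text with tab-separated columns. The function names, the choice of key (GO ID, reference, evidence code), and the upper-casing of the reference are illustrative assumptions, not part of any GO pipeline. The sample rows are abbreviated from lines quoted in this thread, with empty strings for columns not shown.

```python
from typing import Iterable, Set, Tuple

Key = Tuple[str, str, str]  # (GO ID, reference, evidence code)


def annotation_keys(gaf_lines: Iterable[str], object_id: str) -> Set[Key]:
    """Collect comparison keys for one gene product from GAF 2.x lines."""
    keys: Set[Key] = set()
    for line in gaf_lines:
        if line.startswith("!"):  # GAF header/comment lines
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 15 or cols[1] != object_id:
            continue
        # Column 5 = GO ID, column 6 = DB:Reference, column 7 = evidence code.
        # Upper-casing the reference is an assumption, made because the thread
        # shows both "Reactome:" and "REACTOME:" prefixes.
        keys.add((cols[4], cols[5].upper(), cols[6]))
    return keys


def gaf_row(object_id, symbol, go_id, ref, evidence, aspect, date, assigned_by):
    """Build a 15-column GAF row for the example (hypothetical helper)."""
    return "\t".join([
        "UniProtKB", object_id, symbol, "", go_id, ref, evidence, "",
        aspect, "", "", "protein", "taxon:9606", date, assigned_by,
    ])


# What the browser side appears to hold (two annotations).
loaded = [
    gaf_row("P08700", "IL3", "GO:0005088", "Reactome:R-HSA-5672965",
            "TAS", "F", "20150206", "Reactome"),
    gaf_row("P08700", "IL3_HUMAN", "GO:0005576", "REACTOME:R-HSA-879909",
            "TAS", "C", "20130516", "Reactome"),
]
# What the downloaded GAF holds (one annotation).
released = [
    "!gaf-version: 2.1",
    gaf_row("P08700", "IL3_HUMAN", "GO:0005576", "REACTOME:R-HSA-879909",
            "TAS", "C", "20130516", "Reactome"),
]

extra = annotation_keys(loaded, "P08700") - annotation_keys(released, "P08700")
for go_id, ref, evidence in sorted(extra):
    print(go_id, ref, evidence)
```

The set difference lists exactly the annotations present on one side only, which is more informative than comparing line counts.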

cmungall commented 5 years ago

Reactome is double-ingested, natively and via Reactome->GOA->GOC

$ curl -L -s http://release.geneontology.org/2019-01-01/products/annotations/goa_human-src.gaf.gz | gzip -dc | grep P08700 | grep 20150206 
UniProtKB       P08700  IL3             GO:0005088      Reactome:R-HSA-5672965  TAS             F       Interleukin-3   IL3     protein taxon:9606  20150206 Reactome

cmungall commented 5 years ago

We need to decide: do we remove Reactome from the pipeline and assume we get these annotations from GOA, or do we filter them out of GOA? We can discuss this on the next managers call.
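For illustration, the "filter from GOA" option could look roughly like the sketch below. It assumes GAF 2.x input where column 15 (Assigned By) identifies the original source; `drop_assigned_by` is a hypothetical name, not an existing pipeline function, and the second sample row (`ExampleDB`, `PMID:0000000`) is made up for the example.

```python
from typing import Iterable, Iterator


def drop_assigned_by(gaf_lines: Iterable[str], source: str = "Reactome") -> Iterator[str]:
    """Yield GAF lines, skipping annotations whose Assigned By column matches source."""
    wanted = source.lower()
    for line in gaf_lines:
        if line.startswith("!"):  # keep header/comment lines untouched
            yield line
            continue
        cols = line.rstrip("\n").split("\t")
        # Column 15 (index 14) is Assigned By in GAF 2.x.
        if len(cols) >= 15 and cols[14].lower() == wanted:
            continue
        yield line


def row(go_id: str, ref: str, assigned_by: str) -> str:
    """Build a 15-column GAF row for P08700 (hypothetical helper)."""
    return "\t".join([
        "UniProtKB", "P08700", "IL3", "", go_id, ref, "TAS", "",
        "F", "", "", "protein", "taxon:9606", "20150206", assigned_by,
    ])


goa_lines = [
    "!gaf-version: 2.1",
    row("GO:0005088", "Reactome:R-HSA-5672965", "Reactome"),
    row("GO:0005088", "PMID:0000000", "ExampleDB"),  # made-up row
]

kept = list(drop_assigned_by(goa_lines))
print(len(kept))
```

With this approach the native Reactome GAF stays the only route for Reactome annotations; the alternative of dropping the native file needs no filtering but ties the content to GOA's reload schedule.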

suzialeksander commented 5 years ago

Added to Manager_Call_2019-01-23 agenda.

cmungall commented 5 years ago

@alexsign - can you confirm what GOA does here - does it drop and reload Reactome each release?

alexsign commented 5 years ago

@cmungall - Hi Chris, yes, the GOA database drops and re-uploads annotations from external sources on a daily basis. For Reactome, we download gene_association.reactome.gz from here:

ftp://ftp.geneontology.org/pub/go/gene-associations/submission/

In the file downloaded last night:

UniProtKB P08700 IL3_HUMAN GO:0005576 REACTOME:R-HSA-5672965 TAS C protein taxon:9606 20130516 Reactome
UniProtKB P08700 IL3_HUMAN GO:0005576 REACTOME:R-HSA-879909 TAS C protein taxon:9606 20130516 Reactome

The GOA database data is released monthly on our FTP site and updated weekly in QuickGO.

deustp01 commented 5 years ago

There's a known scheduling feature here. The file that GOA downloads is computed from our released content and that content only changes once every three months when we issue a new release, so the content has not changed since December 13, 2018 when our last release went live and will not change again until March 20, 2019 when our next release is scheduled to go live.

However, we were supposed to fix the known error in our GAF generation script that is causing this mis-annotation in plenty of time to get a correct annotation into the December 2018 file, so I will need to find out what went wrong, again! This is getting embarrassing. Apologies.

deustp01 commented 5 years ago

Before I interrogate the Reactome people responsible for the GAF file to find out why the problem is continuing and whether we can finally fix it, can someone tell me specifically what the current complaint is? Are you upset that we've associated protein P08700 with cell_component GO:0005576 (extracellular region), as in Alex's comment from 20 minutes ago, or that it's associated with GO:0005088 (Ras guanyl-nucleotide exchange factor activity), as in Chris's comment from 6 days ago, or both?

The annotation to GO:0005088 (Ras guanyl-nucleotide exchange factor activity) is a known error that should have been fixed a long time ago; the annotation to cell_component GO:0005576 (extracellular region) is a new issue.

Thanks

suzialeksander commented 5 years ago

@deustp01 I don't think this ticket has anything for Reactome to do currently. I think this is now about how the GOA/Reactome GAFs are treated by GOC. No action is needed from you at this point; it looks like the originally questioned annotation is no longer present in Reactome's GAF. We'll tag you if we need anything. Thank you!

deustp01 commented 5 years ago

Great! I mis-read and thought we were still generating the old wrong annotations.

suzialeksander commented 5 years ago

I see this annotation is now gone, and the Reactome discussion has been mentioned in Managers calls. Closing.