geneontology / go-annotation

This repository hosts the tracker for issues pertaining to GO annotations.
BSD 3-Clause "New" or "Revised" License

ARBA00026354, ARBA00028243 and ARBA00028932 (uninformative high-level processes, also incorrect) #4115

ValWood closed 1 year ago

ValWood commented 2 years ago

I would not make either of these annotations for ksg1, a 3-phosphoinositide-dependent protein kinase:

GO:1901564 | organonitrogen compound metabolic process | IEA with ARBA00026354 | GO_REF:0000117 | 1624

GO:0044238 | primary metabolic process | IEA with ARBA00028243 | GO_REF:0000117

They are too high level and it isn't really clear where they come from.

They seem to be generated by some machine learning method that is applied to uncurated entries, but this type of annotation doesn't appear to be useful for either poorly studied or well-studied proteomes.

These are the first ones I came across so I will check some others but these should be removed.

@pgaudet @Antonialock do you know anything about these? It would be useful if everyone was informed about new methods so they could review.

ValWood commented 2 years ago

~Also block GO:0043170 | macromolecule metabolic process | IEA with ARBA00026955. It might be true (over 50% of pombe proteins are), but I would rather keep it as "unknown".~

ValWood commented 2 years ago

Block for direct annotation: GO:0071704 | organic substance metabolic process | IEA with ARBA00028932 (1 manual annotation, FB). Action: merge with parent.

Action: block for direct annotation: GO:0043170 | macromolecule metabolic process | IEA with ARBA00026955 (1 direct IEP). It has only 23 IEA annotations, 16 of them from ARBA.

GO:1901564 | organonitrogen compound metabolic process

GO:0044238 | primary metabolic process | 67 annotations (1 EXP, 66 IEA, mainly from ARBA)

ValWood commented 2 years ago

More terms that will be flagged as "do not annotate/not for direct annotation"

GO:0043231 intracellular membrane-bounded organelle (it is also not clear that this was correct for ucp12)

GO:0110165 cellular anatomical entity (it is also not clear that this was correct for dpp2, which is a protease; a protease would usually act on an anatomical entity rather than be part of one). Note that the definition of "anatomical entity" specifies "with granularity above the level of a protein complex".
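As an illustration of how the "not for direct annotation" flag discussed above could be applied, here is a minimal sketch in Python. It assumes annotation rows in the pipe-delimited form quoted in this thread (`GO id | term name | evidence with source | reference`). The names `NOT_FOR_DIRECT_ANNOTATION`, `parse_row` and `flag_rows` are hypothetical, and the third example row is made up to show a term that is not flagged. This is not how any GO pipeline actually implements the check.

```python
# Terms proposed in this thread as "do not annotate / not for direct annotation".
NOT_FOR_DIRECT_ANNOTATION = {
    "GO:0071704",  # organic substance metabolic process
    "GO:0043170",  # macromolecule metabolic process
    "GO:1901564",  # organonitrogen compound metabolic process
    "GO:0044238",  # primary metabolic process
    "GO:0043231",  # intracellular membrane-bounded organelle
    "GO:0110165",  # cellular anatomical entity
}


def parse_row(row):
    """Split one 'GO id | name | evidence with source | reference' row into a dict."""
    fields = [field.strip() for field in row.split("|")]
    evidence, _, with_from = fields[2].partition(" with ")
    return {
        "go_id": fields[0],
        "name": fields[1],
        "evidence": evidence,
        "with": with_from,
        "reference": fields[3] if len(fields) > 3 else "",
    }


def flag_rows(rows):
    """Return parsed IEA rows whose term is on the block list."""
    parsed = (parse_row(row) for row in rows)
    return [
        r for r in parsed
        if r["go_id"] in NOT_FOR_DIRECT_ANNOTATION and r["evidence"] == "IEA"
    ]


rows = [
    # The two ksg1 rows quoted at the top of this issue.
    "GO:1901564 | organonitrogen compound metabolic process | IEA with ARBA00026354 | GO_REF:0000117",
    "GO:0044238 | primary metabolic process | IEA with ARBA00028243 | GO_REF:0000117",
    # Hypothetical row (not from this thread): a specific term that is not flagged.
    "GO:0004674 | protein serine/threonine kinase activity | IEA with HYPOTHETICAL_RULE | GO_REF:0000117",
]

flagged = flag_rows(rows)
for r in flagged:
    print(r["go_id"], r["name"], "from", r["with"])
```

Only the two high-level process rows are reported; the hypothetical kinase activity row passes through untouched.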

PedroRaposo commented 1 year ago

Hello, protein ksg1 (https://www.pombase.org/gene/SPCC576.15c), or Q12701 (https://www.uniprot.org/uniprotkb/Q12701/entry), no longer has the two GO terms GO:1901564 (organonitrogen compound metabolic process) and GO:0044238 (primary metabolic process). You can verify this in QuickGO (https://www.ebi.ac.uk/QuickGO/annotations?geneProductId=Q12701).

Using the same resources, I can see that ucp12 no longer has the GO term GO:0043231 (https://www.ebi.ac.uk/QuickGO/annotations?geneProductId=O94536).

And dpp2 does not have GO:0110165 (https://www.ebi.ac.uk/QuickGO/annotations?geneProductId=O14073).
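The same checks could be scripted rather than done by hand in the browser. The sketch below is only an outline. It assumes that QuickGO exposes a JSON annotation search service at the URL held in `QUICKGO_SEARCH`, that it accepts `geneProductId`, `goId` and `limit` parameters, and that each hit in the `results` list carries a `goId` field. Those details should be checked against the QuickGO API documentation before relying on it. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint of the QuickGO annotation search service (verify before use).
QUICKGO_SEARCH = "https://www.ebi.ac.uk/QuickGO/services/annotation/search"

# (gene product, GO term) pairs mentioned in this thread.
CHECKS = [
    ("Q12701", "GO:1901564"),  # ksg1
    ("Q12701", "GO:0044238"),  # ksg1
    ("O94536", "GO:0043231"),  # ucp12
    ("O14073", "GO:0110165"),  # dpp2
]


def build_query(gene_product_id, go_id):
    """Build the search URL for one gene product and one GO term."""
    params = {"geneProductId": gene_product_id, "goId": go_id, "limit": 100}
    return QUICKGO_SEARCH + "?" + urllib.parse.urlencode(params)


def has_annotation(response, go_id):
    """True if any hit in the (assumed) 'results' list is annotated directly to go_id."""
    return any(hit.get("goId") == go_id for hit in response.get("results", []))


def fetch(url):
    """Fetch and decode a JSON response; performs a network request."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=30) as reply:
        return json.load(reply)


def report(checks=CHECKS):
    """Print whether each gene product still carries the given term (needs network access)."""
    for gene_product_id, go_id in checks:
        response = fetch(build_query(gene_product_id, go_id))
        status = "still present" if has_annotation(response, go_id) else "absent"
        print(gene_product_id, go_id, status)
```

Calling `report()` would run the four lookups. The exact-match test in `has_annotation` is deliberate: if the service also returns annotations to descendant terms, those are not counted as the direct annotation in question.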