geneontology / go-annotation

This repository hosts the tracker for issues pertaining to GO annotations.
BSD 3-Clause "New" or "Revised" License

Change annotations using 'has_direct_input' to 'has_input' extensions #2582

Open pgaudet opened 5 years ago

pgaudet commented 5 years ago

Hello,

The GO-CAM specifications use only 'has_input'; not 'has_direct_input'.

We will get the annotations updated in Protein2GO; for groups not using this extension, please update your annotations.

Impacted groups:

AgBase ARUK-UCL BHF-UCL CAFA dictyBase FlyBase GO_Central HGNC MGI ParkinsonsUK-UCL PomBase SGD UniProt WB

AFAIK, only MGI and PomBase in this list do not use Protein2GO. https://docs.google.com/spreadsheets/d/1cOaBQrdW4fYfc3vVQlPWvAfFLsP_ItiPnqgKtvahUtU/edit#gid=645618912

Thanks, Pascale

ukemi commented 5 years ago

MGI done.

pgaudet commented 4 years ago

Hello,

In QuickGO there are still about 150 annotations using 'has_direct_input': https://docs.google.com/spreadsheets/d/1FpQY9u6kCgAKFcr79a1i_inq8vLXV5zZWGkZ7VCE5ow/edit#gid=0

Assigned by  
ARUK-UCL 2 @RLovering
FlyBase 5 @hattrill
GO_Central 1 (fixed)
pombase via GOC-OWL 16 @ValWood Can you look at those ?
MGI 106 @ukemi
SGD 28 @srengel
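
A per-group tally like the one above could be reproduced from a GAF file with a short script. This is only a sketch, assuming GAF 2.x tab-separated input in which column 15 is Assigned_By and column 16 is the annotation extension; the function name `count_direct_input_by_group` is hypothetical and is not part of QuickGO or Protein2GO.

```python
from collections import Counter


def count_direct_input_by_group(gaf_lines):
    """Count annotations per Assigned_By whose extension uses has_direct_input.

    Assumes GAF 2.x layout: tab-separated, column 15 = Assigned_By,
    column 16 = Annotation Extension. Hypothetical helper for illustration.
    """
    counts = Counter()
    for line in gaf_lines:
        # Skip GAF header/comment lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 16:
            continue  # malformed or pre-2.0 row without an extension column
        assigned_by, extension = cols[14], cols[15]
        if "has_direct_input(" in extension:
            counts[assigned_by] += 1
    return counts
```

It can be fed any iterable of lines, for example an open file handle: `count_direct_input_by_group(open("annotations.gaf"))`, where the file name is a placeholder.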

@alexsign Can you update UCL, FlyBase and SGD annotations ?

Thanks, Pascale

hattrill commented 4 years ago

I quickly checked and fixed ours.

alexsign commented 4 years ago

@pgaudet ARUK-UCL and SGD are done. Still in the database: MGI 106, GOOI 16

ukemi commented 4 years ago

This is redundant. @dustine32 will make this conversion as part of the import.

mah11 commented 4 years ago

In light of @ukemi's comment, we won't worry about it for this ticket, but for future reference:

alexsign commented 4 years ago

@ukemi will they disappear from http://www.informatics.jax.org/downloads/reports/mgi.gpa ?

ukemi commented 4 years ago

Only after we finish with the round tripping, but it made sense to us that in cases like this, rather than go in and hand-edit all of our annotations, we would just fix it computationally when the pipeline is in place. @vanaukenk, please correct me if I am mistaken.
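
A computational fix of this kind can be quite small. The following is a minimal sketch, assuming the extensions are plain strings in the usual `relation(target)` form; `normalize_extension` is a hypothetical name and this is not the actual import pipeline code.

```python
import re

# Match the relation name only where it starts a relation expression,
# so other relation names in the same extension are left untouched.
_DIRECT_INPUT = re.compile(r"\bhas_direct_input\(")


def normalize_extension(extension):
    """Rewrite has_direct_input(...) to has_input(...) in an extension string.

    Hypothetical helper for illustration; other relations are unchanged.
    """
    return _DIRECT_INPUT.sub("has_input(", extension)


# Example using an extension string quoted elsewhere in this ticket.
print(normalize_extension(
    "happens_during(GO:0000080),has_direct_input(GO:0005680),part_of(GO:1905786)"
))
```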

mah11 commented 4 years ago

What does "pombase via GOC-OWL" mean?

it is data extracted from GO-CAM models converted to gpad.

In that case, the annotations are not assigned by PomBase.

alexsign commented 4 years ago

@mah11 Please ignore my previous comments; I was looking at a different pipeline. The actual data is coming from the inferences pipeline.

mah11 commented 4 years ago

@alexsign - Ah, OK; tho' it still looks like something we can't address directly, since the annotations we submit don't use has_direct_input.

alexsign commented 4 years ago

@mah11 the file I'm getting is from here: http://build.berkeleybop.org/view/GAF/job/gaf-check-pombase/lastSuccessfulBuild/artifact/gene_association.pombase.inf.gaf

mah11 commented 4 years ago

Thanks, @alexsign. I guess we'll have to let the relevant people improve that inference pipeline.

pgaudet commented 4 years ago

@mah11 Those come from the ontology. But it's so weird - the inferences state that the process is part of the same process:

| UniProt ID | Gene name | GOID | label | Assigned by | extension |
| --- | --- | --- | --- | --- | --- |
| O13286 | srw1 | GO:1905786 | positive regulation of anaphase-promoting complex-dependent catabolic process | GOC-OWL | happens_during(GO:0000080),has_direct_input(GO:0005680),part_of(GO:1905786) |
| Q9Y703 | alp31 | GO:0007023 | post-chaperonin tubulin folding pathway | GOC-OWL | has_direct_input(PomBase:SPBC26H8.07c),part_of(GO:0007023) |
| O74476 | sal3 | GO:0006606 | protein import into nucleus | GOC-OWL | has_direct_input(PomBase:SPAC24H6.05),part_of(GO:0006606) |
| P32587 | pyp3 | GO:0010971 | positive regulation of G2/M transition of mitotic cell cycle | GOC-OWL | has_direct_input(PomBase:SPBC11B10.09),part_of(GO:0010971) |
| P14068 | xpo1 | GO:0006611 | protein export from nucleus | GOC-OWL | happens_during(GO:0071280),has_direct_input(PomBase:SPAC31A2.11c),part_of(GO:0006611) |
| Q10164 | rga2 | GO:0035024 | negative regulation of Rho protein signal transduction | GOC-OWL | has_direct_input(PomBase:SPAC16.01),part_of(GO:0035024) |
| O13924 | hrk1 | GO:0072356 | chromosome passenger complex localization to kinetochore | GOC-OWL | has_direct_input(PomBase:SPBC1105.11c),part_of(GO:0072356) |
| P46595 | ubc4 | GO:0006511 | ubiquitin-dependent protein catabolic process | GOC-OWL | has_direct_input(PomBase:SPBC2A9.04c),part_of(GO:0006511) |
| Q00619 | mam2 | GO:0000750 | pheromone-dependent signal transduction involved in conjugation with cellular fusion | GOC-OWL | has_direct_input(PomBase:SPCC1795.06),part_of(GO:0000750) |
| O94537 | ppk4 | GO:0036498 | IRE1-mediated unfolded protein response | GOC-OWL | has_direct_input(PomBase:SPAC22A12.15c),part_of(GO:0036498) |
| Q10156 | lkh1 | GO:2000134 | negative regulation of G1/S transition of mitotic cell cycle | GOC-OWL | has_direct_input(PomBase:SPBC32F12.09),part_of(GO:2000134) |
| Q09763 | gef1 | GO:2000784 | positive regulation of establishment of cell polarity regulating cell shape | GOC-OWL | has_direct_input(PomBase:SPAC110.03),part_of(GO:2000784) |
| O60175 | SPBC21H7.06c | GO:0006606 | protein import into nucleus | GOC-OWL | has_direct_input(PomBase:SPCC1739.13),part_of(GO:0006606) |
| P04551 | cdc2 | GO:0031031 | positive regulation of septation initiation signaling | GOC-OWL | has_direct_input(PomBase:SPAC222.10c),part_of(GO:0031031) |
| Q9Y7J8 | SPBC216.01c | GO:0045875 | negative regulation of sister chromatid cohesion | GOC-OWL | has_direct_input(PomBase:SPBC26H8.05c),part_of(GO:0045875) |
| Q9HGN5 | SPBC36B7.05c | GO:0071629 | cytoplasm protein quality control by the ubiquitin-proteasome system | GOC-OWL | has_direct_input(PomBase:SPAC17G8.12),part_of(GO:0071629) |

@cmungall @kltm this seems like a bug ???

mah11 commented 4 years ago

From the first few, it looks like our manually curated extensions are being copied over unchanged, even when one of the extensions uses the same GO ID/term as the inferred annotation. E.g. we have these annotations (minimal columns shown):

| Gene name | GOID | extension |
| --- | --- | --- |
| srw1 | GO:1990757 | happens_during(GO:0000080),has_input(GO:0005680),part_of(GO:1905786) |
| alp31 | GO:0044183 | has_input(PomBase:SPBC26H8.07c),part_of(GO:0007023) |
| sal3 | GO:0061608 | has_input(PomBase:SPAC24H6.05),part_of(GO:0006606) |

So I'm in the "it's a bug" camp ...
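
For anyone who wants to screen an inferred file for this pattern, here is a rough sketch of a check that flags extensions whose target is the annotation's own GO ID. It assumes extensions are strings of comma-separated `relation(target)` expressions, as in the rows quoted in this ticket; `self_referential_relations` is a hypothetical helper, not part of any GO tool.

```python
import re

# One relation expression: a relation name followed by a parenthesised target.
_EXT = re.compile(r"(\w+)\(([^)]+)\)")


def self_referential_relations(go_id, extension):
    """Return the relations in `extension` whose target is `go_id` itself.

    Hypothetical helper for illustration. A non-empty result means the
    annotation points back at its own term, e.g. part_of(<same GO ID>).
    """
    return [rel for rel, target in _EXT.findall(extension) if target == go_id]


# Inferred srw1 row from this ticket: part_of points at the annotated term.
print(self_referential_relations(
    "GO:1905786",
    "happens_during(GO:0000080),has_direct_input(GO:0005680),part_of(GO:1905786)",
))
```

Run against the manually curated srw1 annotation (GO:1990757 with the same part_of target), the same function returns an empty list, which fits the reading that the extensions were copied over unchanged onto the inferred term.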