geneontology / go-annotation

This repository hosts the tracker for issues pertaining to GO annotations.

Review annotations to GO:0005844 polysome and GO:0042788 polysomal ribosome #4750

Open raymond91125 opened 9 months ago

raymond91125 commented 9 months ago

Dear all,

A proposal has been made to obsolete GO:0005844 polysome and GO:0042788 polysomal ribosome; see https://github.com/geneontology/go-ontology/issues/24651

Experimental annotations that need to be reviewed are here: https://docs.google.com/spreadsheets/d/1c8n9SFpT2IUazGuPOJpJSWFRoXCJFay-4h_ztW2gRh4/edit#gid=0

Impacted groups:
- ~UCL 8~ DONE
- ~CAFA 2~
- ComplexPortal 1
- GeneDB 9
- ~MGI 16~ DONE
- ~PomBase 8~ DONE
- RGD 3
- ~SGD 32~ DONE
- ~TAIR 1~ DONE
- ~UniProt 60~
- ~WB 7~


Mappings that need to be reviewed: None.

Thanks.

gthayman commented 9 months ago

RGD done.

LiNiMGI commented 9 months ago

MGI done.

Antonialock commented 9 months ago

cafa done

srengel commented 9 months ago

SGD done

vanaukenk commented 9 months ago

For the WB annotations, I'm not sure it makes sense to annotate to the suggested term, ribosome, as at least some of the proteins we had annotated to 'polysome' are mRNA modifying proteins that interact with the mRNA in the context of a polysome, i.e. after translational initiation.

For these proteins I could re-annotate to cytoplasm but it seems like we're losing biologically relevant information.

Any other suggestions?

Note that we also looked at this on the 2023-09-26 annotation call, but didn't come to a resolution.

ValWood commented 9 months ago

Sorry, long answer with multiple comments.

1. I think it is OK that gene products which are part_of, or interact with, the translation machinery are only annotated to ‘cytoplasm’. This is the case for lots of translation components (tRNA synthetases, elongation factor kinases, methyltransferases, etc.).

2. We probably didn’t explain very well why these terms are going. The problems with the polysome terms are multiple. Firstly, it doesn’t really make sense to have a term like “polysomal ribosome” unless all of the ribosomal proteins are annotated to it, or a term like “polysome” unless all gene products involved in the ribosome and in translation elongation are annotated to it, and this clearly isn’t the case (for example, there are only 7 C. elegans annotations to polysome). Secondly, these terms are problematic because they necessarily include a single mRNA molecule being translated in addition to the translation machinery (I’m not sure this can be modelled sensibly in GO). Thirdly, they contain multiple elongating ribosomes (when GO terms should represent singular entities).

3. I had a quick look at the C. elegans annotation to polysome and 4 come from a single paper: https://pubmed.ncbi.nlm.nih.gov/25217583/

The main relevant parts are:

Here, we use the animal model Caenorhabditis elegans to investigate the global mechanisms of two germline-enriched cytoPAPs, GLD-2 and GLD-4, by combining polysome profiling with RNA sequencing. Our analyses suggest that GLD-2 activity mediates mRNA stability of many translationally repressed mRNAs. This correlates with a general shortening of long poly(A) tails in gld-2-compromised animals, suggesting that most if not all targets are stabilized via robust GLD-2-mediated polyadenylation. By contrast, only mild polyadenylation defects are found in gld-4-compromised animals and few mRNAs change in abundance. Interestingly, we detect a reduced number of polysomes in gld-4 mutants and GLD-4 protein co-sediments with polysomes, which together suggest that GLD-4 might stimulate or maintain translation directly. Our combined data show that distinct cytoPAPs employ different RNA-regulatory mechanisms to promote gene expression, offering new insights into translational activation of mRNAs.

For example: To further corroborate a potential role for either cytoPAP in the process of translation, we used sucrose gradient centrifugation to separate initiation from post-initiation ribonucleoprotein complexes and assessed a potential co-sedimentation of GLD-2 and GLD-4 with either fraction (Figure 6). To reveal the distribution of specific proteins across the gradient, we probed for cytoplasmic poly(A)-binding protein (PABPC), translation initiation factor 2α (eIF2α), both cytoPAPs and the GLD-4-specific cofactor, GLS-1. PABPC is part of initiation and post-initiation mRNA complexes while bound to the poly(A) tail (40).

The only mention I see of pab-1 and pab-2 seems to be: In C. elegans, two genes encode PABPC, pab-1 and pab-2 (41,42). eIF2α mediates the association of the initiator tRNA with the small ribosomal subunit and serves as a marker for translation initiation complexes (43). In C. elegans, eIF2α is encoded by Y37E3.10.

However, GLD-4 is needed for efficient polysome formation and general mRNA translation. Moreover, the majority of GLD-2 targets were low in abundance in the polysome region to begin with, arguing that the use of more sensitive techniques, such as higher resolution sucrose gradients paired with ribosome footprinting (36), might reveal a potential role of GLD-2 in stimulating mRNA translation.

So here the authors appear to be using polysome profiling as a technique to uncover the way that the annotated proteins (pab-1, pab-2, gld-4 and gls-1) are involved in regulating translation. The fractions seem to be used to analyse translational status, because polysomes do not form until after the first round of “scanning translation”. The processes that they dissect seem more important to capture from a GO perspective than the polysomal fraction.

However, the only annotations from this publication are to polysome (https://amigo.geneontology.org/amigo/reference/PMID:25217583). Probably the original curator thought the evidence was not strong enough to annotate to translation, and I would agree, but mainly based on information in other species. So in this case the annotation to ‘polysome’ is probably not so useful?

GLD-2 and GLD-4 orthologs are part of the TRAMP complex. This complex has polyadenylation activity, but it is a nuclear complex involved in mRNA surveillance: https://www.ebi.ac.uk/QuickGO/GTerm?id=GO:0031499

pab-1 and pab-2 seem to be orthologs of the major polyA binding protein, which is involved in polyA tail shortening in the cytoplasm, so these are affecting mRNA stability and hence translation (although these genes are not the major focus of this publication).

I think I would be happy with the cytoplasmic annotation here, or with dropping the annotations to see if better annotations can be made from later publications or other species. Can you make phenotype annotations in situations like this at WormBase? The phenotypes from this publication might be valuable if they are not considered convincing enough for GO annotations.

ValWood commented 9 months ago

Also @raymond91125 "ribosome" should probably only be a replacement for "polysomal ribosome" but not for "polysome"?

Antonialock commented 8 months ago

can someone update the documentation on using located_in +CC protein-containing complex? https://wiki.geneontology.org/Located_in

pgaudet commented 8 months ago

@Antonialock Is this better? https://wiki.geneontology.org/Located_in

(it's still incomplete... )

Antonialock commented 8 months ago

It's not entirely clear to me when to annotate to located_in ribosome and when to move the annotation to cytosol

Does "located_in ribosome" mean that something associates with the ribosome (as opposed to "part_of ribosome", which I guess is reserved for ribosomal gene products)?

ValWood commented 8 months ago

Personally, I would only annotate subunits which are part of the active ribosome to "ribosome", and the others to cytosol, because the context should be provided by the process annotations (or, in the cases above, they would usually be annotated to another component). Kimberly asked the same question above and I responded, but there was no follow-up to see if people agreed with this view.

Antonialock commented 8 months ago

I deleted some comments since I had missed the anatomical_entity parentage of ribosome, d'oh.

So I presume that any "actual" ribosomal proteins will be annotated to "ribosomal subunit x", and it's ok to annotate other proteins to the more general ribosome term?

pgaudet commented 8 months ago

> does "located_in ribosome" mean that something associates with the ribosome (as opposed to "part_of ribosome")?

No; this only captures a localization that has been observed but for which there is no evidence that this is where a gene product is active.

The least confusing would be to make the ribosome (and the small and large subunits) a protein-containing complex.

The ribosome is_active_in the cytosol, and therefore any other peripheral proteins are also is_active_in the cytosol.

I understand that we feel there is loss of information, since the cytosol seems pretty vague, but at the same time there are no cytosolic subcompartments, so it's as precise as it can be.

OK?

@vanaukenk Maybe it's worth going over this again on an annotation call?

Antonialock commented 8 months ago

Could translasome be a useful anatomical entity in lieu of ribosome? https://pubmed.ncbi.nlm.nih.gov/19818717/

"the translasome, which contains elongation factors, tRNA synthetases, 40S and 60S ribosomal proteins, chaperones, and the proteasome."

pgaudet commented 8 months ago

We could add that. I see two issues:

  1. This is very broad, I am not sure it's a useful CC
  2. Most importantly, since there are only 3 papers, I am worried this will be underannotated. Unless you want to annotate that big paper you cite (HTP?)

Pascale

Antonialock commented 8 months ago

No, I don't particularly want to annotate any HTP papers for UniProt, that sounds painful :-)

I thought that perhaps it could be used in lieu of annotating all those "associates with polysomes" proteins to "ribosome" (if people have hard feelings about moving the annotations to cytosol).

pgaudet commented 8 months ago

It still doesn't fit the definition of ribosome; I am not sure how this would help the user.

pgaudet commented 8 months ago

GO:0043022 ribosome binding?

ValWood commented 8 months ago

I see Antonia's point. The ribosome and the elongation factor complexes would become part_of translasome (so it would have indirect annotations), and it would provide a place to put the ribosome-associated or other translation-associated components so that they were not only "cytosol" for CC.

That could work, but I still think just cytosol is OK, because that is the correct specificity for the location.

vanaukenk commented 8 months ago

> @vanaukenk Maybe it's worth going over this again on an annotation call?

I suggest we discuss this issue on an annotation and/or ontology call. Sometime after the GOC meeting this week :-)

tberardini commented 7 months ago

TAIR done - I ended up using polysome binding, with fingers crossed that this won't be obsoleted as well (but it may be).

Antonialock commented 7 months ago

UniProt done

vanaukenk commented 6 months ago

WB is done.