geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

broaden MF terms for capping activities #26939

Open ValWood opened 7 months ago

ValWood commented 7 months ago

@pgaudet I think this is what you had in mind.

  1. GO:0140818 mRNA 5'-phosphatase activity
  2. GO:0004484 mRNA guanylyltransferase activity
  3. GO:0004482 mRNA 5'-cap (guanine-N7-)-methyltransferase activity

are the first 3 steps in GO:0006370 7-methylguanosine mRNA capping

however these enzymes are not specific for mRNA (they are also used for other pol II transcribed snRNA/snoRNA/some lncRNA, but with additional steps to make different caps)

propose broadening these to

  1. GO:0140818 RNA 5'-phosphatase activity
  2. GO:0004484 RNA guanylyltransferase activity
  3. GO:0004482 RNA 5'-cap (guanine-N7-)-methyltransferase activity

The specific pathways (which diverge after these 3 steps) will be described using the associated BP terms

i.e.
GO:0140818 RNA 5'-phosphatase activity part_of GO:0006370 7-methylguanosine mRNA capping
GO:0140818 RNA 5'-phosphatase activity part_of sno/snRNA TGS cap formation

etc
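A minimal sketch of the proposal above as plain data, in case it helps when checking the renames and the part_of links. The IDs and labels are copied from this comment; the dictionary layout and the helper name `processes_for` are assumptions made for illustration and are not part of any GO tooling. The sn/snoRNA process has no ID in this thread, so it is held by label only.

```python
# Illustrative only: IDs and labels come from the comment above.
# The layout and helper name are hypothetical, not GO tooling.

# term ID -> (current label, proposed broader label)
PROPOSED_LABELS = {
    "GO:0140818": ("mRNA 5'-phosphatase activity",
                   "RNA 5'-phosphatase activity"),
    "GO:0004484": ("mRNA guanylyltransferase activity",
                   "RNA guanylyltransferase activity"),
    "GO:0004482": ("mRNA 5'-cap (guanine-N7-)-methyltransferase activity",
                   "RNA 5'-cap (guanine-N7-)-methyltransferase activity"),
}

# MF term ID -> BP terms it would be part_of after broadening.
# Only the example given in the comment is listed.
PART_OF = {
    "GO:0140818": [
        "GO:0006370 7-methylguanosine mRNA capping",
        "sno/snRNA TGS cap formation",  # no ID given in the thread
    ],
}


def processes_for(term_id):
    """Return the BP terms a broadened MF term would be part_of."""
    return PART_OF.get(term_id, [])


for term_id, (old, new) in PROPOSED_LABELS.items():
    print(f"{term_id}: {old} -> {new}")
```

The point of the layout is that one broadened MF term can sit under several BP terms, so the pathway divergence is carried by the BP links rather than by the MF label.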

ValWood commented 7 months ago

This was agreed after discussion with @pgaudet

ValWood commented 7 months ago

@pgaudet
The EC and RHEA xrefs will now be too specific. How do I deal with that?

e.g. https://www.ebi.ac.uk/QuickGO/term/GO:0140818

the other possible issue is that the human and yeast enzymes which perform at least the first 2 steps are not orthologous, so I'm not sure if there are other species differences and the higher eukaryotic versions are specific for RNA?

pgaudet commented 7 months ago

"I'm not sure if there are other species differences and the higher eukaryotic versions are specific for RNA?"

In the first comment you write "however these enzymes are not specific for mRNA, (also used for other pol II transcribed snRNA/snoRNA/some lncRNA but with additional steps to make different caps)" >> sounds like this is not clear?

What I would do as a first step is to check which enzymes catalyze mRNA, snRNA and snoRNA dephosphorylation, and see what EC/RHEA these are annotated to.

ValWood commented 7 months ago

It's the same 3 enzymes for the first part, but the enzymes which do these steps are only described

I asked an expert about this previously: "m7G as normal (cap 0) and then tgs1 converts that to 2,2,7 tri-methylG. At least I know that for snRNAs; I presume it’s the same for snoRNAs."

So, it's well known that the same 3 enzymes form the "7-methylguanosine RNA cap" (this has to be the case because it is added to the nascent RNA, when there is no information to distinguish RNA type).

But then the sn/snoRNA is further modified to produce the 2,2,7-trimethylguanosine (TMG) cap (I think this is signalled by a stem-loop structure present in the elongating sn/snoRNA).

The problem is that these have never been annotated anywhere. GO:1990273 snRNA capping has zero annotations

There are 8 annotations to GO:0036261 7-methylguanosine cap hypermethylation, which represents only the second part of sn/snoRNA cap formation, but nobody has yet annotated the first 3 steps in this pathway to snRNA capping.

So is it better to remove RNA from the terms above or create a more general term for each?

The genes are shared except for the GO:0140818 mRNA 5'-phosphatase activity

[Screenshot 2024-02-07 at 12 12 17]

which in human is also catalysed by RNGTT

In human there is another enzyme, DUSP11, which might catalyse the first reaction (https://amigo.geneontology.org/amigo/reference/PMID:10347225), but this is really unclear because it is also annotated as a protein phosphatase.

ValWood commented 7 months ago
  1. GO:0140818 mRNA 5'-phosphatase activity
     A 5'-end triphospho-[mRNA] + H2O = a 5'-end diphospho-[mRNA] + H+ + phosphate. PMID:9473487

there is a non-mRNA specific term:

GO:0004651 | polynucleotide 5'-phosphatase activity
Catalysis of the reaction: 5'-phosphopolynucleotide + H2O = polynucleotide + phosphate.

GO:0004651 has only been used for mRNA 5'-phosphatase activity (8 experimental annotations):
S. cerevisiae CET1 (2)
S. cerevisiae CTL1 (1)
C. albicans CET1 (2)
C. elegans cel-1 (2)
human DUSP11 (1)

However, 1. GO:0004651 | polynucleotide 5'-phosphatase activity is not a parent of GO:0140818 mRNA 5'-phosphatase activity

and 2. The Rhea reaction doesn't match the definition:

[Screenshot 2024-02-07 at 12 09 55]

because the reaction is for mRNA

SUGGEST

  1. standardize defs
  2. Ask Rhea to make the reaction for GO:0004651 | polynucleotide 5'-phosphatase activity non-mRNA-specific?
  3. make GO:0004651 | polynucleotide 5'-phosphatase activity a parent of GO:0140818 mRNA 5'-phosphatase activity

@pgaudet
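A rough sketch of the parentage point in suggestion 3, assuming a toy is_a map that models only the one edge under discussion (the real terms have other parents, which are left out here). The function name `ancestors` and both dictionaries are illustrative and do not come from any GO library.

```python
# Toy model: only the edge discussed above is represented.
# Real parents of these terms are deliberately omitted.

def ancestors(term, is_a):
    """Collect every term reachable from `term` by following is_a edges."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in is_a.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# As described in the comment: GO:0004651 is not a parent of GO:0140818.
IS_A_NOW = {"GO:0140818": []}

# After suggestion 3: GO:0004651 becomes a parent of GO:0140818.
IS_A_SUGGESTED = {"GO:0140818": ["GO:0004651"]}

print("GO:0004651" in ancestors("GO:0140818", IS_A_NOW))        # False
print("GO:0004651" in ancestors("GO:0140818", IS_A_SUGGESTED))  # True
```

With the suggested edge in place, a query on the general term would also pick up annotations made to the mRNA-specific child, which is what makes the shared definition and Rhea mapping matter.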

ValWood commented 7 months ago
  1. mRNA guanylyltransferase activity (GO:0004484) is defined:
     Catalysis of the reaction: GTP + (5')pp-Pur-mRNA = diphosphate + G(5')ppp-Pur-mRNA; G(5')ppp-Pur-mRNA is mRNA containing a guanosine residue linked 5' through three phosphates to the 5' position of the terminal residue.

There is a more general term which could be used for snoRNA/snRNA:
RNA guanylyltransferase activity (GO:0008192)
Catalysis of the posttranscriptional addition of a guanyl residue to the 5' end of an RNA molecule.

but this would be confusing because it does not specify the 5'5' linkage

The disadvantage of this term is that it is also a parent of tRNA guanylyltransferase activity (GO:0008193), which is not a 5'-5' linkage (this seems to be just adding a base), so it does not group the pol II related activities.

I could add a new parent term:
RNA 5'-5'-guanylyltransferase activity (GO:0008192)
Catalysis of the reaction: GTP + (5')pp-Pur-RNA = diphosphate + G(5')ppp-Pur-RNA; G(5')ppp-Pur-RNA is RNA containing a guanosine residue linked 5' through three phosphates to the 5' position of the terminal residue. (This activity is part of pol II-dependent RNA capping of mRNA, snRNA, snoRNA and some lncRNAs including telomerase.)

This would provide a grouping term for the pol II associated activities (mRNA, snoRNA, snRNA, telomerase). Or, do I need to add these as children explicitly? It seems unnecessary when it is all the same enzyme?

@pgaudet
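To illustrate the grouping question, here is a small sketch under stated assumptions: it models only the terms named in this comment, it assumes GO:0004484 currently sits under GO:0008192 (implied by "also a parent"), and it places the proposed term under GO:0008192 as one possible arrangement. `NEW:RNA-5-5-GT` is a placeholder, not a real GO identifier, and `descendants` is a hypothetical helper.

```python
# Toy child maps restricted to terms named in this thread.
# "NEW:RNA-5-5-GT" is a placeholder for the proposed 5'-5' grouping term.

CHILDREN_NOW = {
    "GO:0008192": ["GO:0004484", "GO:0008193"],  # mRNA and tRNA terms
}

CHILDREN_WITH_GROUPING = {
    "GO:0008192": ["NEW:RNA-5-5-GT", "GO:0008193"],
    "NEW:RNA-5-5-GT": ["GO:0004484"],
}


def descendants(term, children):
    """Collect every term below `term` in the given child map."""
    found = set()
    stack = [term]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


# The broad term pulls in the tRNA activity, so it cannot serve as the
# pol II capping group; the placeholder term excludes it.
print(descendants("GO:0008192", CHILDREN_NOW))
print(descendants("NEW:RNA-5-5-GT", CHILDREN_WITH_GROUPING))
```

Whether snRNA/snoRNA children are added explicitly or annotations go straight to the grouping term, the tRNA activity stays outside the group in this arrangement.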

ValWood commented 7 months ago

Rhea 67012 add GO:0008192

delete Rhea on child (will be merged)

add xref / add SKOS narrow match

make parent definition include snRNA/snoRNA/mRNA