Closed: colinlog closed this issue 1 year ago
Do we need such a grouping term?
These phosphorylation events (kinases) can just be coupled to their substrates, and then made part_of the processes that they regulate (initiation, elongation, termination, etc.).
This term seems like an extra level of classification?
I do have a question, though: should these be part_of or "regulation of" the respective processes? I have always been a bit unclear on this. I have used "regulation", but it depends on where we define the start of initiation etc. (and "regulation" seems odd, as we decided earlier that these are "general initiation factors").
Regulation: Agreed, not regulatory as such. These are GTFs that are themselves regulated by dbTFs (GO:0003700 and descendants) and/or coTFs (GO:0003712 and descendants). However, note that CDK8/cyclinC CTD kinase is part of a coTF protein complex (the mediator), which is a complex that is activated by dbTFs, but that is not an issue, is it?
Part_of not good, has_part always works: The BP transcription initiation by RNA polymerase II (GO:0006367) ALWAYS has_part the MF RNA polymerase II C-terminal domain S5 kinase activity (GO:0140836). As discussed with Pascale, the other relation, namely CTD-S5 kinase activity part_of BP initiation, is not always true. This is because there are combinatorial modification aspects (e.g. S7p is needed to make S5p efficiently) to the RNAP2 large subunit heptad repeat code, which controls the RNA polymerase II cycle from initiation to termination and re-initiation by the enzyme at another promoter.
Extra level: The CTD code is different from the histone code or from activating cascades because it involves 52 repeats in man (26 in yeast). Hence, this code distinguishes itself from all the other PTM-driven codes because the mechanisms underlying it are not only combinatorial at the level of one heptad repeat, they are also combinatorial at the level of the 52 repeats. Hence, I would think that the CTD modification functions (Y1 kinase, S2 kinase, P3 isomerase, T4 kinase, S5 kinase, S5 O-GlcNAc, P6 isomerase, S7 kinase, S7 O-GlcNAc) should have is_a relations to a parent MF term that indicates that this concerns the CTD (and not just its phosphorylation, as there are proline isomerisation and serine O-GlcNAc activities), for example RNA polymerase II large subunit carboxy-terminal domain (CTD) modification. Alternatively, we could house these activities for each CTD residue under the existing BP GO:0006366 transcription by RNA polymerase II with the relation has_part. I would like to argue that we can do both: a has_part to the BP and an is_a to the proposed chapeau MF for CTD modification activities.
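To make the "combinatorial at two levels" argument concrete, here is a minimal Python sketch that counts modification states per heptad and across the whole CTD. It is only an upper bound under loud assumptions that are not claims from this thread: each position is treated as independent, phosphorylation and O-GlcNAc on the same serine are treated as mutually exclusive, dependencies such as S7p being needed for S5p are ignored, and variant (non-consensus) repeats are ignored. The names `STATES_BY_POSITION`, `states_per_repeat` and `states_for_ctd` are made up for illustration.

```python
from math import prod

CONSENSUS_HEPTAD = "YSPTSPS"  # positions 1..7 of the consensus repeat

# Alternative states per position, based only on the activities listed above.
# Assumption: states at one position are mutually exclusive, positions are independent.
STATES_BY_POSITION = {
    1: ("unmodified", "phospho"),              # Y1 kinase
    2: ("unmodified", "phospho"),              # S2 kinase
    3: ("trans", "cis"),                       # P3 isomerase
    4: ("unmodified", "phospho"),              # T4 kinase
    5: ("unmodified", "phospho", "O-GlcNAc"),  # S5 kinase, S5 O-GlcNAc
    6: ("trans", "cis"),                       # P6 isomerase
    7: ("unmodified", "phospho", "O-GlcNAc"),  # S7 kinase, S7 O-GlcNAc
}


def states_per_repeat(states=STATES_BY_POSITION):
    """Number of distinct modification patterns on a single heptad repeat."""
    return prod(len(options) for options in states.values())


def states_for_ctd(n_repeats, states=STATES_BY_POSITION):
    """Upper bound on patterns across n_repeats independent consensus repeats."""
    return states_per_repeat(states) ** n_repeats


if __name__ == "__main__":
    print("patterns per heptad:", states_per_repeat())
    print("digits in the 52-repeat upper bound:", len(str(states_for_ctd(52))))
    print("digits in the 26-repeat upper bound:", len(str(states_for_ctd(26))))
```

Under these assumptions a single heptad already has 288 possible patterns, and the count grows exponentially with the number of repeats, which is the sense in which the CTD code differs from a single-site PTM code.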
Not yet dealt with: There are variant repeats that bear an R residue at position 1 that can be methylated, and others that bear a K residue at position 7 that can be methylated, acetylated or ubiquitylated. These appear to be involved in "snRNA and snoRNA regulation, R-loop resolution and transcription termination [R1me]", or a modification that "Supports nucleosome occupancy at promoters; negatively regulates gene expression [K7me]", or "Induction of growth-factor response genes, transcription elongation; maintains balance between Lys methylation and acetylation and affects mRNA expression levels [K7ac]", or that directs "RPB1 degradation [K7ubi]".
All this is very well summarized in PMID:28248323 (a table of repeats, a table of modifications, a figure of where the CTD modifications are found along the genes, and references to the enzymes/complexes that are known to perform the modifications, as well as to readers of the modifications on the CTD).
The proposal is to create a new term, 'RNA polymerase large subunit C-terminal domain modifying activity', to group all the activities. (We will do the same for histone modifiers.) This is an unusual way to group terms, but otherwise these terms are not easy to find.
Thanks, Pascale
Done in #24980
There may perhaps be a need for an overarching BP term somewhere between the BPs GO:0016070 RNA metabolic process and GO:0006366 transcription by RNA polymerase II that can house the 'RNA polymerase II cycle', which molecularly consists of post-translational modifications of the YSPTSPS heptad at every residue (phosphorylation of Y1, S2, T4, S5, S7) and proline isomerisation (P3, P6). It is a process because, like the histone code, there are non-polymerase subunit proteins that write this code (many CDKs) and other proteins that read the CTD modification code to (i) regulate RNA polymerase II (sub)processes such as transcription initiation, promoter clearance, elongation, termination and recycling and (ii) help the maturation or processing of the nascent RNA. Furthermore, there are non-consensus heptad repeats that are important for snRNA and snoRNA regulation, R-loop resolution and transcription termination (a nice review is PMID:28248323).
Suggested term label: RNA polymerase large subunit CTD code
Definition (free text) The process of serial modification of CTD heptad repeats that drives RNA polymerase II promoter recruitment, transcription initiation, promoter clearance, elongation, termination and also specifically controls some ncRNA production
Reference, in format PMID: 28248323
Gene product name and ID to be annotated to this term
For human it is POLR2A; for budding yeast, RPB1
Parent term(s) probably GO:0006366 transcription by RNA polymerase II, or it may be a parent of this, as transcription is preceded by promoter binding and succeeded by post-termination events? The RNA polymerase II cycle / CTD code would encompass more than transcription proper.
Children terms (if applicable) Should any existing terms be moved underneath this new proposed term?
Existing children: the MFs for the modifications? RNA polymerase transcription and all its children that are known to depend on the CTD code? Recruitment of RNA capping and RNA splicing factors is also a result of the CTD code!
A new child BP would be the variant / non-consensus repeat modification code?
Synonyms (please specify, EXACT, BROAD, NARROW or RELATED)
[RNA polymerase II transcription cycle]
Cross-references
For enzymes, please provide RHEA and/or EC numbers.
Can also provide MetaCyc, KEGG, Wikipedia, and other links.
Any other information
The molecular functions for the enzymatic modification of every amino acid in the heptad repeat have been created and these should have part_of relations to this new BP.
One open question might be whether this should be a 'regulates' branch or really a BP?
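The relations proposed in this thread (is_a from each CTD-modifying MF to the grouping MF, and part_of from each MF to the new BP) can be sketched as a small edge list. This is only an illustration in Python: the activity labels are shorthand taken from the list earlier in the thread, not official GO labels, no GO IDs are implied, and the helper `children` is a made-up name. The relation is stored as a plain string so that part_of could be swapped for has_part or regulates, since that choice is still open.

```python
# Term labels taken from the proposals in this thread; not official GO records.
GROUPING_MF = "RNA polymerase large subunit C-terminal domain modifying activity"
PROPOSED_BP = "RNA polymerase large subunit CTD code"

# Shorthand labels for the per-residue activities (illustrative, not GO labels).
CTD_ACTIVITIES = [
    "Y1 kinase", "S2 kinase", "P3 isomerase", "T4 kinase", "S5 kinase",
    "S5 O-GlcNAc", "P6 isomerase", "S7 kinase", "S7 O-GlcNAc",
]

# Each edge is (child, relation, parent).
EDGES = []
for activity in CTD_ACTIVITIES:
    EDGES.append((activity, "is_a", GROUPING_MF))     # grouping, so terms are easy to find
    EDGES.append((activity, "part_of", PROPOSED_BP))  # relation type still under discussion


def children(edges, parent, relation):
    """Return the sorted child terms linked to `parent` by `relation`."""
    return sorted(c for c, r, p in edges if p == parent and r == relation)


if __name__ == "__main__":
    for label in children(EDGES, GROUPING_MF, "is_a"):
        print(label)
```

Querying the grouping term by is_a returns all nine activities in one step, which is the findability benefit the grouping term is meant to provide.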