Closed ValWood closed 1 year ago
~I wonder if it's the same as~
@krchristie
Thanks, Pascale
@colinlog : No, it's not the same
Without looking into the details, it sounds like one of those situations where the core FUNCTION is conserved but, depending on taxon, tissue, cellular circumstances etc., the composition is variable. We have curated the yeast complex: https://www.ebi.ac.uk/complexportal/complex/CPX-2662 and annotated to GO:0005665 DNA-directed RNA polymerase II, core complex https://www.ebi.ac.uk/QuickGO/term/GO:0005665
while a lot of the general TFs and core mediator (in several species) are annotated to: GO:0016591 DNA-directed RNA polymerase II, holoenzyme https://www.ebi.ac.uk/QuickGO/term/GO:0016591
We have no annotations to GO:0097550 transcriptional preinitiation complex https://www.ebi.ac.uk/QuickGO/term/GO:0097550
The yeast RNAPII refers to the PIC in the description
During a transcription cycle, Pol II, general transcription factors and the mediator complex (CPX-3226) assemble as the preinitiation complex (PIC) at the promoter.
but we didn't curate the PIC itself because it is difficult to define as RNAPII, general TFs and core mediator come and go during transcription initiation.
There are only 10 EXP annotations in total to GO:0097550 transcriptional preinitiation complex: 5 CAFA, 4 SGD, 1 UniProt. It seems sensible to get rid of this ....
OK
while a lot of the general TFs and core mediator (in several species) are annotated to: GO:0016591 DNA-directed RNA polymerase II, holoenzyme https://www.ebi.ac.uk/QuickGO/term/GO:0016591
Sorry, they are annotated to the specific CHILDREN of this term! e.g. TFIIIB complex, core mediator complex...
According to Wikipedia (https://en.wikipedia.org/wiki/Transcription_preinitiation_complex), the PIC is like the holoenzyme (GO:0016591)? The definition is "A nuclear DNA-directed RNA polymerase complex containing an RNA polymerase II core enzyme as well as additional proteins and transcription factor complexes, that are capable of promoter recognition and transcription initiation from an RNA polymerase II promoter in vivo. These additional components may include general transcription factor complexes TFIIA, TFIID, TFIIE, TFIIF, or TFIIH, as well as Mediator, SWI/SNF, GCN5, or SRBs and confer the ability to recognize promoters."
That does sound like the holoenzyme and PIC are regarded as the same thing. In which case it's a grouping term for the specific TFs, Pols and mediator etc.
OK so I will merge rather than obsolete.
@ValWood @krchristie OK ?
That sounds sensible if Karen agrees.
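To make the merge-versus-obsolete choice concrete, here is a minimal Python sketch of what each option would mean for existing annotations. It is only an illustration under stated assumptions: the gene names (`geneA` etc.) and the helper functions `merge_term` and `obsolete_term` are hypothetical, and this is not how the GO editing tools implement either operation.

```python
# Toy illustration: how a merge and an obsoletion differ for existing annotations.
# Gene names are placeholders (assumption), not real annotation data.

PIC = "GO:0097550"         # transcriptional preinitiation complex
HOLOENZYME = "GO:0016591"  # DNA-directed RNA polymerase II, holoenzyme


def merge_term(annotations, source_id, target_id):
    """Merging keeps every annotation and re-points it at the surviving term."""
    return [(gene, target_id if term == source_id else term)
            for gene, term in annotations]


def obsolete_term(annotations, obsolete_id):
    """Obsoleting leaves annotations without a term: split them out for review."""
    kept = [(g, t) for g, t in annotations if t != obsolete_id]
    to_review = [(g, t) for g, t in annotations if t == obsolete_id]
    return kept, to_review


annotations = [("geneA", PIC), ("geneB", HOLOENZYME), ("geneC", PIC)]

merged = merge_term(annotations, PIC, HOLOENZYME)
kept, to_review = obsolete_term(annotations, PIC)

print("after merge:   ", merged)
print("after obsolete:", kept, "| needs re-annotation:", to_review)
```

The point of the sketch is that a merge silently carries every annotation over to the target term, so it is only safe if those annotations are actually correct for the target.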
Another question: The term has subclasses:
'protein-containing complex' and ('capable of' some 'DNA-directed 5'-3' RNA polymerase activity')
however the definition really talks about 'transcription initiation', so I think these relations should go.
(Edited to add): Based on the definitions, the 'DNA-directed RNA polymerase II, core complex' should have the 'capable of' some 'DNA-directed 5'-3' RNA polymerase activity' ("... Although the core is competent to mediate ribonucleic acid synthesis, it requires additional factors to select the appropriate template.")
@ValWood @krchristie what do you think ?
A second question: DNA-directed RNA polymerase II, holoenzyme is a subclass of 'nucleoplasm'; that also seems wrong, given that it's a complex with the DNA.
Should it be part of chromatin ?
Thanks, Pascale
"DNA-directed RNA polymerase II, holoenzyme" is NOT necessarily bound to DNA. In fact it has been purified as a large protein complex not attached to DNA many, many different times. There have also been models that propose that the holoenzyme preassembles before binding to DNA, so, unless you have new evidence that disproves that idea, I think that it is not OK to specify that the RNAP II holoenzyme is bound to DNA.
Also worth being aware that there is a long-standing issue about the fact that the composition of "the" holoenzyme appears to be incredibly variable. Thus, it is not a single complex, but rather any of a number of complexes that contain RNAP II and some number of other factors that are required to bring it to the promoter.
Based on the definitions, the 'DNA-directed RNA polymerase II, core complex' should have the 'capable of' some 'DNA-directed 5'-3' RNA polymerase activity' ("... Although the core is competent to mediate ribonucleic acid synthesis, it requires additional factors to select the appropriate template.")
I'm inclined to agree that
"DNA-directed RNA polymerase II, holoenzyme" is NOT necessarily bound to DNA.
The same is true for many complexes, but building the location of activity into GO seems fine to me? I don't have a problem with any nuclear transcription factor complexes being a child of "chromatin".
I think this is very dangerous. The ontology is meant to represent universals. It seems 'part of' is not the right relationship for what you want. It is universally true that the complex is always part of the nucleoplasm.
Is that what we always do though?
Maybe we do. Nearly. There are no complexes under chromatin, except "nucleosome"
Because nucleosomes are always part of chromatin.
So there are no free nucleosomes? OK... Makes sense...
Nucleosome: A complex comprised of DNA wound around a multisubunit core and associated proteins, which forms the primary packing unit of DNA into higher order structures.
I was about to say the same thing as @ukemi
OK!
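The "universals" argument can be sketched in a few lines of Python: a 'part of' axiom on a class is a claim about every instance, so a single free instance is enough to break it. The observations below are illustrative placeholders that mirror statements made in this thread (nucleosomes always in chromatin, holoenzyme purified free of DNA); they are assumptions for the sketch, not curated data, and `holds_universally` is a hypothetical helper.

```python
# Toy check: does "part_of some <location>" hold for every observed instance?
# Each set lists the locations one observed instance of the complex is part of.
observations = {
    "nucleosome": [{"nucleus", "chromatin"}, {"nucleus", "chromatin"}],
    # One instance purified free of DNA, one bound at a promoter (assumption).
    "RNAP II holoenzyme": [{"nucleus"}, {"nucleus", "chromatin"}],
}


def holds_universally(complex_name, location):
    """True only if every observed instance is part of the given location."""
    return all(location in locs for locs in observations[complex_name])


for name in observations:
    for location in ("chromatin", "nucleus"):
        print(f"{name} part_of some {location}: {holds_universally(name, location)}")
```

Under these toy observations only the 'nucleus' parentage survives for the holoenzyme, which is the same conclusion as "safest seems to be 'nuclear' for all".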
'nuclear transcription factor complex' is 'part of' some 'nucleus'.
We should do the same with the others (GTFs), I think.
Well - DNA-directed pol II and IV complexes are 'part of the nucleoplasm', which excludes the chromosomes. Is this right ?
Isn't it the case that these proteins are active at the chromatin but can also be found in free form? Aren't we capturing the active form ?
(Safest seems to be 'nuclear' for all).
Pascale
Current structure is: 'nuclear DNA-directed RNA polymerase complex'
If I move 'alpha DNA polymerase:primase complex' out (see #15977) I can add 'capable of' some 'DNA-directed 5'-3' RNA polymerase activity' to the parent 'nuclear DNA-directed RNA polymerase complex' (and since the core complex is part_of the holoenzyme, the core complex anyway inherits the activity).
Pascale
Safest seems to be 'nuclear' for all.
I think so.
If I move 'alpha DNA polymerase:primase complex' I can add 'capable of' some 'DNA-directed 5'-3' RNA polymerase activity' to the parent 'nuclear DNA-directed RNA polymerase complex'
GO:0055029 nuclear DNA-directed RNA polymerase complex "A protein complex, located in the nucleus, that possesses DNA-directed RNA polymerase activity."
That holds true for 'alpha DNA polymerase:primase complex' so can't move unless you make the def of 'nuclear DNA-directed RNA polymerase complex' stricter and exclude synthesis of short RNA strand.
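A small Python sketch of why the placement matters: a 'capable of' relation added to a parent term is inherited by every is_a child, so it would also be asserted for the primase complex for as long as that term stays under the parent. The two-child hierarchy below is an assumption made for illustration (it is not the full GO branch), and `inferred_capabilities` is a hypothetical helper, not a reasoner.

```python
# Toy is_a hierarchy (assumption: only two children shown).
PARENT = "nuclear DNA-directed RNA polymerase complex"
ACTIVITY = "DNA-directed 5'-3' RNA polymerase activity"

IS_A = {
    "DNA-directed RNA polymerase II, holoenzyme": PARENT,
    "alpha DNA polymerase:primase complex": PARENT,
}

# The relation under discussion, asserted on the parent only.
CAPABLE_OF = {PARENT: {ACTIVITY}}


def inferred_capabilities(term):
    """Collect 'capable of' activities asserted on the term or any is_a ancestor."""
    caps = set()
    while term is not None:
        caps |= CAPABLE_OF.get(term, set())
        term = IS_A.get(term)
    return caps


for child in IS_A:
    print(child, "->", inferred_capabilities(child))
```

In this sketch both children pick up the activity, so the choice is between moving the primase complex out first or accepting that the relation applies to it as well.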
and since the core complex is part_of the holoenzyme, the core complex anyway inherits the activity.
Aren't we merging them? Or just holoenzyme and PIC? I don't like the core vs holo- distinction as it has a fluid boundary and is used mainly by experimentalists to distinguish between persistently-found proteins and those that come and go depending on the cellular circumstances. Often you don't find the core complex on its own, at least not as the functional unit.
These labels should be simplified from 'DNA-directed RNA polymerase (...)', to
'RNA polymerase I complex' 'RNA polymerase II, core complex' 'RNA polymerase II, holoenzyme' 'RNA polymerase III complex' 'RNA polymerase IV complex' 'RNA polymerase V complex' (since the RNA-directed RNA polymerases are called 'RNA-directed RNA polymerases', and not I, II, III, IV, V, there is no risk of confusion)
(Done in #15980)
@bmeldal So you suggest I do a 3-way merge between PIC, core and holoenzyme ? (that works for me)
It's safer as it reflects biology better, I don't know what the others think???
(I've come across quite a few examples of that 'behaviour' when curating poorly-defined epigenetic complexes. They seem to be poorly defined as they a) are difficult to purify and b) vary depending on cellular circumstances. And then there are the technical artifacts to consider - just because you don't see one of the proteins one day doesn't mean it's not there, it just didn't get pulled down under the conditions. Change the conditions and you get a slightly different complex... And then you get the complexes defined on what has been detected on a Western based on expected members! No effort to identify ALL potential members but drawing some strong conclusions about the complex composition. Nightmare!!!)
Reflecting the biology is great !!
Reflecting the biology is great !!
Isn't that our job?
OK, 3-way merge it is, and the label will be 'DNA-directed RNA polymerase II'. Let me know if that's not right.
In fact I think the holoenzyme should be obsoleted (as @ukemi also suggested in #10556). We cannot be sure of whether the proteins annotated to the holoenzyme are part of PolII.
There are
(9) PomBase @ValWood (9) SGD @krchristie @suzialeksander (9) UniProt @ggeorghiou @sylvainpoux (1) TAIR @tberardini
Let me know if a merge would be OK.
Thanks, Pascale
Can you summarize, I got a bit lost.
You aren't doing anything with mediator are you?
I agree that we don't need "holoenzyme"
I think mediator needs to be moved out from RNA pol II complex, see #15979
This is what I would prefer
keep
DNA-directed RNA polymerase II, core complex https://www.pombase.org/term/GO:0005665
mediator https://www.pombase.org/term/GO:0016592 (I don't see this as part of any polymerase term)
get rid of DNA-directed RNA polymerase II, holoenzyme (GO:0016591) https://www.pombase.org/term/GO:0016591
"DNA-directed RNA polymerase II, core complex" --> "DNA-directed RNA polymerase II" (remove "core")
Mediator and integrator complexes are not a type of RNAPII, they interact with it: https://github.com/geneontology/go-ontology/issues/15979
"DNA-directed RNA polymerase II, core complex" --> "DNA-directed RNA polymerase II" (remove "core")
is only used for the polymerase itself. There shouldn't be any mediator subunits annotated to this term.
I don't think you can merge DNA-directed RNA polymerase II, holoenzyme (GO:0016591) into any existing term.
Most of our annotations to this term would not fit the existing subcomplexes. I will fix them. I think everyone should check, it's been used somewhat loosely....
pombase/curation#2067
Ok, I guess it needs an annotation revision ticket, @pgaudet
Discussing with @ValWood In fact several annotations to 'transcriptional preinitiation complex' are incorrect, so we will propose to obsolete.
Core RNAP II versus holoenzyme does not have a fluid boundary. Core is very simply defined as the 12-subunit enzyme. It is definitely NOT equivalent to either the PIC or the holoenzyme. I think it would be a truly bad idea to merge core and holoenzyme. The core term has a very useful role of defining the composition of the RNAP II enzyme itself.
Holoenzyme has multiple different compositions, which all include RNAP II core, as well as a varying composition of other transcription complexes. I have proposed obsoleting it previously due to the inability to define it precisely. However, this has been strenuously objected to by multiple people because the phrase "holoenzyme" is used frequently in the literature. @RLovering may have an opinion on this.
I would think that incorrect annotations to a term would be justification for fixing the annotations, not necessarily for obsoleting the term.
Just because experimentalists call things core and holo doesn't mean it reflects biology. In epigenetic complexes they call the catalytic core components "core" as they are always there and necessary but they never act on their own in the cell (they may be functional on their own in vitro).
We should describe what's happening in the cell.
The mediator is different, the CDK subcomplex does come and go and gives the mediator a different function.
Core RNAP II versus holoenzyme does not have a fluid boundary. Core is very simply defined as the 12 subunit enzyme.
Not according to the GO def: " RNA polymerase II, one of three nuclear DNA-directed RNA polymerases found in all eukaryotes, is a multisubunit complex; typically it produces mRNAs, snoRNAs, and some of the snRNAs. Two large subunits comprise the most conserved portion including the catalytic site and share similarity with other eukaryotic and bacterial multisubunit RNA polymerases. The largest subunit of RNA polymerase II contains an essential carboxyl-terminal domain (CTD) composed of a variable number of heptapeptide repeats (YSPTSPS). The remainder of the complex is composed of smaller subunits (generally ten or more), some of which are also found in RNA polymerases I and III. Although the core is competent to mediate ribonucleic acid synthesis, it requires additional factors to select the appropriate template."
No list of core subunits (in bold) and a clear comment that it requires additional factors for its full activity (in bold).
Therefore, "holoenzyme" subunits can be annotated to this term.
And PIC is very badly defined:
GO:0097550 transcriptional preinitiation complex "A protein-DNA complex composed of proteins binding promoter DNA to form the transcriptional preinitiation complex (PIC), the formation of which is a prerequisite for transcription."
As mentioned above, we didn't curate the PIC as it's impossible to define.
That sentence you have highlighted in the definition above does NOT mean that holoenzyme components should be annotated to the term for core RNAP II.
It refers to the fact that RNAP II is competent for the FUNCTION of the enzyme activity, but that other things are required for the PROCESS of transcription.
It becomes very problematic to have all the things that have been annotated to holoenzyme start getting annotated to something that is 'capable of' RNA polymerase activity, because NONE of these other things are capable of this activity, nor do they contribute to the activity. Rather, they act to bring the enzyme to the correct place.
What I proposed in the end is to
As the exchange shows, PIC and holoenzyme are not clear complexes. I read the definition for core the same as @krchristie , but if you have suggestions to clarify the meaning @bmeldal , please send them (perhaps the sentence in bold is actually more confusing than helpful).
Thanks, Pascale
Emailed Colin, Astrid, Marcio, Ruth and Val.
"Although the core is competent to mediate ribonucleic acid synthesis, it requires additional factors to select the appropriate template."
It refers to the fact that RNAP II is competent for the FUNCTION of the enzyme activity, but that other things are required for the PROCESS of transcription.
As the exchange shows, PIC and holoenzyme are not clear complexes. I read the definition for core the same as @krchristie , but if you have suggestions to clarify the meaning @bmeldal , please send them (perhaps the sentence in bold is actually more confusing than helpful).
I read it as referring to the COMPOSITION as we are discussing the COMPONENT term. Where did I go wrong?
It becomes very problematic to have all the things that have been annotated to holoenzyme start getting annotated to something that is 'capable of' RNA polymerase activity, because NONE of these other things are capable of this activity, nor do they contribute to the activity. Rather, they act to bring the enzyme to the correct place.
I would never suggest to annotate the non-catalytic components to RNAP ACTIVITY but we are discussing the COMPONENT term here. Which complex COMPONENT term will you annotate the old holoenzyme and PIC components to now? If they are chaperones etc they should never have been annotated to a complex term anyway. But how do you annotate components that are truly part of the RNAPII complex but not part of the conserved core and not a chaperone? @ValWood?
Which complex COMPONENT term will you annotate the old holoenzyme and PIC components to now? If they are chaperones etc they should never have been annotated to a complex term anyway. But how do you annotate components that are truly part of the RNAPII complex but not part of the conserved core and not a chaperone?
I don't currently have any examples which would not be annotated to one of the subcomplexes, with the exception of the CTD phosphatase fcp1. I would be happy for this not to have a holoenzyme annotation since it is connected via substrate "rpb1". This phosphatase may be more promiscuous (I don't know, but it relocalizes to the cytosol during hypoxia).
I'm a bit ambivalent about whether we keep the term or not. If people think we should have it as a grouping term, I'm happy for it to stay... but we need to be precise about what would be annotated to it.
This is what I have https://www.pombase.org/term/GO:0016591
It might be useful to know what would not be annotated without the "holoenzyme" term. Then we would know if we needed it.
transcriptional preinitiation complex A protein-DNA complex composed of proteins binding promoter DNA to form the transcriptional preinitiation complex (PIC), the formation of which is a prerequisite for transcription. PMID:22751016
has no children and only 704 annotations in total (10 experimental SGD and CAFA)
SGD annotations are SUA7 (TFIIB), TFA1 (TFIIE), TFA2 (TFIIE), SSL2 (TFIIH)
There must be an issue with the definition/placement (should it have children? should it exist?)