geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

Transcription factor refactoring: redefine GO:0003712 transcription cofactor activity #15566

Closed pgaudet closed 6 years ago

pgaudet commented 6 years ago

Following up on #15536

The current definition of 'GO:0003712 transcription cofactor activity' states that 'cofactors mediate protein-protein interactions between regulatory transcription factors and the basal transcription machinery'; however, more recent literature shows that cofactors 'act through bringing enzymatic activities to the locality of the site of their recruitment to the genome. These enzymes either covalently modify chromatin, or result in chromatin reorganisation'; see https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3963257/


and from https://github.com/geneontology/go-annotation/issues/1927

Which I [ @RLovering ] added last week: Specifically:

I am not happy with the definition for one of the recommended terms : GO:0001104 RNA polymerase II transcription cofactor activity

I am not willing to make the assumption that a co-factor TF that binds a specific TF will also bind a basal TF.

Basically the expts show a regulatory TF (i.e. cofactor) binding a specific TF, but no evidence for the cofactor binding to basal TFs. As there could be additional proteins in this complex I am not willing to use the term: GO:0001104 RNA polymerase II transcription cofactor activity.

Ruth


Current definition: Interacting selectively and non-covalently with a regulatory transcription factor and also with the basal transcription machinery in order to modulate transcription. Cofactors generally do not bind the template nucleic acid, but rather mediate protein-protein interactions between regulatory transcription factors and chromatin-modifying factors and/or the basal transcription machinery.

Proposed new definition: A protein that contributes to making the chromatin more or less accessible for transcription. Transcription coregulators are recruited to specific genomic elements by gene-specific transcription factors, by chromatin modifications, by DNA, and in some cases by regulatory RNAs. For example, one class of transcription coregulators modifies chromatin structure through covalent modification of histones. A second ATP dependent class modifies the conformation of chromatin. Transcription coregulators are characterized by a longer-lasting interaction with the chromatin, compared to that of gene-specific transcription factors.

https://en.wikipedia.org/wiki/Transcription_coregulator PMID:25957681

pgaudet commented 6 years ago

@krchristie @RLovering @ValWood

tagging you for feedback on the proposed definition above.

RLovering commented 6 years ago

Hi

I think this definition will make me more confident in applying it in an annotation. The child terms will also help people with making decisions about the application of this term too.

However, I don't think you should include the time aspect. How does a curator know this information? How do they know how long specific TFs are associated with chromatin? Plus, I think the statement 'interaction with the chromatin' is problematic, as we have decided that these will be part of the chromatin, not binding the chromatin.

So when you discuss TFs etc please use the phrase 'association with the chromatin' rather than 'interaction with the chromatin'

Thanks Ruth

pgaudet commented 6 years ago

New proposed definition: A protein that contributes to making the chromatin more or less accessible for transcription. Transcription coregulators are recruited to specific genomic elements by gene-specific transcription factors, by chromatin modifications, by DNA, and in some cases by regulatory RNAs. For example, one class of transcription coregulators modifies chromatin structure through covalent modification of histones. A second ATP dependent class modifies the conformation of chromatin.


OK ? @RLovering @krchristie @ValWood

Thanks, Pascale

ValWood commented 6 years ago

Hi, This is now confusing. It increases the scope of what we previously annotated to "GO:0003712 transcription cofactor activity" (although I'm not totally certain where the boundary is, below are some examples of things which would be included in the revised definition).

The new definition does not exclude any chromatin modifier involved in transcription. This would include things already annotated with other activities. For example: https://www.pombase.org/gene/SPBC36.05c clr6, a histone deacetylase; https://www.pombase.org/gene/SPCC1919.15 brl1, a ubiquitin ligase; https://www.pombase.org/gene/SPAC139.06 hat1, a silencing factor; or the entire RNAi machinery.

I only pulled out a few here, but if we consistently annotated these to this term, many GPs would have 2 different annotations describing the same function. I think that minimally the GO:0003712 transcription cofactor activity term needs to describe something that is closely associated with the RNA polymerase machinery, and not just generally involved in chromatin accessibility (otherwise it just means any gene product involved in transcription?).

pgaudet commented 6 years ago

Hi @ValWood According to PMID:25957681, the histone deacetylase would certainly be a co-factor. Those are closely associated with the RNA polymerase machinery (at least as much as what they call the 'gene-specific transcription factors'; see screenshot).

[image: screenshot]

Histone deacetylase could probably be a child of transcription co-factor - would that help ? In this case you wouldn't need to make 2 annotations.

Thanks, Pascale
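The point about not needing two annotations can be illustrated with a minimal sketch. It assumes a toy is_a hierarchy in which 'histone deacetylase activity' has been placed under 'transcription cofactor activity', as suggested above. The dictionary, the function names and the sample annotation set are hypothetical; they are not taken from the GO files or from any GO tool.

```python
# Toy is_a hierarchy (child -> parents). Hypothetical: it assumes the placement
# suggested above, with histone deacetylase activity as a child of
# transcription cofactor activity.
IS_A = {
    "histone deacetylase activity": ["transcription cofactor activity"],
    "transcription cofactor activity": ["molecular_function"],
    "molecular_function": [],
}


def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following is_a links."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


def redundant_annotations(terms, is_a=IS_A):
    """Return annotated terms that a more specific annotated term already implies."""
    implied = set()
    for term in terms:
        implied |= ancestors(term, is_a)
    return {term for term in terms if term in implied}


# Illustrative annotation set for clr6 (hypothetical data, not real annotations).
clr6 = {"histone deacetylase activity", "transcription cofactor activity"}
print(redundant_annotations(clr6))
```

Under this assumed hierarchy, the parent annotation is flagged as implied by the child, so a single annotation to the more specific term would be enough.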

ValWood commented 6 years ago

Well, this specific histone deacetylase is a co-factor (Swi-SNF is the only chromatin remodeller we have currently annotated as a "transcription regulator"). This does have concurrent annotation even though I don't like it (more room for inconsistency, and unclear what we mean).

However, there are other histone deacetylases involved in heterochromatin formation at constitutive heterochromatin that are ~not~ sometimes associated with pol II, but are involved in transcriptional repression. It would seem odd to see these as "GO:0003712 transcription cofactor activity" but they are not currently excluded by your definition.

Maybe "gene specific transcription" should be part of the definition?

Or the fact that the regulator must be associated with the promoter or the enhancer and connected to PolII?

If there are any histone marks that are only used in the context of GO:0003712 transcription cofactor activity (i.e directly associated with polII) then the histone modification activities can go under "GO:0003712 transcription cofactor activity", but off the top of my head I'm not sure that there are?

ValWood commented 6 years ago

How about

Transcription coregulators are recruited to gene-specific enhancer or promoter elements ~by gene-specific transcription factors, by chromatin modifications, by DNA (1*), and in some cases by regulatory RNAs (2**).~ and make chromatin more or less accessible for transcription.

~For example, one class of transcription coregulators modifies chromatin structure through covalent modification of histones. A second ATP dependent class modifies the conformation of chromatin.~

Questions: (1*) How? (2**) How? Is the RNA doing the actual 'recruiting' here? In RNAi, for example, the RNA is setting up an amplification cycle for chromatin modification, which is doing the recruiting.

ValWood commented 6 years ago

Keeping it simple: Transcription coregulators are recruited to gene-specific enhancer or promoter proximal regions ~elements~ by gene-specific transcription factors and/or chromatin modifications and make chromatin more or less accessible for transcription.

Is this necessary and sufficient? If not what is missing or included that should not be? I don't think we should add so much about mechanisms as it all gets a bit circular and it is still under study....

ValWood commented 6 years ago

For example you are excluding the stuff bound at the promoter right? (GTF's)?

I put "enhancer or promoter proximal regions" since in yeast we don't have "enhancers"

What about bacteria? Would they use this term?

ValWood commented 6 years ago

My concern is that we would not traditionally annotate this pathway as "transcription co-regulators" https://moazed.med.harvard.edu/node/rna-and-heterochromatin-assembly

because this is not a "gene specific" phenomenon. People don't really call these chromatin modifiers "transcription co-regulators" (at least in pombeland)

ValWood commented 6 years ago

sorry, I keep thinking...

https://en.wikipedia.org/wiki/Transcription_coregulator

Interacting with a "gene specific transcription factor" seems to be key...

The other stuff in the original proposed def isn't really always describing "Transcription coregulators"

(this excludes the RNAi mediated heterochromatin assembly that was being included)

pgaudet commented 6 years ago

Don't you consider the Mediator complex a co-factor ?

ValWood commented 6 years ago

yes, need to add in a clause to cover mediator (adaptor between PolII and gene specific transcription factor?).

we only need to make sure we exclude genes in this scenario: https://moazed.med.harvard.edu/node/rna-and-heterochromatin-assembly

The first def was a bit misleading because it focussed on the accessibility, which is also key in non-"transcription co-regulator" scenarios.

As long as the def can only be applied to the non-GTF component in your figure above, that's fine.

pgaudet commented 6 years ago

Should we add a comment along the lines of 'note that this does not include chromatin assembly factors'?

ValWood commented 6 years ago

I don't think so because it can ( that's the problem). I think we need a very precise def here to exclude any "factors" which are not associated directly, or indirectly (like mediator) with gene-specific transcription. The key seems to be to focus on the primary requirement which is the "direct or indirect gene-specific TF association". If this is clear, it should be fine without adding exclusions....

the problem came when you introduced:

"A protein that contributes to making the chromatin more or less accessible for transcription. Transcription co-regulators are recruited to specific genomic elements by gene-specific transcription factors, by chromatin modifications, by DNA, and in some cases by regulatory RNAs."

because the heterochromatin machinery fits this definition. It does not make "gene-specific transcription factors" a primary non-negotiable requirement.

ValWood commented 6 years ago

so I don't think this part is necessary

"by chromatin modifications, by DNA, and in some cases by regulatory RNAs"

I don't know of cases where the co-regulators are recruited by DNA or regulatory RNA's, they may exist, but it would be clearer not to mention this and focus on what must be there, and that is a connection to the gene specific TF.

When I read the proposed def it suggests to me that the "gene-specific transcription factor" is not a requirement in this scenario.

krchristie commented 6 years ago

Having just taken a quick scan through the paper Pascale included the link to in the very first comment in this ticket, it seems that these authors are taking a very broad view of what a transcription coregulator is. Since this is broader than I've seen before, I do wonder if this view is now widespread...

But if you've decided to go with this broad view, I don't know that I agree with @ValWood that "gene-specific transcription factors" need to be a primary non-negotiable requirement. My impression is that when people use the phrase "gene-specific transcription", they are referring to factors that regulate a fairly limited set of genes such that when you mutate one factor, you see specific effects on a limited set of genes. However, it seems to me that coregulators could also be affecting much larger sets of genes when they are recruited by transcription factors that regulate larger sets of genes. The nomenclature "general transcription factors" is a little bit of a misnomer since they are not all required at all promoters (see https://www.ncbi.nlm.nih.gov/pubmed/19411170), but instead are required at a large set of promoters. So coregulators recruited by basal factors might look like they have very broad effects rather than "gene specific" effects.

ValWood commented 6 years ago

> However, it seems to me that coregulators could also be affecting much larger sets of genes when they are recruited by transcription factors that regulate larger sets of genes.

Hi @krchristie, here I just mean "the transcription of individual genes"; it could be all of them, or many of them, or few of them. What I am trying to ensure is that the definition for the coregulator excludes the gene products which regulate transcription by making repressive chromatin. This could be at subtelomeric chromatin, centromeres or other gene-free regions. These factors may be closely associated with Pol II but not to transcribe individual genes. Perhaps it might be better to say "coregulators are involved in the transcription of genes from specific promoters" rather than mention their association with "gene-specific transcription factors"?

These genes are negatively regulating transcription by forming repressive heterochromatin, but they are not all polymerase associated, and they are not associated with specific promoters: for example by mechanisms reviewed in https://www.nature.com/articles/nrm.2017.119

pgaudet commented 6 years ago

Just for info, the part of the definition "These complexes are recruited to specific genomic elements by GSTFs, by chromatin modifications, by DNA, and in some cases by regulatory RNAs." was taken from PMID:25957681. I have no problem removing that sentence; we can always add it back if we feel the definition becomes too narrow.

Pascale

ValWood commented 6 years ago

Yes, even if it is the case, it opens up transcription regulator to be "anything involved in transcription, including ALL chromatin remodellers". I'm not sure that we want to do that here?

If we really are trying to group this specific group of proteins in the figure maybe we need to be more explicit in the term name i.e gene specific transcription factor associated cofactor activity (clunky but explicit)

As Karen mentioned, people do use the corepressor and co-activator terms in other contexts but it is much less common....

I also don't have a problem with the line between what is a "general TF" and what is a "gene specific transcription factor associated cofactor activity" being murky. I would be happy to annotate with what specific experiments showed.

ValWood commented 6 years ago

In the figure what are " BTAF1/Mot1p and NC2 can remove TBP from the promoter. Intrinsically mobile proteins are indicated in red..." ?

I'm trying to identify the queries to pull out the functional groups transcription in this model (and other transcription related stuff....)

pgaudet commented 6 years ago

After further discussion with Colin Logie, Astrid Laegrid, @RLovering and @ValWood , this is the new proposed definition:

Transcription coregulator activity (synonym: transcription cofactor activity) A protein or a member of a complex that interacts specifically and non-covalently with a DNA-binding transcription factor to either activate or repress the transcription of specific genes. Coregulators often act by altering chromatin structure and modifications. For example, one class of transcription coregulators modifies chromatin structure through covalent modification of histones. A second ATP-dependent class modifies the conformation of chromatin. Another type of coregulator activity is the bridging of a DNA-binding transcription factor to the basal transcription machinery. The Mediator complex, which bridges transcription factors and RNA polymerase, is also a transcription coregulator. PMID:25957681; PMC3963257; https://en.wikipedia.org/wiki/Transcription_coregulator

Thanks, Pascale

pgaudet commented 6 years ago

Transcription coactivator activity: Old def: Interacting selectively and non-covalently with an activating transcription factor and also with the basal transcription machinery in order to increase the frequency, rate or extent of transcription. Cofactors generally do not bind the template nucleic acid, but rather mediate protein-protein interactions between activating transcription factors and the basal transcription machinery.

New def: A protein or a member of a complex that interacts specifically and non-covalently with a DNA-binding transcription factor to activate the transcription of specific genes. Coregulators often act by altering chromatin structure and modifications. The Mediator complex, which bridges transcription factors and RNA polymerase, is also a transcription coactivator.


Transcription corepressor activity: Old def: Interacting selectively and non-covalently with a repressing transcription factor and also with the basal transcription machinery in order to stop, prevent, or reduce the frequency, rate or extent of transcription. Cofactors generally do not bind the template nucleic acid, but rather mediate protein-protein interactions between repressive transcription factors and the basal transcription machinery.

New def: A protein or a member of a complex that interacts specifically and non-covalently with a DNA-binding transcription factor to repress the transcription of specific genes. Coregulators often act by altering chromatin structure and modifications.

RLovering commented 6 years ago

Hi All

I was just thinking of this definition. If we go back to the figure and we agree that HATs should be annotated as transcription coregulators, then I think the definition is still not right, because I do not think that all the proteins you want to associate with this term actually bind the dbTF. I have deleted 'interacts specifically and non-covalently with a' and replaced it with 'moderates the activity of a'.

I think the following would work better:

Transcription coregulator activity (synonym: transcription cofactor activity) A protein or a member of a complex that moderates the activity of a gene-specific transcription factor (aka DNA binding transcription factors) to either activate or repress the transcription of specific genes (https://en.wikipedia.org/wiki/Transcription_coregulator). Coregulators often act by altering chromatin structure and modifications. For example, one class of transcription coregulators modifies chromatin structure through covalent modification of histones. A second ATP dependent class modifies the conformation of chromatin. Another type of coregulator activity is the bridging of a gene-specific transcription factor to the basal transcription machinery. The Mediator complex, which bridges transcription factors and RNA polymerase, is also a transcription coregulator PMID:25957681; PMC3963257

ValWood commented 6 years ago

This then becomes less specific than intended. It would not exclude upstream signalling pathways, or things like oxygen sensors and thioredoxins, or even proteases and ligases that can bind to GSTFs and regulate their DNA binding activity directly. I would avoid "modulates" and anything which refers to the "activity" of the transcription factor... we need to say somehow that it needs to be associated with the GSTF in some way, enabling it to regulate txn from pol II.

> moderates the activity of a gene-specific transcription factor

RLovering commented 6 years ago

But from the figure it is clear that the protein doesn't have to be associated with the dbTF. I guess it depends how the complex is defined, but if we are relying on one molecule in the complex contacting the dbTF then this needs to be stated. PLUS it is rarely shown experimentally that the co-regulator actually contacts the dbTF. By including this in the definition you are implying that the interaction with the dbTF has to occur, otherwise this is not a coregulator.

This is why I was trying to include more descriptive child terms, as this would help curators see the range of proteins that can be associated with the term.

To get rid of the signaling problem can the definition include a statement like: Only proteins which are part of the chromatin (if chromatin continues to be defined by GO: The ordered and organized complex of DNA, protein, and sometimes RNA, that forms the chromosome) or chromatin binding should be annotated to this term.

pgaudet commented 6 years ago

@RLovering I think no matter what we do, annotating to these terms is difficult. How about if we left the definition as it currently is, and after the GO meeting look at some annotation examples to see how annotation documentation (and perhaps more improvements on the definition) might help?

Thanks, Pascale

RLovering commented 6 years ago

OK

ValWood commented 6 years ago

A protein or a member of a complex that interacts non-covalently with a DNA-binding transcription factor, or one of its bound complexes, to either activate or repress the transcription of specific genes by RNA polymerase II

? maybe

This gets at the fact that there needs to be some interaction, and it needs to happen at the same time. Also, should we say pol II in this def?

krchristie commented 6 years ago

We should NOT say RNA polymerase II in these definitions because the "activities":

are also relevant to bacterial transcription.

ValWood commented 6 years ago

.... I always forget about bacteria...

pgaudet commented 6 years ago

Definition implemented: A protein or a member of a complex that interacts specifically and non-covalently with a DNA-binding transcription factor to either activate or repress the transcription of specific genes. Coregulators often act by altering chromatin structure and modifications. For example, one class of transcription coregulators modifies chromatin structure through covalent modification of histones. A second ATP-dependent class modifies the conformation of chromatin. Another type of coregulator activity is the bridging of a DNA-binding transcription factor to the basal transcription machinery. The Mediator complex, which bridges transcription factors and RNA polymerase, is also a transcription coregulator (specifically, a coactivator).