geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

Bacterial transcription factor activity - are those terms needed? #16736

Closed (pgaudet closed 5 years ago)

pgaudet commented 6 years ago

Hello,

This is a question for everyone annotating bacteria @jimhu-tamu @sandyl27 @pedruzzi @ivanerill @AndreaAuchincloss @keseler

Is it useful to annotate bacterial transcription factors to 'bacterial-type RNA polymerase transcription factor activity' (and its activator/repressor children)? I can see why, in eukaryotes, people want to distinguish mitochondrial, plastid, and nuclear transcription factors, but what about in bacteria?

Exact terms are:

(Other terms will be/were merged in https://github.com/geneontology/go-ontology/issues/16728)

Looking at annotations, only CollecTF has consistently used these terms. Other groups have mostly used the parent (GO:0003700 | DNA-binding transcription factor activity).

I am happy to keep the terms if needed, but then it'd be nice to have consistent usage.

Let me know what you think.

Thanks, Pascale

pgaudet commented 6 years ago

Here are the EXPERIMENTAL annotations by group:

AgBase

| GO ID | Term label | Number of EXP annotations |
| --- | --- | --- |
| GO:0003700 | DNA-binding transcription factor activity | 4 |

CAFA

| GO ID | Term label | Number of EXP annotations |
| --- | --- | --- |
| GO:0001217 | bacterial-type RNA polymerase transcriptional repressor activity, sequence-specific DNA binding | 2 |
| GO:0003700 | DNA-binding transcription factor activity | 1 |
| GO:0001140 | transcriptional activator activity, bacterial-type RNA polymerase proximal promoter sequence-specific DNA binding | 2 |
| GO:0001141 | transcriptional repressor activity, bacterial-type RNA polymerase proximal promoter sequence-specific DNA binding | 1 |

EcoCyc

| GO ID | Term label | Number of EXP annotations |
| --- | --- | --- |
| GO:0001130 | bacterial-type RNA polymerase transcription factor activity, sequence-specific DNA binding | 1 |
| GO:0001216 | bacterial-type RNA polymerase transcriptional activator activity, sequence-specific DNA binding | 3 |
| GO:0003700 | DNA-binding transcription factor activity | 65 |
| GO:0098531 | ligand-activated transcription factor activity | 1 |
| GO:0001131 | transcription factor activity, bacterial-type RNA polymerase proximal promoter sequence-specific DNA binding | 5 |
| GO:0001151 | transcription factor activity, bacterial-type RNA polymerase transcription enhancer sequence-specific binding | 2 |
| GO:0001140 | transcriptional activator activity, bacterial-type RNA polymerase proximal promoter sequence-specific DNA binding | 7 |
| GO:0001141 | transcriptional repressor activity, bacterial-type RNA polymerase proximal promoter sequence-specific DNA binding | 6 |

PseudoCAP

| GO ID | Term label | Number of EXP annotations |
| --- | --- | --- |
| GO:0001216 | bacterial-type RNA polymerase transcriptional activator activity, sequence-specific DNA binding | 2 |
| GO:0001217 | bacterial-type RNA polymerase transcriptional repressor activity, sequence-specific DNA binding | 2 |
| GO:0003700 | DNA-binding transcription factor activity | 2 |

UniProt

| GO ID | Term label | Number of EXP annotations |
| --- | --- | --- |
| GO:0001130 | bacterial-type RNA polymerase transcription factor activity, sequence-specific DNA binding | 2 |
| GO:0001216 | bacterial-type RNA polymerase transcriptional activator activity, sequence-specific DNA binding | 2 |
| GO:0003700 | DNA-binding transcription factor activity | 13 |

Others

| GO ID | Term label | Number of EXP annotations | Group |
| --- | --- | --- | --- |
| GO:0003700 | DNA-binding transcription factor activity | 1 | EcoliWiki |
| GO:0003700 | DNA-binding transcription factor activity | 5 | JCVI |
| GO:0001130 | bacterial-type RNA polymerase transcription factor activity, sequence-specific DNA binding | 1 | MTBBASE |
| GO:0003700 | DNA-binding transcription factor activity | 7 | MTBBASE |
| GO:0003700 | DNA-binding transcription factor activity | 1 | TIGR |
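
To make the "most groups use the parent" observation concrete, the counts in the tables above can be tallied per group. The following is only a rough sketch: the dictionary is transcribed by hand from this comment, and the names `counts`, `PARENT`, and `parent_share` are illustrative assumptions, not part of any GO tooling. Note that the non-parent remainder includes GO:0098531, which is not a bacterial-specific term.

```python
# EXP annotation counts per group, transcribed by hand from the tables above.
PARENT = "GO:0003700"  # DNA-binding transcription factor activity

counts = {
    "AgBase": {"GO:0003700": 4},
    "CAFA": {"GO:0001217": 2, "GO:0003700": 1, "GO:0001140": 2, "GO:0001141": 1},
    "EcoCyc": {
        "GO:0001130": 1, "GO:0001216": 3, "GO:0003700": 65, "GO:0098531": 1,
        "GO:0001131": 5, "GO:0001151": 2, "GO:0001140": 7, "GO:0001141": 6,
    },
    "PseudoCAP": {"GO:0001216": 2, "GO:0001217": 2, "GO:0003700": 2},
    "UniProt": {"GO:0001130": 2, "GO:0001216": 2, "GO:0003700": 13},
    "EcoliWiki": {"GO:0003700": 1},
    "JCVI": {"GO:0003700": 5},
    "MTBBASE": {"GO:0001130": 1, "GO:0003700": 7},
    "TIGR": {"GO:0003700": 1},
}


def parent_share(group_counts):
    """Return (annotations to the parent term, all annotations) for one group."""
    total = sum(group_counts.values())
    return group_counts.get(PARENT, 0), total


# Per-group breakdown: how many EXP annotations sit on the general parent term.
for group, group_counts in counts.items():
    on_parent, total = parent_share(group_counts)
    print(f"{group:<10} {on_parent:>3} of {total:>3} on {PARENT}")

# Totals across all groups listed in the tables.
overall_parent = sum(parent_share(c)[0] for c in counts.values())
overall_total = sum(parent_share(c)[1] for c in counts.values())
print(f"Overall    {overall_parent:>3} of {overall_total:>3} on {PARENT}")
```

With the counts as transcribed here, 99 of the 138 EXP annotations are on the parent term.
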

pgaudet commented 5 years ago

If there are no objections I'll merge the terms.

Pascale

pgaudet commented 5 years ago

Here's what I did:

And fixed definitions accordingly.


These changes have no impact on existing annotations.

pgaudet commented 5 years ago

Hello @ivanerill, since your group is the only one that used those terms consistently, I wanted to make sure you are OK with the merge before I complete it.

Please let me know.

Thanks, Pascale

ivanerill commented 5 years ago

Hi @pgaudet,

Sorry, I've been swamped with other things. I have no issues with generalizing the bacterial terms. I will adapt the scripts that generate GO annotations from CollecTF curated records so that they map to the new terms.

Thanks!

Ivan

pgaudet commented 5 years ago

Email discussion with Julio Collado and his student Citlalli Mejia Almonte:

Hello Pascale,

I was thinking the same thing that Julio did.

I agree that the bacterial terms can be merged with the general ones. I'm working on a paper on definitions of gene regulation concepts in prokaryotes and, so far, I have arrived at the conclusion that a transcription factor is a molecule that has the role of DNA-binding transcription factor activity, using the general term, not the bacteria-specific one. I think the same applies to the other terms.

I was just reading some reviews on eukaryotic RNA polymerases, trying to answer for myself the same question Julio raised about the specific terms that are defined by the kind of polymerase bearing those functions. In my quick review, though, I couldn't find a difference in such molecular activities.

For example, RNA polymerase activity is defined as the catalysis of the diphosphate bonds, and all of the subclasses have the same kind of catalytic activity, but they are realized by different kinds of molecules. Conceptually, I think kinds of molecular activities should be defined by differences in some quality of the molecular activity, as with the immediate subclasses of the term (3'-5' RNA polymerase activity and 5'-3' RNA polymerase activity), not by the fact that the bearers are different.

Moreover, as you said, the difference I found among the transcription activities of the different eukaryotic polymerases was the location of the activity, but shouldn't this information be represented using Cellular Component terms?

Another difference I could find was the kind of RNA produced by the different polymerases. I think this could be a good reason to maintain these classes if and only if the kind of RNA produced by each polymerase is the only kind of RNA produced by that polymerase and is not produced by any other kind of polymerase. In that case, this information should be included in the respective definitions.

Finally, I understand that the correctness of the ontology lies in its usefulness. So I think that if having such different kinds of eukaryotic transcription terms has proved to be useful, then it's OK. These are just my insights.

Best regards, Citlalli


Dear Pascale,

I have not had time to discuss this in person with Citlalli. Having read your recent email, the 3 merges that you propose sound fine to me. However, if in example 3 you keep, as children of "transcription", the terms "transcription from Pol-II" and from Pol-III, why would you remove ONLY the one specifying "bacterial transcription"? Or are you also removing those for Pol-II and Pol-III? The same questions apply to your example 2.

For instance, I wonder how or where the type of sigma factor is annotated within transcription activity, and whether this could be a child of "transcription" without specifying "bacterial".

On your plans for 2019, I am particularly interested in Biological Processes!

Let us see what Citlalli suggests,

best,

Julio


On Thu, Dec 6, 2018 at 1:42 AM Pascale Gaudet pascale.gaudet@sib.swiss wrote:

Dear Julio,

Thank you for your quick reply. I have highlighted in bold the terms I would like to merge into their respective parent:

Example 1

Example 2

Example 3

We have done quite a bit of reorganization in this area, in the Molecular Function branch. We plan to continue with Biological Process and Cellular Component in 2019. It'd be great if you and/or Citlalli could contribute.

Best regards, Pascale


On 6 Dec 2018, at 06:14, Julio Collado colladojulio@gmail.com wrote:

Dear Pascale,

Good to hear from you!

I am happy to see you are working on this! I don't quite get which 3 terms you are referring to among the many mentioned here. Can you clarify?

I think it makes sense to distinguish RNA polymerase activity from DNA-binding transcription factor activity.

Each of them can still be distinguished if you separate binding from "initiating transcription", and likewise a TF binding from a TF affecting promoter activity.

As it happens, I am working with a PhD student on revising several essential concepts of microbial gene regulation, with the aim of discussing their different uses and even proposing which are well defined. This would be a first step towards ontological definitions. I am copying Citlalli here.

Best regards,

Julio


On Wed, Dec 5, 2018 at 7:36 AM Pascale Gaudet pascale.gaudet@sib.swiss wrote:

Dear Julio,

I hope you are doing well. I am contacting you because the Gene Ontology is restructuring GO terms related to transcription. We would like input from experts in bacterial transcription. GO has terms specifically describing bacterial enzymes and bacterial processes, for example:

Example 1

Example 2

Example 3

In eukaryotes, it seems useful to make the distinction, since there are many different types of transcription going on. However, in prokaryotes that seems less relevant, if relevant at all. Supporting this, the bacterial versions of the terms have not been used very much (or have been used inconsistently, both across and within resources; many groups annotate bacteria to the general parent term).

My question is: can you think of a reason to keep the bacterial version of the 3 terms mentioned here?

Thank you in advance for your help.

Best regards, Pascale