geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

transcription factor terms #8719

Closed gocentral closed 9 years ago

gocentral commented 13 years ago

Hi

I am a bit concerned about the top level GO terms for transcription factor activity.

  1. GO:0000988 protein binding transcription factor activity

     The definition states: Interacting selectively and non-covalently with two or more protein molecules, or a protein and another macromolecule or complex.

     However, the child term GO:0000990 core RNA polymerase binding transcription factor activity has the definition: Interacting selectively and non-covalently with an RNA polymerase in order to modulate transcription.

This child definition does not state that 2 proteins are bound, nor does the definition for GO:0000989 transcription factor binding transcription factor activity, which seems even more specific: Interacting selectively and non-covalently with a specific transcription factor, which may be a single protein or a complex. It would be good if these definitions could be more consistent.

When we associate one of these terms to a gene, do we have to find evidence of the protein binding 2 other proteins or complexes?

  2. Why isn't GO:0000989 transcription factor binding transcription factor activity a child of protein binding?

  3. If the parent definition changes to be consistent with the child definitions then, as I think pretty much all proteins interact with another protein, isn't "protein binding" in "protein binding transcription factor activity" redundant?

  4. From the statement in 3, why isn't GO:0003700 sequence-specific DNA binding transcription factor activity a child of protein binding transcription factor activity?

My concern here is that when annotating we may have evidence that the 'transcription factor' binds DNA and regulates transcription, but no evidence that it binds another protein. Consequently, some transcription factors will be annotated to the GO:0001071 nucleic acid binding transcription factor activity branch of MF whereas others will be annotated to the protein binding transcription factor activity branch. Having 2 unrelated branches means that not all TFs are included in one MF GO term group. I would really like to see a single general TF term to ensure consistent annotation of all TFs.

Thanks

Ruth

Reported by: RLovering

Original Ticket: geneontology/ontology-requests/8507

gocentral commented 13 years ago

Hi Ruth,

I'm assigning to Karen, but for Q2:

This relationship is captured using HAS_PART: 'transcription factor binding transcription factor activity ; GO:0000989' HAS_PART 'transcription factor binding ; GO:0008134' (which is_a protein binding). So this is a more accurate way of capturing the protein-binding aspect of GO:0000989.

Becky

Original comment by: rebeccafoulger
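
The distinction between is_a and has_part described in the comment above can be illustrated with a small sketch. This is a toy model only, not how the ontology files or any GO tool represent relations: the dictionaries and function names are hypothetical, only the two edges mentioned in the comment are modelled, and the identifier GO:0005515 for protein binding does not appear in this thread and is supplied here as an assumption.

```python
# Toy relation tables; only the edges named in the comment above are modelled.
# GO:0005515 (protein binding) is not quoted in the thread; it is assumed here.
IS_A = {
    "GO:0008134": ["GO:0005515"],  # transcription factor binding is_a protein binding
}
HAS_PART = {
    "GO:0000989": ["GO:0008134"],  # TF binding TF activity has_part transcription factor binding
}


def is_a_ancestors(term):
    """Return every term reachable from `term` by following is_a edges only."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in IS_A.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def implied_parts(term):
    """Return the has_part targets of `term` (or of its is_a ancestors),
    together with the is_a ancestors of those targets."""
    parts = set()
    for source in {term} | is_a_ancestors(term):
        for part in HAS_PART.get(source, []):
            parts.add(part)
            parts |= is_a_ancestors(part)
    return parts


if __name__ == "__main__":
    print(sorted(is_a_ancestors("GO:0000989")))
    print(sorted(implied_parts("GO:0000989")))
```

In this sketch, protein binding is not among the is_a ancestors of GO:0000989, which is why the term does not appear as a child of protein binding, but it is still reachable through the has_part edge.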

gocentral commented 13 years ago

Hi

> I am a bit concerned about the top level GO terms for transcription factor activity.

David and I just made some adjustments in the definition of the top-level term, changing the has_part relationship from "protein binding" to "protein binding, bridging". This term is primarily intended to cover basal transcription factors, especially those that do not bind DNA, and cofactors, which are defined as mediating protein-protein contacts between the basal transcription machinery and other regulatory factors, e.g. activators or repressors. I have not yet propagated that change down to the child terms but will be working on that this week.

> 1. GO:0000988 protein binding transcription factor activity
>
> The definition states: Interacting selectively and non-covalently with two or more protein molecules, or a protein and another macromolecule or complex
>
> However, the child term:
> GO:0000990 core RNA polymerase binding transcription factor activity
>
> has the definition: Interacting selectively and non-covalently with an RNA polymerase in order to modulate transcription.
>
> This child definition does not state 2 proteins are bound, nor does the definition for GO:0000989 transcription factor binding transcription factor activity, which seems even more specific: Interacting selectively and non-covalently with a specific transcription factor, which may be a single protein or a complex. It would be good if these definitions could be more consistent.
>
> When we associate one of these terms to a gene, do we have to find evidence of the protein binding 2 other proteins or complexes?
>
> 2. Why isn't GO:0000989 transcription factor binding transcription factor activity a child of protein binding?

Because it has a has_part relationship to the appropriate binding term ("protein binding, bridging"). As you brought up in response to the original proposal David and I sent out, it is my understanding that the decision made at the Geneva Annotation camp was that binding aspects of functions should be indicated with the has_part relationship.

> 3. If the parent definition changes to be consistent with the child definitions then, as I think as pretty much all proteins interact with another protein, isn't protein binding in protein binding transcription factor activity redundant?

See my initial comment above about the change to "protein binding, bridging".

> 4. From the statement in 3, why isn't GO:0003700 sequence-specific DNA binding transcription factor activity a child of protein binding transcription factor activity?

Because there are things such as Integration Host Factor (IHF) that function as transcription factors by binding and bending DNA, but which do not contact the basal transcription machinery/complex at all and thus do not act by "protein binding, bridging".

> My concern here is that when annotating we may have evidence that the 'transcription factor' binds DNA and it regulates transcription but not have evidence that it binds another protein. Consequently, some transcription factors will be annotated to the GO:0001071 nucleic acid binding transcription factor activity branch of MF whereas others will be annotated to the protein binding transcription factor activity branch. To have 2 unrelated branches means that not all TFs are included in one MF GO term group. I would really like to see a single general TF term to ensure consistent annotation of all TFs.

It would be nice if researchers in the transcription field used the phrase "transcription factor" to refer to only one mechanism of functioning, but it appears that this phrase is used only to mean "involved in transcription", which is really only equivalent to a BP term representing transcription. While the majority of transcription factors that most people think about, e.g. regulatory transcription factors that bind a specific sequence to activate a relatively small subset of genes, do act by binding DNA, the phrase "transcription factor" is also used to refer to basal transcription factors, many of which DO NOT bind DNA. Thus, to accommodate basal transcription factors, which are definitely called transcription factors in the literature, we need a separate term, and we can't put them under "sequence-specific DNA binding transcription factor activity" because DNA binding isn't universally true of them.

-Karen

Original comment by: krchristie
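
The practical consequence of the two branches discussed above can be sketched as follows: a query that groups annotations under a single branch root misses transcription factors annotated to the other branch, so both roots have to be queried together. This is a toy illustration under stated assumptions. The is_a edges are the ones implied by this thread, the gene names and their annotations are hypothetical, and the function names are made up for the example.

```python
# is_a edges as implied by the thread (child -> parents).
PARENTS = {
    "GO:0003700": ["GO:0001071"],  # sequence-specific DNA binding TF activity
    "GO:0000989": ["GO:0000988"],  # transcription factor binding TF activity
    "GO:0000990": ["GO:0000988"],  # core RNA polymerase binding TF activity
}

# Hypothetical annotations, invented purely for illustration.
ANNOTATIONS = {
    "hypothetical_gene_A": ["GO:0003700"],
    "hypothetical_gene_B": ["GO:0000990"],
}


def ancestors_or_self(term):
    """Return `term` plus every term reachable from it through is_a edges."""
    result = {term}
    stack = [term]
    while stack:
        for parent in PARENTS.get(stack.pop(), []):
            if parent not in result:
                result.add(parent)
                stack.append(parent)
    return result


def genes_under(roots):
    """Return the sorted genes annotated to any root or to a descendant of one."""
    roots = set(roots)
    return sorted(
        gene
        for gene, terms in ANNOTATIONS.items()
        if any(ancestors_or_self(term) & roots for term in terms)
    )


if __name__ == "__main__":
    print(genes_under(["GO:0001071"]))
    print(genes_under(["GO:0001071", "GO:0000988"]))
```

Querying GO:0001071 alone returns only the DNA-binding example, while passing both roots returns both genes.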

gocentral commented 13 years ago

OK, thanks for explaining this, with some great examples too.

Best

Ruth

Original comment by: RLovering

gocentral commented 13 years ago

OK great, I'm glad that helped. I'm going to close this item.

thanks,

-Karen

Original comment by: krchristie
