geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

General query about "factor" terms #13536

Closed ValWood closed 6 years ago

ValWood commented 7 years ago

I notice that GO:0003716 "obsolete RNA polymerase I transcription termination factor activity" was obsoleted with the comment:

This term is obsolete. This term was obsoleted because it is essentially identical to a Process term (specifically the Biological Process term which has been selected as a term to consider for reannotation), i.e. it is defined only in terms of the process it acts in and it does NOT convey any information about the molecular nature of the function or whether the function is based on binding DNA, on interacting with other proteins, or some other mechanism. To transfer all annotations without review, the BP term indicated is considered to be equivalent and thus the only appropriate destination for all annotations. To reannotate to a MF term, you will probably need to revisit the original literature or other primary data because this "MF" term was not defined in terms of mechanism of action and there are multiple possibilities in the revised MF structure. In reannotation, please also consider descendent terms of the suggested MF terms as a more specific term may be more appropriate than the MF terms indicated. Please be aware that you may wish to request a new term if the mechanism of action of this gene product is not yet represented or if you are annotating for an RNAP different than one for which there is a specific suggested term. Also note that if there is no information about how the gene product acts, it may be appropriate to annotate to the root term for molecular_function.

but we do not apply this rule consistently, because we still have terms such as:

GO:0003746 translation elongation factor activity: Functions in chain elongation during polypeptide synthesis at the ribosome.

GO:0003743 translation initiation factor activity: Functions in the initiation of ribosome-mediated translation of mRNA into a polypeptide.

transcription factor activity, protein binding: Interacting selectively and non-covalently with any protein or protein complex (a complex of two or more proteins that may include other nonprotein molecules), in order to modulate transcription. A protein binding transcription factor may or may not also interact with the template nucleic acid (either DNA or RNA) as well.

transcription factor activity, RNA polymerase II core promoter sequence-specific binding involved in preinitiation complex assembly

GO:0045183 translation factor activity, non-nucleic acid binding

etc., etc.

which are exactly the same: a "factor" involved in a process.

pgaudet commented 7 years ago

Hi Val,

I agree that the 'factor' argument is not completely convincing.

Do you suggest we merge those MF terms into their respective BP terms? For example, 'translation initiation factor activity' -> 'translation initiation', etc. (although I am not sure we can do a Function/Process merge).

Thanks, Pascale

ValWood commented 7 years ago

Actually, a "factor" term that is precisely positioned, for example as an "RNA binding" term, might be OK. In that case it might more accurately represent how the community refers to these functions. Maybe it is better to leave those be for the time being and just deal with the ones that say nothing at all about the molecular function?

krchristie commented 7 years ago

The lack of a specific, well-defined GO molecular function, i.e. the inability to define the MF term for "transcription factor activity" more specifically than somehow involved in the BP of "transcription", was exactly why some of the transcription factor terms were obsoleted. The idea at the time was that the MF term should include some description of HOW the BP is accomplished. It was also hoped that this would address a complaint I received from a researcher at a transcription meeting, who said she could never remember whether to use the MF term or the BP term for enrichment analyses, since they seemed indistinguishable.

However, there was never any follow-through to treat other preexisting "factor" terms similarly, and there has been discussion about bringing "transcription factor activity" back despite the fact that it cannot be defined in a way that distinguishes it from the process term...

pgaudet commented 7 years ago

Hi @krchristie, I think the main issue with transcription is that the process term is way overused, to the point that it's not possible to get all transcription factors by using the BP terms. (That being said, I think the ontology is designed to allow this; it's just that the annotations are not strict enough.)

What do you think?

Pascale

krchristie commented 7 years ago

Hi @pgaudet

I think that the phrase "transcription factor" is a process grouping term, much the same way that @ValWood just described the current MF term "signal transducer activity":

https://github.com/geneontology/go-ontology/issues/14232:

I don't remember the exact history of "signal transducer activity", but if it covers receptors, kinases, GTPases, etc., it isn't really a MF grouping term at all; it's a collection of different functions = BP term in the MF ontology. Why do we need it?

David H and I spent quite some time researching the usage of "transcription factor" in the literature and came to the conclusion that it is really quite loose, historically having been used for many chromatin remodeling factors that affect transcription as well as for the DNA binding transcription factors that are the only/main thing many people think of when they use this phrase. In addition, this phrase definitely still does include the general transcription factors (GTFs) for RNAP II, many of which do NOT bind DNA. The MF of some of these GTFs might be best described as "protein binding, bridging". So the only thing that ties together all of these different things that have been, or still are, referred to as "transcription factors" is involvement in transcription, somehow. This brings me back to Val's comment:

it isn't really a MF grouping term at all; it's a collection of different functions = BP term in the MF ontology

Personally, I'm not sure that the BP term "transcription, DNA-dependent" should be expected to ONLY have transcription factors map up to it. Trivially, the RNA polymerases should be annotated under this term and are not transcription factors. What things are you seeing that you think are inappropriate?

ValWood commented 6 years ago

close?