The-Sequence-Ontology / SO-Ontologies

Collection of SO Ontologies
Creative Commons Attribution 4.0 International

promoter definition and child term revisions #457

Closed RLovering closed 5 years ago

RLovering commented 5 years ago

Hi please can the definition for this term be revised:

SO:0000167 | promoter | New definition: A regulatory_region composed of the Transcription Start Site (TSS) and binding sites for TF_complexes of the basal transcription machinery. | I have suggested adding 'Transcription Start Site' to provide the full name for abbreviations and have suggested removing 'for TF_complexes of' to make the end of the sentence clearer.

Please remove the child term retinoic_acid_responsive_element (SO:0001653), as RARE is bound by dbTFs (RAR), not by the basal transcription machinery.

I have asked some experts to comment on whether the promoter definition should imply that a promoter may contain more than one TSS, or whether each TSS constitutes a 'promoter', so that in theory promoters could overlap.

Thanks

Ruth

RLovering commented 5 years ago

Comments from Colin Logie

SO or MSO IDs We can do with Cis-regulatory module for the DbTF binding site and transcription start sites (TSS) and/or promoter for the transcription initiation site of the responding gene. These terms are like Russian dolls (a gene has one or more promoters, each of which must have one or more TSS).

RLovering commented 5 years ago

comments from Philipp Bucher

Hi all,

Here is another suggestion for a promoter definition:

A regulatory_region including the Transcription Start Site (TSS) of a gene and serving as a platform for Pre-Initiation Complex (PIC) assembly.

The addition "of a gene" serves to exclude transcribed enhancers which otherwise would be included in the definition.

Additional comments:

RLovering commented 5 years ago

Comments from Colin Logie in response to Philipp promoter definition above

P E R F E C T. I like the addition of 'gene' to the definition. The same was achieved by stipulating that a TSS is the start site for an mRNA (which codes for a protein and is therefore part of a gene).

RLovering commented 5 years ago

Sent to group 5 Feb 2019: The promoter definition proposed by Philipp is for eukaryotes not for bacteria. SO does have eukaryotic and bacterial SO terms eg: SO:0000614 bacterial_terminator and SO:0000951 eukaryotic_terminator, so I think this would be possible.

If we use the definition as proposed by Philipp, this should be for a new child term to 'promoter' (eukaryotic_promoter), and we may need to create the new child term 'bacterial_promoter'. However, if we create this distinction, then we will have to make sure that any child term that is present in both eukaryotes and bacteria has only the species-neutral parent. I think we should see how useful it would be to have these eukaryotic and bacterial child terms.

We still need to create the species neutral definition for promoter.

Based on Philipp's definition could we have: A regulatory_region including the Transcription Start Site (TSS) of a gene and, in eukaryotes, serving as a platform for Pre-Initiation Complex (PIC) assembly. This is the region recognized and bound by an RNA polymerase.
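The reparenting rule described in this comment (a child term found in both eukaryotes and bacteria keeps only the species-neutral parent) can be sketched as a small check. This is a minimal illustration, not part of the SO tooling: the function name `choose_parent`, the taxon labels, and the example element names are all hypothetical, and the two child term names are the ones proposed in this thread rather than released SO terms.

```python
# Hypothetical helper illustrating the rule proposed in this thread.
# Term names follow the proposal; no SO identifiers are assumed for the new terms.
GENERIC = "promoter"
EUKARYOTIC = "eukaryotic_promoter"
BACTERIAL = "bacterial_promoter"


def choose_parent(taxa):
    """Return the promoter term a child term should sit under.

    taxa: the groups in which the element occurs, drawn from
    {"eukaryotes", "bacteria"}.
    """
    taxa = set(taxa)
    if taxa == {"eukaryotes"}:
        return EUKARYOTIC
    if taxa == {"bacteria"}:
        return BACTERIAL
    # Present in both groups (or unknown): keep only the species-neutral parent.
    return GENERIC


# Illustrative inputs only; these element names and taxon sets are placeholders.
examples = {
    "element_seen_only_in_eukaryotes": {"eukaryotes"},
    "element_seen_only_in_bacteria": {"bacteria"},
    "element_seen_in_both": {"eukaryotes", "bacteria"},
}
for name, taxa in examples.items():
    print(name, "->", choose_parent(taxa))
```

Falling back to the generic term for anything not clearly taxon-specific is a design assumption here; it avoids a child term ever having both a taxon-specific and a species-neutral parent.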

RLovering commented 5 years ago

summary of email in response to the above to 13 Feb

Astrid:

Your suggestion looks good to me

Val:

These are the open tracker tickets that mention "promoter". Some of these might now be out of date so I would look at Ruth's newer ones first. https://github.com/The-Sequence-Ontology/SO-Ontologies/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+promoter

Ruth:

SO:0000167 promoter. Definition: A regulatory_region including the Transcription Start Site (TSS) of a gene and, in eukaryotes, serving as a platform for Pre-Initiation Complex (PIC) assembly. This is the region recognized and bound by an RNA polymerase.

2 new is_a child terms:

eukaryotic promoter. Definition: A regulatory_region including the Transcription Start Site (TSS) of a gene and serving as a platform for Pre-Initiation Complex (PIC) assembly. This is the region recognized and bound by an RNA polymerase.

bacterial promoter. Definition: A regulatory_region including the Transcription Start Site (TSS) of a gene. This is the region recognized and bound by an RNA polymerase.

Or can anyone propose any improvements to the bacterial definition?

Please confirm which proposal you prefer: 3 promoter terms (as listed above) or 1 promoter term (to include all child terms for both eukaryotic and bacterial promoters, ie no eukaryotic and no bacterial specific promoter term).

Karen:

I think for me, the most logical solution is to have a generic promoter term that can be the holder of the parts that are common to both. Then we can have specialized bacterial and eukaryotic promoters that hold the parts that are particular to them. It means more work, but would stop confusion.
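The three-term layout favoured here (one generic term plus two specialized children) can be sketched as a set of is_a links. This is only an illustration under stated assumptions: the child term names are the ones proposed in this thread, the link from promoter to regulatory_region is inferred from the wording of the definition (the real ontology may place intermediate terms between them), and the `ancestors` helper is hypothetical.

```python
# Sketch of the proposed three-term layout.
# The promoter -> regulatory_region link is an assumption based on the
# definition wording; the actual ontology may have intermediate terms.
IS_A = {
    "promoter": "regulatory_region",
    "eukaryotic_promoter": "promoter",
    "bacterial_promoter": "promoter",
}


def ancestors(term, is_a=IS_A):
    """Walk is_a links upward and return the ancestors, nearest first."""
    chain = []
    while term in is_a:
        term = is_a[term]
        chain.append(term)
    return chain


print(ancestors("eukaryotic_promoter"))
print(ancestors("bacterial_promoter"))
```

Parts common to both kinds of promoter would attach to the generic term, so anything annotated to either child is still found by a query on promoter.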

RLovering commented 5 years ago

Hi Karen If you don’t mind doing this that would be great. If the bacterial promoter ends up with only 1 child term we might want to review whether it is needed, but it doesn’t sound like this is going to be a problem.

So unless anyone objects it sounds like this is the first decision made, although I would like to see the bacterial promoter definition improved, but I think it works as a start.

Ruth

RLovering commented 5 years ago

Emails 13-14 Feb 2019

Citlalli:

I have almost arrived at the conclusion that a species-independent promoter definition can be proposed. However, I still have a doubt regarding eukaryotic biology. In bacteria and archaea, the promoter must be bound by the RNAP-holoenzyme. My question is whether the RNAP-holoenzyme is an equivalent concept in eukaryotes and prokaryotes. In prokaryotes, I define the holoenzyme as the complex of proteins that has three molecular capabilities: promoter recognition, promoter melting, and phosphodiester bond catalytic activity. So the question is whether the eukaryotic RNAP-holoenzyme is a protein complex that can recognize the promoter, open the double helix, and catalyze bond formation.

Philipp:

If we keep the general definition, it should probably be modified as follows:

SO:0000167 promoter. Definition: A regulatory_region including the Transcription Start Site (TSS) of a gene and, in eukaryotes, serving as a platform for Pre-Initiation Complex (PIC) assembly, in bacteria, being the region recognized and bound by an RNA polymerase.

It is highly problematic to say that RNA polymerase recognizes a eukaryotic promoter region, because the word "recognize" suggests an active role in the promoter selection process, a role that is commonly attributed to basal transcription factors in eukaryotes. A more general definition could be: "A regulatory region enabling transcription of a gene under certain conditions and including the transcription start site".

RLovering commented 5 years ago

Sent 14 Feb 2019

Based on Citlalli's comments I have replaced 'bacteria' with 'prokaryotes'. Will these definitions be correct for all prokaryotes? I am putting these emails on the GitHub ticket so that they are recorded somewhere people can track the discussion without having to find the emails.

https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/457

I am fairly sure that the promoter region we are defining will not include the region bound by a 'DNA binding transcription factor', and therefore I prefer the more specific definition Philipp suggested, but I have added a statement from his more general definition: 'enabling transcription of a gene under certain conditions'.

SO:0000167 promoter. Definition: A regulatory_region including the Transcription Start Site (TSS) of a gene enabling transcription of a gene under certain conditions. In eukaryotes this region serves as a platform for Pre-Initiation Complex (PIC) assembly; in prokaryotes, this region is recognized and bound by an RNA polymerase.

I have put the various versions of this definition in the Google spreadsheet and put the current suggested definition in the 'current definition' column: https://docs.google.com/spreadsheets/d/1ThtI8cmDAThDHNKMANgf9yhAba1ALMlz8Sqm6dXpqZo/edit?usp=sharing

Based on Philipp's and Citlalli's comments, would the following child term definitions be better?

Term: eukaryotic promoter Definition: A regulatory_region including the Transcription Start Site (TSS) of a gene and serving as a platform for Pre-Initiation Complex (PIC) assembly, enabling transcription of a gene under certain conditions.

Term: prokaryotic promoter Definition: A regulatory_region including the Transcription Start Site (TSS) of a gene. This is the region recognized and bound by the RNA polymerase (RNAP)-holoenzyme, enabling transcription of a gene under certain conditions.

I am not an expert in this area, please suggest modifications to these definitions, or confirm that these are acceptable.

Thanks

Ruth

RLovering commented 5 years ago

Note discussion on enhancers https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/461

RLovering commented 5 years ago

I have discussed this with Pascale and others and want to point out that the promoter definition as it stands is more like the core promoter definition.

It seems that many dbTFs bind specific sites that are located in the proximal promoter but can also be found in the enhancer. It would therefore be useful to put all the specific dbTF sites under the term TF_binding_site rather than as child terms to either enhancer or promoter.

see new ticket https://github.com/The-Sequence-Ontology/SO-Ontologies/issues/466

RLovering commented 5 years ago

All of the existing RNApol_II_core_promoter child terms look correct (see below). Are there any sites/motifs that are missing from this list?

RLovering commented 5 years ago

From: DAVID W SANT

Hi Ruth,

I apologize that it has taken me so long to get back to this. I have not run all of these ideas by my group, but here is some information about how I propose we fix some of the issues in SO.

You have suggested that we create a core eukaryotic promoter and a core prokaryotic promoter. What we currently have for “core_promoter_element” fits eukaryotic promoter. I think we can simply move all of the child terms there into a new term such as “eukaryotic_core_promoter_element”. The current child terms are: BREu_motif, DCE, homol_D_box, BREd_motif, TATA_box, INR_motif, A_box, TCT_motif, intermediate_element, B_box, MTE, AACCCT_box, C_box, DPE_motif, GATA_box.

I have tried to search through publications and current SO definitions to determine what should be included in “prokaryotic_core_promoter”. The only terms that I think should be included are minus_10_signal (which has Pribnow box as a synonym) and minus_35_signal. From what I have found in publications, these are ubiquitous across prokaryotes. Does this sound correct? The only other ones that might be worth including would be minus_12_signal and minus_24_signal, which are children of the term “bacterial_RNA_polymerase_promoter_sigma54”.

What is your take on this? Has there been any more discussion of this among the GREEKC fellows?
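The grouping proposed in this email can be written out as a simple lookup. This is a sketch under stated assumptions, not the change that was actually made to SO: the parent names are the ones quoted in the email, minus_12_signal and minus_24_signal are held in a separate undecided list because the email only raises them as possibilities, and the `parent_lookup` helper is hypothetical.

```python
# Proposed grouping as described in the email; not the released SO structure.
PROPOSED_GROUPS = {
    "eukaryotic_core_promoter_element": [
        "BREu_motif", "DCE", "homol_D_box", "BREd_motif", "TATA_box",
        "INR_motif", "A_box", "TCT_motif", "intermediate_element", "B_box",
        "MTE", "AACCCT_box", "C_box", "DPE_motif", "GATA_box",
    ],
    "prokaryotic_core_promoter": ["minus_10_signal", "minus_35_signal"],
}

# Raised in the email as possible additions only, so kept out of the grouping.
UNDECIDED = ["minus_12_signal", "minus_24_signal"]


def parent_lookup(groups):
    """Invert parent -> children into child -> parent, rejecting duplicates."""
    lookup = {}
    for parent, children in groups.items():
        for child in children:
            if child in lookup:
                raise ValueError(f"{child} listed under two parents")
            lookup[child] = parent
    return lookup


lookup = parent_lookup(PROPOSED_GROUPS)
print(lookup["TATA_box"])
print(lookup["minus_10_signal"])
```

Raising an error on a duplicate child is a design choice that makes it obvious if a motif is accidentally filed under both the eukaryotic and the prokaryotic grouping.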

davidwsant commented 5 years ago

Hi Ruth,

The promoter terms have been updated with the conclusions reached by the group.

RLovering commented 5 years ago

This is great news, thanks for your help with this and all of the other tickets.

Ruth