Open pgaudet opened 6 years ago
Most are in Sequence Ontology; @tonysawfordebi will add SO soon in P2GO
Next:
SO terms relevant to the following:
GO:0000980 RNA polymerase II distal enhancer sequence-specific DNA binding (occurs_at SO_0000165 (enhancer))
GO:0001162 RNA polymerase II intronic transcription regulatory region sequence-specific DNA binding (occurs_at SO_0000188 (intron))
GO:0000978 RNA polymerase II proximal promoter sequence-specific DNA binding (occurs_at SO_0001952 (promoter_flanking_region))
GO:0070888 E-box binding (occurs_at SO_0001158 (E_box_motif), occurs_at SO_0001952 (promoter_flanking_region))
GO:0044323 retinoic acid-responsive element binding (occurs_at SO_0001653 (retinoic_acid_responsive_element) [probably also], occurs_at SO_0001952 (promoter_flanking_region))
GO:0035538 carbohydrate response element binding (to be looked into as no obvious SO ID [probably also], occurs_at SO_0001952 (promoter_flanking_region))
GO:0070644 vitamin D response element binding (to be looked into as no obvious SO ID [probably also], occurs_at SO_0001952 (promoter_flanking_region))
GO:0071820 N-box binding (to be looked into as no obvious SO ID [probably also], occurs_at SO_0001952 (promoter_flanking_region))
GO:0070594 juvenile hormone response element binding (to be looked into as no obvious SO ID [probably also], occurs_at SO_0001952 (promoter_flanking_region))
GO:0044377 RNA polymerase II proximal promoter sequence-specific DNA binding, bending (occurs_at SO_0001952 (promoter_flanking_region))
GO:0035497 cAMP response element binding (to be looked into as no obvious SO ID [probably also], occurs_at SO_0001952 (promoter_flanking_region))
GO:0034056 estrogen response element binding (to be looked into as no obvious SO ID [probably also], occurs_at SO_0001952 (promoter_flanking_region))
GO:0032810 sterol response element binding (occurs_at SO_0001861 (sterol_regulatory_element), occurs_at SO_0001952 (promoter_flanking_region))
GO:0010736 serum response element binding (to be looked into as no obvious SO ID [probably also], occurs_at SO_0001952 (promoter_flanking_region))
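For anyone wanting to track this list programmatically, here is a rough Python sketch of the motif-binding terms above as a lookup table. It is only an illustration: the names `PROPOSED`, `missing_so_terms` and `extensions` are made up for this example, the SO IDs are written with a colon rather than an underscore, and `None` stands for "no obvious SO ID yet".

```python
# Proposed occurs_at extensions for the motif-binding GO terms listed above.
# Each entry maps a GO ID to (GO label, motif-specific SO ID).
# None means no obvious SO ID was found and the term still needs looking into.
PROMOTER_FLANKING = "SO:0001952"  # promoter_flanking_region

PROPOSED = {
    "GO:0070888": ("E-box binding", "SO:0001158"),
    "GO:0044323": ("retinoic acid-responsive element binding", "SO:0001653"),
    "GO:0032810": ("sterol response element binding", "SO:0001861"),
    "GO:0035538": ("carbohydrate response element binding", None),
    "GO:0070644": ("vitamin D response element binding", None),
    "GO:0071820": ("N-box binding", None),
    "GO:0070594": ("juvenile hormone response element binding", None),
    "GO:0035497": ("cAMP response element binding", None),
    "GO:0034056": ("estrogen response element binding", None),
    "GO:0010736": ("serum response element binding", None),
}


def missing_so_terms(proposed):
    """Return the GO IDs whose motif has no SO term yet, sorted."""
    return sorted(go_id for go_id, (_, so_id) in proposed.items() if so_id is None)


def extensions(go_id, proposed):
    """Build the occurs_at extensions suggested in the list for one GO term."""
    _, motif = proposed[go_id]
    so_ids = [motif] if motif else []
    so_ids.append(PROMOTER_FLANKING)  # every motif term also gets the flanking region
    return [f"occurs_at({so_id})" for so_id in so_ids]


if __name__ == "__main__":
    for go_id in missing_so_terms(PROPOSED):
        print(go_id, PROPOSED[go_id][0], "needs an SO term")
```

Calling `missing_so_terms(PROPOSED)` gives the terms that would need new SO requests before the annotation extensions could be completed.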
@RLovering Why not use 'SO:0000235 TF_binding_site': A region of a nucleotide molecule that binds a Transcription Factor or Transcription Factor complex [GO:0005667].
rather than 'SO_0001952 promoter_flanking_region': A region immediately adjacent to a promoter which may or may not contain transcription factor binding sites.
Thanks, Pascale
Tagging @alexsign so he's aware of this thread too...
Hello @tonysawfordebi @alexsign @RLovering What is the status of this?
Hi Pascale
I hadn't appreciated that 'SO:0000235 TF_binding_site' is available. Are you suggesting we just use that for all locations, i.e. both promoters and enhancers?
Ruth
Yes - these sites can be in both promoters and enhancers. So you could capture both: enhancer binding and TF_binding_site.
OK, so the suggestion is to add both IDs. I guess it would be too much to request SO terms that combine these, i.e. TF_binding_site in promoter_flanking_region?
I am not sure if Tony emailed all the groups asking if they would be happy to have the AE equivalent terms added to their annotations before the merge takes place.
Any thoughts Tony (!!) or Alex?
Ruth
I am revising the suggestions in the long list above, but it occurs to me that SO:0000235 TF_binding_site should really be a parent of the following terms, in which case adding this ID to each of them would not be necessary.
I will put in the request and will not add the ID to those terms above that should have SO:0000235 TF_binding_site as a parent.
Ruth
I haven't sent any emails about this proposal, only the one from https://github.com/geneontology/go-ontology/issues/16152
Good. Looking at this again, I don't think this can be done until there is a review of the SO ontology structure. I am not convinced that all the motifs listed under 'SO:0000167: promoter' should be listed there, e.g. SO_0001653 (retinoic_acid_responsive_element), and if they are in the promoter, why aren't they child terms of SO:0000170 RNApol_II_promoter?
Currently the 3 SO IDs and their parent terms that we would use are:
SO:0000170 RNApol_II_promoter
is_a child SO_0001158 (E_box_motif)
SO:0000167: promoter
is_a child SO_0001653 (retinoic_acid_responsive_element)
SO:0001659 promoter element
is_a child SO_0001861 (sterol_regulatory_element)
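To make the parentage point concrete, here is a small Python sketch that walks is_a edges and checks whether SO:0000235 TF_binding_site is an ancestor of each motif term. It is an assumption-laden illustration: `IS_A` contains only the three edges quoted in this comment, not the real SO graph, and the function names are invented for the example.

```python
# is_a edges as quoted in this thread only (child -> list of parents).
# The real SO graph has more edges; this is an illustration, not a snapshot of SO.
IS_A = {
    "SO:0001158": ["SO:0000170"],  # E_box_motif -> RNApol_II_promoter
    "SO:0001653": ["SO:0000167"],  # retinoic_acid_responsive_element -> promoter
    "SO:0001861": ["SO:0001659"],  # sterol_regulatory_element -> promoter element
}

TF_BINDING_SITE = "SO:0000235"


def ancestors(term, is_a):
    """Collect every term reachable from `term` by following is_a edges."""
    seen = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, []))
    return seen


def needs_explicit_tfbs(term, is_a, tfbs=TF_BINDING_SITE):
    """True if TF_binding_site is not already implied by the term's ancestry."""
    return tfbs not in ancestors(term, is_a)


if __name__ == "__main__":
    for motif in IS_A:
        print(motif, "needs explicit TF_binding_site:", needs_explicit_tfbs(motif, IS_A))
```

With only the edges above, all three motif terms would still need SO:0000235 added explicitly; if SO made TF_binding_site a parent, the check would return False and the extra ID could be dropped.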
Maybe this is all correct but it looks very odd to me. Maybe the RNA pol promoter can't be used for the elements that exist in bacteria and this is why there is a difference, but is that the case for all of these? Also, are these motifs and elements really all located in the promoter?
The SO definition of the promoter is: A regulatory_region composed of the TSS(s) and binding sites for TF_complexes of the basal transcription machinery.
Which is fine, but if this is the case then what is the difference between a binding site in the promoter and a binding site in the SO:0001952 promoter_flanking_region, definition: A region immediately adjacent to a promoter which may or may not contain transcription factor binding sites.
And how is this different from SO:0000165 enhancer, definition: A cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter.
Plus, the majority of the SO terms we would want do not exist (see list above). So I would like these motif/element terms to be available before we undertake the merge.
To be honest I think we need to have a discussion with dbTF experts and work out what the ontology should be, before we merge all these terms.
Ruth
thanks Ruth
I haven't been following this thread closely, but in the classic molecular biology world the promoter was the place where RNA polymerase bound. I think that definition has gotten a bit more general in the literature.
@RLovering We'd be happy to coordinate the structuring of SO/MSO classes needed for these GO binding classes; just let us know when you're ready.
Terms impacted:
'E-box binding'
'N-box binding'
'RNA polymerase I CORE element sequence-specific DNA binding'
'RNA polymerase I enhancer sequence-specific DNA binding'
'RNA polymerase I upstream control element sequence-specific DNA binding'
'RNA polymerase II core promoter sequence-specific DNA binding'
'RNA polymerase II distal enhancer sequence-specific DNA binding'
'RNA polymerase II intronic transcription regulatory region sequence-specific DNA binding'
'RNA polymerase II proximal promoter sequence-specific DNA binding'
'RNA polymerase II proximal promoter sequence-specific DNA binding, bending'
'RNA polymerase II regulatory region sequence-specific DNA binding'
'RNA polymerase III hybrid type promoter sequence-specific DNA binding'
'RNA polymerase III type 1 promoter sequence-specific DNA binding'
'RNA polymerase III type 2 promoter sequence-specific DNA binding'
'RNA polymerase III type 3 promoter sequence-specific DNA binding'
'bacterial-type RNA polymerase core promoter sequence-specific DNA binding'
'bacterial-type RNA polymerase enhancer sequence-specific DNA binding'
'bacterial-type RNA polymerase regulatory region sequence-specific DNA binding'
'bacterial-type RNA polymerase termination site sequence-specific DNA binding'
'bacterial-type proximal promoter sequence-specific DNA binding'
'cAMP response element binding'
'carbohydrate response element binding'
'core promoter sequence-specific DNA binding'
'enhancer sequence-specific DNA binding'
'estrogen response element binding'
'juvenile hormone response element binding'
'mitochondrial RNA polymerase termination site sequence-specific DNA binding'
'polymerase III regulatory region sequence-specific DNA binding'
'proximal promoter sequence-specific DNA binding'
'retinoic acid-responsive element binding'
'serum response element binding'
'sterol response element binding'
'transcription termination site sequence-specific DNA binding'
'vitamin D response element binding'