geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

Merge 'GO:0000976 transcription regulatory region sequence-specific DNA binding' and children into corresponding terms in the 'transcription regulator activity' branch of the ontology #16130

Open pgaudet opened 6 years ago

pgaudet commented 6 years ago

Terms impacted:

'E-box binding'
'N-box binding'
'RNA polymerase I CORE element sequence-specific DNA binding'
'RNA polymerase I enhancer sequence-specific DNA binding'
'RNA polymerase I upstream control element sequence-specific DNA binding'
'RNA polymerase II core promoter sequence-specific DNA binding'
'RNA polymerase II distal enhancer sequence-specific DNA binding'
'RNA polymerase II intronic transcription regulatory region sequence-specific DNA binding'
'RNA polymerase II proximal promoter sequence-specific DNA binding'
'RNA polymerase II proximal promoter sequence-specific DNA binding, bending'
'RNA polymerase II regulatory region sequence-specific DNA binding'
'RNA polymerase III hybrid type promoter sequence-specific DNA binding'
'RNA polymerase III type 1 promoter sequence-specific DNA binding'
'RNA polymerase III type 2 promoter sequence-specific DNA binding'
'RNA polymerase III type 3 promoter sequence-specific DNA binding'
'bacterial-type RNA polymerase core promoter sequence-specific DNA binding'
'bacterial-type RNA polymerase enhancer sequence-specific DNA binding'
'bacterial-type RNA polymerase regulatory region sequence-specific DNA binding'
'bacterial-type RNA polymerase termination site sequence-specific DNA binding'
'bacterial-type proximal promoter sequence-specific DNA binding'
'cAMP response element binding'
'carbohydrate response element binding'
'core promoter sequence-specific DNA binding'
'enhancer sequence-specific DNA binding'
'estrogen response element binding'
'juvenile hormone response element binding'
'mitochondrial RNA polymerase termination site sequence-specific DNA binding'
'polymerase III regulatory region sequence-specific DNA binding'
'proximal promoter sequence-specific DNA binding'
'retinoic acid-responsive element binding'
'serum response element binding'
'sterol response element binding'
'transcription termination site sequence-specific DNA binding'
'vitamin D response element binding'

pgaudet commented 6 years ago

Most are in Sequence Ontology; @tonysawfordebi will add SO soon in P2GO

Next:

RLovering commented 6 years ago

SO terms relevant to the following:

GO:0000980 RNA polymerase II distal enhancer sequence-specific DNA binding (occurs_at SO_0000165 (enhancer))
GO:0001162 RNA polymerase II intronic transcription regulatory region sequence-specific DNA binding (occurs_at SO_0000188 (intron))
GO:0000978 RNA polymerase II proximal promoter sequence-specific DNA binding (occurs_at SO_0001952 (promoter_flanking_region))

GO:0070888 E-box binding (occurs_at SO_0001158 (E_box_motif), occurs_at SO_0001952 (promoter_flanking_region))
GO:0044323 retinoic acid-responsive element binding (occurs_at SO_0001653 (retinoic_acid_responsive_element) [probably also], occurs_at SO_0001952 (promoter_flanking_region))
GO:0035538 carbohydrate response element binding (to be looked into as no obvious SO ID [probably also], occurs_at SO_0001952 (promoter_flanking_region))
GO:0070644 vitamin D response element binding (to be looked into as no obvious SO ID [probably also], occurs_at SO_0001952 (promoter_flanking_region))
GO:0071820 N-box binding (to be looked into as no obvious SO ID [probably also], occurs_at SO_0001952 (promoter_flanking_region))
GO:0070594 juvenile hormone response element binding (to be looked into as no obvious SO ID [probably also], occurs_at SO_0001952 (promoter_flanking_region))
GO:0044377 RNA polymerase II proximal promoter sequence-specific DNA binding, bending (occurs_at SO_0001952 (promoter_flanking_region))
GO:0035497 cAMP response element binding (to be looked into as no obvious SO ID [probably also], occurs_at SO_0001952 (promoter_flanking_region))
GO:0034056 estrogen response element binding (to be looked into as no obvious SO ID [probably also], occurs_at SO_0001952 (promoter_flanking_region))
GO:0032810 sterol response element binding (occurs_at SO_0001861 (sterol_regulatory_element), occurs_at SO_0001952 (promoter_flanking_region))
GO:0010736 serum response element binding (to be looked into as no obvious SO ID [probably also], occurs_at SO_0001952 (promoter_flanking_region))
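For reference, the proposed mappings in the comment above can be held in a small lookup table, which makes it easy to see which GO terms still lack a motif-specific SO ID. This is only a minimal Python sketch: the names `PROPOSED` and `terms_needing_so_id` are hypothetical, the SO IDs are written with a colon rather than the underscore used in the comment, and the `needs_so_id` flag simply records the "to be looked into" notes in that list. It does not query GO, SO, or Protein2GO.

```python
# Proposed occurs_at extensions, transcribed from the list in this thread.
# Each entry: GO ID -> (SO IDs proposed so far, flagged "no obvious SO ID"?)
PROMOTER_FLANKING = "SO:0001952"  # promoter_flanking_region

PROPOSED = {
    "GO:0000980": (["SO:0000165"], False),                     # enhancer
    "GO:0001162": (["SO:0000188"], False),                     # intron
    "GO:0000978": ([PROMOTER_FLANKING], False),
    "GO:0070888": (["SO:0001158", PROMOTER_FLANKING], False),  # E_box_motif
    "GO:0044323": (["SO:0001653", PROMOTER_FLANKING], False),  # retinoic_acid_responsive_element
    "GO:0035538": ([PROMOTER_FLANKING], True),
    "GO:0070644": ([PROMOTER_FLANKING], True),
    "GO:0071820": ([PROMOTER_FLANKING], True),
    "GO:0070594": ([PROMOTER_FLANKING], True),
    "GO:0044377": ([PROMOTER_FLANKING], False),
    "GO:0035497": ([PROMOTER_FLANKING], True),
    "GO:0034056": ([PROMOTER_FLANKING], True),
    "GO:0032810": (["SO:0001861", PROMOTER_FLANKING], False),  # sterol_regulatory_element
    "GO:0010736": ([PROMOTER_FLANKING], True),
}


def terms_needing_so_id(proposed):
    """Return the GO IDs flagged as having no obvious motif-specific SO ID."""
    return sorted(go_id for go_id, (_, needs_so_id) in proposed.items() if needs_so_id)


if __name__ == "__main__":
    for go_id in terms_needing_so_id(PROPOSED):
        print(go_id)
```

Keeping the flag separate from the SO ID list avoids guessing: GO:0000978 and GO:0044377 also carry only promoter_flanking_region, but they are not marked as needing a new SO term in the list.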

pgaudet commented 6 years ago

@RLovering Why not use 'SO:0000235 TF_binding_site': A region of a nucleotide molecule that binds a Transcription Factor or Transcription Factor complex [GO:0005667].

rather than 'SO_0001952 promoter_flanking_region': A region immediately adjacent to a promoter which may or may not contain transcription factor binding sites.

Thanks, Pascale

tonysawfordebi commented 6 years ago

Tagging @alexsign so he's aware of this thread too...

pgaudet commented 5 years ago

Hello @tonysawfordebi @alexsign @RLovering What is the status of this?

RLovering commented 5 years ago

Hi Pascale

I hadn't appreciated that 'SO:0000235 TF_binding_site' is available. Are you suggesting we just use that for all locations, i.e. promoters and enhancers?

Ruth

pgaudet commented 5 years ago

Yes - these sites can be in both promoters and enhancers. So you could capture both: enhancer binding and TF_binding_site.

RLovering commented 5 years ago

OK, so the suggestion is to add both IDs. I guess it would be too much to request SO terms that combine these, i.e. TF_binding_site in promoter_flanking_region?

I am not sure if Tony emailed all the groups asking if they would be happy to have the AE equivalent terms added to their annotations before the merge takes place.

Any thoughts Tony (!!) or Alex?

Ruth

RLovering commented 5 years ago

I am revising the suggestions in the long list above, but it occurs to me that SO:0000235 TF_binding_site should really be a parent of the following terms, in which case adding this ID to each of them would not be necessary.

I will put in the request and not add it to those above which should have SO:0000235 TF_binding_site as a parent.

Ruth

tonysawfordebi commented 5 years ago

I haven't sent any emails about this proposal, only the one from https://github.com/geneontology/go-ontology/issues/16152

RLovering commented 5 years ago

Good. Looking at this again, I don't think this can be done until there is a review of the SO ontology structure. I am not convinced that all the motifs listed under 'SO:0000167: promoter' should be listed here, e.g. SO_0001653 (retinoic_acid_responsive_element), and if they are in the promoter, why aren't they child terms of SO:0000170 RNApol_II_promoter?

Currently the 3 SO IDs and their parent terms that we would use are:

SO:0000170 RNApol_II_promoter
is_a child SO_0001158 (E_box_motif)

SO:0000167: promoter
is_a child SO_0001653 (retinoic_acid_responsive_element)

SO:0001659 promoter element
is_a child SO_0001861 (sterol_regulatory_element)

Maybe this is all correct, but it looks very odd to me. Maybe the RNA pol promoter can't be used for the elements that exist in bacteria, and this is why there is a difference, but is that the case for all of these? Also, are these motifs and elements really all located in the promoter?

The SO definition of the promoter is: "A regulatory_region composed of the TSS(s) and binding sites for TF_complexes of the basal transcription machinery." Which is fine, but if this is the case, then what is the difference between a binding site in the promoter and a binding site in SO:0001952 promoter_flanking_region (definition: "A region immediately adjacent to a promoter which may or may not contain transcription factor binding sites.")? And how is this different from SO:0000165 enhancer (definition: "A cis-acting sequence that increases the utilization of (some) eukaryotic promoters, and can function in either orientation and in any location (upstream or downstream) relative to the promoter.")?

Plus, the majority of the SO terms we would want do not exist (see list above), so I would like these motif/element terms to be available before we undertake the merge.

To be honest I think we need to have a discussion with dbTF experts and work out what the ontology should be, before we merge all these terms.

Ruth
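The three parent/child pairs quoted in the comment above can be written as a tiny graph to illustrate the earlier point about SO:0000235 TF_binding_site: with only those links, none of the three motif terms has it as an ancestor, so it would have to be added to each annotation separately unless the parentage is requested in SO. This is a minimal Python sketch under stated assumptions: `PARENTS` contains only the links quoted in this thread (the real Sequence Ontology graph has many more), and `ancestors`, `lacking_ancestor`, and `WITH_REQUEST` are hypothetical names used for illustration.

```python
# Parent links exactly as quoted in this thread (child -> list of parents).
# Assumption: nothing else from the Sequence Ontology is loaded here.
PARENTS = {
    "SO:0001158": ["SO:0000170"],  # E_box_motif -> RNApol_II_promoter
    "SO:0001653": ["SO:0000167"],  # retinoic_acid_responsive_element -> promoter
    "SO:0001861": ["SO:0001659"],  # sterol_regulatory_element -> promoter element
}

TF_BINDING_SITE = "SO:0000235"


def ancestors(term, parents):
    """Return every term reachable from `term` by following parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def lacking_ancestor(terms, target, parents):
    """Return the terms that do not have `target` among their ancestors."""
    return sorted(t for t in terms if target not in ancestors(t, parents))


# Hypothetical graph if the requested TF_binding_site parentage were added.
WITH_REQUEST = {child: links + [TF_BINDING_SITE] for child, links in PARENTS.items()}

if __name__ == "__main__":
    print(lacking_ancestor(PARENTS, TF_BINDING_SITE, PARENTS))
    print(lacking_ancestor(PARENTS, TF_BINDING_SITE, WITH_REQUEST))
```

The same check could be run against a full copy of the ontology once the structure has been reviewed; the sketch only shows the shape of the question being asked.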

pgaudet commented 5 years ago

thanks Ruth

ukemi commented 5 years ago

I haven't been following this thread closely, but in the classic molecular biology world the promoter was the place where RNA polymerase bound. I think that definition has gotten a bit more general in the literature.

mikebada commented 5 years ago

@RLovering We'd be happy to coordinate the structuring of SO/MSO classes needed for these GO binding classes; just let us know when you're ready.