geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

general and specific transcription #3651

Closed gocentral closed 9 years ago

gocentral commented 18 years ago

I need to be able to distinguish between specific and general transcription at the process level. I think GO should make a distinction between these (it doesn't currently). I am sure users would expect to be able to retrieve these sets separately. I can do the query but I need to get the transcription total and then subtract all of the complexes which contain the core transcriptional machinery which is really fiddly.
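
The workaround described here amounts to a set difference. A minimal sketch, assuming the gene-to-term annotations have already been loaded into Python sets; the gene identifiers and both set names are invented for illustration and do not come from any real annotation file.

```python
# Hypothetical gene sets; the identifiers are placeholders, not real annotations.
all_transcription = {"geneA", "geneB", "geneC", "geneD", "geneE"}

# Genes in complexes that contain the core transcriptional machinery.
core_machinery = {"geneA", "geneB"}

# "Specific" transcription is what remains once the core machinery is removed.
specific_transcription = all_transcription - core_machinery

# "General" transcription is the part of the total that is core machinery.
general_transcription = all_transcription & core_machinery

print(sorted(specific_transcription))
print(sorted(general_transcription))
```

With dedicated general and specific terms in the ontology, neither subtraction would be needed; each set could be retrieved directly by term.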

Reported by: ValWood

Original Ticket: "geneontology/ontology-requests/3666":https://sourceforge.net/p/geneontology/ontology-requests/3666

gocentral commented 17 years ago

Logged In: YES user_id=473890

Hi Val,

I would certainly agree with you that when I worked in a transcription lab, people made the distinction between general and specific transcription factors for RNA polymerase II. I don't think the distinction is used for any of the other polymerases though, so any incorporation of this into GO process would need to stay in the Pol II specific terms.

-Karen

Original comment by: krchristie

gocentral commented 17 years ago

Logged In: YES user_id=516865

Yes I agree it is pol II specific

Something like:

transcription from RNA polymerase II promoter
--specific transcription from RNA pol II promoter
--general transcription from RNA pol II promoter
----RNA elongation from RNA polymerase II promoter
----transcription initiation from RNA polymerase II promoter
----transcription termination from RNA polymerase II promoter

It looks like many of the terms, such as 'regulation of transcription from RNA polymerase II promoter, mitotic', should go under a 'specific transcription' term.

It is really frustrating for microarray groups and other groups not to be able to differentiate between core transcription and specific transcription, so this should probably be a 'high' priority SF item, if there is such a thing.

Original comment by: ValWood

gocentral commented 17 years ago

Logged In: YES user_id=436423

ok, priority up :)

Karen, do you want to implement this? If not, I can.

m

Original comment by: mah11

gocentral commented 17 years ago

Original comment by: mah11

gocentral commented 17 years ago

Logged In: YES user_id=473890

Hi Midori,

I certainly don't feel proprietary about who implements this, but I'd like to think a little more about whether initiation should be under general transcription. I have vague thoughts that some of the specific transcription factors may also affect initiation, so at the moment I'm not completely sure that we want to put initiation under general txn. And sorry, I won't really take a look at this today. I'm home sick, but just checked in quickly to see if there was anything pressing I should respond to.

-Karen

Original comment by: krchristie

gocentral commented 17 years ago

Logged In: YES user_id=516865

I have an additional suggestion,

we have a term GO:0042789 : mRNA transcription from RNA polymerase II promoter (this only has 9 annotations).

If there is no biological distinction between the core Pol II machinery that transcribes mRNAs and snRNAs, I think this should be merged into the parent term 'transcription from RNA polymerase II promoter' and made a narrower-than synonym of the parent term.

It should also be added as a broader-than synonym to the new term 'specific transcription from RNA pol II promoter'.

Karen would that work?

(it seems to me that there is no biological reason for having the mRNA transcription term and that the biological difference is between the 'core' machinery and the specific regulators and enhancers. Or have I overlooked something?)
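
A minimal sketch of what the proposed merge would involve, assuming a simple in-memory term record. The `Term` class and `merge_into_parent` function are hypothetical and not part of any GO editing tool; recording the retired ID as an alternate ID is an assumption about how the merge would be tracked, and the gene names are placeholders.

```python
from dataclasses import dataclass, field

@dataclass
class Term:
    term_id: str
    name: str
    alt_ids: list = field(default_factory=list)
    synonyms: list = field(default_factory=list)  # (text, scope) pairs
    annotations: set = field(default_factory=set)

def merge_into_parent(child: Term, parent: Term) -> Term:
    """Fold `child` into `parent`, keeping the child's name as a NARROW synonym."""
    parent.alt_ids.append(child.term_id)            # assumed bookkeeping for the old ID
    parent.synonyms.append((child.name, "NARROW"))  # child name is narrower than parent
    parent.annotations |= child.annotations         # annotations move up to the parent
    return parent

parent = Term(
    "GO:0006366",
    "transcription from RNA polymerase II promoter",
    annotations={"gene1", "gene2"},  # placeholder genes
)
child = Term(
    "GO:0042789",
    "mRNA transcription from RNA polymerase II promoter",
    annotations={"gene2", "gene3"},  # placeholder genes
)

merged = merge_into_parent(child, parent)
print(merged.synonyms)
print(sorted(merged.annotations))
```

The point of the synonym scope is that a search for the old name still finds the parent term while signalling that the old name covered less than the parent does.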

Val

Original comment by: ValWood

gocentral commented 17 years ago

Logged In: YES user_id=436423

Karen - get well soon! I'm still healthy for the moment, but buried under administrivia, so we'll see which of us can pay attention to this first (I'll check with you off-SF and then assign to you or myself).

As for initiation, I can't think of any transcription that doesn't have to be initiated, so there should be an initiation term under the Pol II transcription parent, and general and specific terms as children, i.e. something like this (names negotiable!):

transcription from RNA polymerase II promoter
--transcription initiation from RNA polymerase II promoter
----general transcription initiation from RNA polymerase II promoter
----specific transcription initiation from RNA polymerase II promoter
--RNA elongation from RNA polymerase II promoter
--transcription termination from RNA polymerase II promoter
--specific transcription from RNA pol II promoter
----specific transcription initiation from RNA polymerase II promoter
--general transcription from RNA pol II promoter
----general transcription initiation from RNA polymerase II promoter

.... and we'll look at the definitions to see where the existing initiation terms go, which new ones to add, and whether any have to be renamed. Would the same go for elongation and termination (I've put them under the parent in the sketch above)?
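
In the sketch above, each of the two new initiation terms appears twice because it has two parents. A minimal illustration of that structure as a child-to-parents map, assuming nothing beyond the term names in the sketch (no IDs had been assigned, so names stand in for them; the `ancestors` helper is hypothetical):

```python
# Child term -> list of parent terms, transcribed from the sketch above.
POL2 = "transcription from RNA polymerase II promoter"
INITIATION = "transcription initiation from RNA polymerase II promoter"
GENERAL = "general transcription from RNA pol II promoter"
SPECIFIC = "specific transcription from RNA pol II promoter"
GENERAL_INIT = "general transcription initiation from RNA polymerase II promoter"
SPECIFIC_INIT = "specific transcription initiation from RNA polymerase II promoter"

PARENTS = {
    INITIATION: [POL2],
    "RNA elongation from RNA polymerase II promoter": [POL2],
    "transcription termination from RNA polymerase II promoter": [POL2],
    GENERAL: [POL2],
    SPECIFIC: [POL2],
    GENERAL_INIT: [INITIATION, GENERAL],    # two parents
    SPECIFIC_INIT: [INITIATION, SPECIFIC],  # two parents
}

def ancestors(term, parents=PARENTS):
    """Return every term reachable from `term` by following parent links upward."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen

print(sorted(ancestors(GENERAL_INIT)))
```

A gene annotated to 'general transcription initiation' would then be retrieved both by a query on initiation and by a query on general transcription, which is the behaviour the two-parent placement is meant to give.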

Val - merging those terms makes sense to me. OK with Karen?

Original comment by: mah11

gocentral commented 17 years ago

Logged In: YES user_id=473890 Originator: NO

Hi,

Sorry to take so long to get back to this. I was back at work last week, but swamped with some things with more proximal deadlines.

Regarding Val's suggestion to merge this term:
GO:0042789 : mRNA transcription from RNA polymerase II promoter
into its parent term:
GO:0006366 : transcription from RNA polymerase II promoter

I have personally never liked any of these terms, or any of their child terms:
tRNA transcription
snRNA transcription
mRNA transcription
rRNA transcription
snoRNA transcription

because I think that the real process going on is defined, as Val suggests, by what basic machinery is doing the transcription, not what label we put on the RNA that is produced. For example, most of the time in eukaryotes, mRNAs are produced by RNA pol II, but in organelles like the mitochondria or chloroplast, mRNAs are produced by the various organellar RNA polymerases. The bacterial enzyme, being the only RNA pol in the cell, makes all kinds of RNA: m-, t-, and r-. So, knowing that, what process specifically is being defined by the term 'mRNA transcription'?

Getting back to Val's suggestion, I agree with Val that there is nothing that really distinguishes the 42789 term from its parent 6366. Also, as far as I can tell, there's nothing particularly unique about the transcription of snoRNAs with respect to the transcription machinery itself.

However, unless we are also going to get rid of all of these sorts of terms that label RNA transcription by the type of product, I don't see the point. When looking at the children of 'transcription from RNA pol II promoter', they look silly, but when you look from 'mRNA transcription', they are more useful.

It might, however, be useful to have things like 'snoRNA production' terms because snoRNAs are produced in a variety of ways. Sometimes they are single gene transcription units, sometimes they are found in polycistronic transcription units, and sometimes they are processed out of the introns of mRNA transcripts for protein coding genes.

In summary, I don't really see any reason to merge just the one term that Val points out unless we are going to overhaul this in a comprehensive and consistent way, but I could certainly see reason to do an overhaul.

Regarding Midori's proposal, I think I like the idea. Regarding the question of whether we'd want to treat elongation in the same way that she suggested for initiation, I'm not sure whether there is need. I'm not even completely sure that there is need to do this for initiation. Perhaps I'll send a quick email to my thesis advisor to get some input from someone more up-to-date in the field.

I will send an email now, but I am on vacation for the rest of the week, so it will be Monday before I can comment further on this.

-Karen

Original comment by: krchristie

gocentral commented 17 years ago

Logged In: YES user_id=436423 Originator: NO

Hi Karen,

Has there been any further word on these changes? I'm not trying to nag; I'm just wondering whether I should go ahead and implement the 'general vs. specific' part of this item -- it seems that that much could be done without waiting for the discussion of any mRNA/rRNA/etc. overhaul.

m

Original comment by: mah11

gocentral commented 17 years ago

Logged In: YES user_id=473890 Originator: NO

Hi Midori,

Apologies, I've been swamped with stuff related to an SGD deadline and didn't get around to sending an email to my former advisor. I've just done that and will add any enlightenment I receive to this tracker item.

-Karen

Original comment by: krchristie

gocentral commented 17 years ago

Original comment by: mah11

gocentral commented 17 years ago

Logged In: YES user_id=436423 Originator: NO

For the time being, I've added just the two most urgently needed terms:

general transcription from RNA polymerase II promoter GO:0032568
specific transcription from RNA polymerase II promoter GO:0032569

I've made up definitions that can probably be improved -- let me know if you want me to make any changes.

I won't interfere with the rest of the work discussed in this item!

m

Original comment by: mah11

gocentral commented 17 years ago

Logged In: YES user_id=516865 Originator: YES

Thanks for doing this, Midori. I tried to look using AmiGO, but it isn't updated yet on the GO website, which is odd. I thought this happened daily?

Would it make sense to include a parent term 'general transcription', under which all transcription from Pol I and III, and 'general transcription from RNA polymerase II promoter', were grouped?

(This is what I am trying to break out: core RNA Pol I, II and III transcription from gene-specific transcription. The core machinery is conserved as 1:1 complexes in most eukaryotes, and the gene-specific stuff is larger families and varies between organisms, generally.)

Original comment by: ValWood

gocentral commented 17 years ago

Logged In: YES user_id=436423 Originator: NO

> I tried to look using AmiGO, but it isn't updated yet on the GO website
> which is odd. I thought this happened daily.?

One for the AmiGO wg, not me ... the terms are in the .obo flat file, so they'll show as soon as AmiGO is updated ... but at the mo' it's still saying 2007-08-30 ... I dunno why
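
For readers unfamiliar with the .obo flat file, a minimal sketch of how the two new terms could be read from it. The stanzas below are a reconstruction, not a copy of the file: definitions are omitted, and the `is_a` parent assumes both terms are direct children of GO:0006366 'transcription from RNA polymerase II promoter'. The `parse_terms` helper is hypothetical and handles only simple `tag: value` lines.

```python
# Reconstructed stanzas (assumption: definitions omitted, parent is GO:0006366).
OBO_TEXT = """\
[Term]
id: GO:0032568
name: general transcription from RNA polymerase II promoter
is_a: GO:0006366 ! transcription from RNA polymerase II promoter

[Term]
id: GO:0032569
name: specific transcription from RNA polymerase II promoter
is_a: GO:0006366 ! transcription from RNA polymerase II promoter
"""

def parse_terms(text):
    """Parse [Term] stanzas into dicts mapping each tag to a list of values."""
    terms = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "[Term]":
            current = {}
            terms.append(current)
        elif line.startswith("["):
            current = None  # some other stanza type; ignore its lines
        elif line and current is not None and ":" in line:
            tag, value = line.split(":", 1)
            value = value.split(" ! ", 1)[0].strip()  # drop the trailing comment
            current.setdefault(tag, []).append(value)
    return terms

for term in parse_terms(OBO_TEXT):
    print(term["id"][0], term["name"][0], "is_a", term["is_a"][0])
```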

> Would it make sense to include a parent general transcription ...?

One for Karen! (tho sorry to make you wait ..)

m

Original comment by: mah11

gocentral commented 17 years ago

Logged In: YES user_id=473890 Originator: NO

Hi,

Well, as I said in my first comment on this item, I'm currently only familiar with people making that distinction for Pol II txn, and not aware that anyone uses those terms for Pol I or Pol III txn. So, I'd vote no, not to make terms for general and specific txn at a level above Pol II right now. Of course, if it turns out that the Pol I and Pol III use those, or comparable terms, I'll be happy to include such terms in the general revision, but right now, I don't think they're really warranted.

-Karen

Original comment by: krchristie

gocentral commented 17 years ago

Logged In: YES user_id=516865 Originator: YES

I still think it's a problem for global analysis that there are no biological process terms to distinguish between general and specific transcription. Even if these aren't generally terms that the community uses, they seem to be biologically relevant groupings, especially for the analysis of genome-wide data (which is where I am running into the problem: I can see the patterns in my data, but I don't see the groupings using GO).

I did a quick Google and I found quite a number of refs referring to transcription from RNA pol III as general (I couldn't do the equivalent search for pol I as it got masked by pol II results).

but here are a few:

White RJ, Rigby PWJ and Jackson SP (1992) The TATA-binding protein is a general transcription factor for RNA polymerase III. J Cell Sci Suppl, 16, 1–7.

RNA polymerase III transcription from the human U6 and adenovirus type 2 VAI promoters has different requirements for human BRF, a subunit of human TFIIIB. Mol. Cell. Biol. (1996), Vol 16, No. 12, 7031-7042.

Hirsch, H. A., Gu, L., Henry, R. W. (2000). The Retinoblastoma Tumor Suppressor Protein Targets Distinct General Transcription Factors To Regulate RNA Polymerase III Gene Expression. Mol. Cell. Biol. 20: 9182-9191

E. Geiduschek, G. Kassavetis. Transcription: Adjusting to Adversity by Regulating RNA Polymerase. Current Biology, Volume 16, Issue 19, Pages R849-R851. "In the nucleus, hypophosphorylated Maf1 binds to RNA polymerase III via the ... is efficiently rescued by adding the pol III-specific general transcription ..."

I have also asked Jurg's group if this would be a useful grouping term, ...will let you know their responses (positive or negative or indifferent).

Karen, did you ever get any response from your thesis supervisor about this? Or wasn't this what you asked?

Sorry to keep going on about this, but I think it's quite an important high-level grouping that is needed to make sense of high-throughput and genome-wide data. If you really think it is wrong to do this for some reason I can drop it though.

Val

Original comment by: ValWood

gocentral commented 17 years ago

Logged In: YES user_id=516865 Originator: YES

On a related note, I am wondering if elongation and termination can go directly under general (i.e. it is only initiation which can be 'specific'), but I am not sure about this (it seems logical).

Also, in the def of the new term 0032568 it says that in Saccharomyces five transcription factors are necessary and sufficient for such basal transcription.
(Which are the 5? This will help me to do the remapping to the new term.)

Thanks

Original comment by: ValWood

gocentral commented 17 years ago

Logged In: YES user_id=473890 Originator: NO

Hi Val,

responses to your last 2 posts:

-Karen

> Date: 2007-09-04 03:13
> Sender: val_wood
> Logged In: YES
> user_id=516865
> Originator: YES
>
> I still think it's a problem for global analysis that there are no
> biological process terms to distinguish between general and specific
> transcription. Even if these aren't generally terms that the community uses,
> they seem to be biologically relevant groupings, especially for the
> analysis of genome-wide data (which is where I am running into the
> problem: I can see the patterns in my data, but I don't see the groupings
> using GO).

You may be right, but I am personally not comfortable adding terms at this level now without further reading.

> I did a quick Google and I found quite a number of refs referring to
> transcription from RNA pol III as general (I couldn't do the equivalent search
> for pol I as it got masked by pol II results).
>
> but here are a few:
>
> [snip]

Thanks for the Pol III refs, that will help when I start to look at this stuff.

One thing to consider about Pol I though is that, unlike both Pol II and Pol III, it doesn't transcribe lots of different genes, it transcribes the rDNA repeat, which I guess is still lots of separate transcripts, but they all have the same promoter. I do seem to recall coming across one exception where someone reported Pol I transcribing a protein coding gene in a trypanosome, if I recall correctly, but that's the only non rDNA thing I can think of transcribed by Pol I, so I'm really not sure that general vs specific applies to Pol I.

> I have also asked Jurg's group if this would be a useful grouping term,
> ...will let you know their responses (positive or negative or indifferent)

thanks

> Karen, did you ever get any response from your thesis supervisor about
> this? or wasn't this what you asked?

only asked about Pol II, not as a global classification, so my question didn't address your current concern anyway.

> Sorry to keep going on about this, but I think it's quite an
> important high level grouping that is needed to make sense of high
> throughput and genome wide data. If you really think it is wrong to
> do this for some reason I can drop it though.

I see your point Val, but I personally am not prepared to support these global level terms until I've done more reading. I'm also not keen on adding a bunch of stuff now, and then doing a major reorg in January. I'll be happy to consider this suggestion when I do the major reorg, but would really prefer not to add these terms now.

-Karen

> Date: 2007-09-04 04:02
> Sender: val_wood
> Logged In: YES
> user_id=516865
> Originator: YES
>
> On a related note, I am wondering if elongation and termination can go
> directly under general (i.e. it is only initiation which can be 'specific')
> but I am not sure about this (it seems logical).

Actually, at least for Pol II, I think we may need instances of both general and specific for both initiation and elongation. My thesis project was about regulated blocks to elongation. I also think that some/many promoters have initiation at a basal level that is mediated only by the general factors and then have activated initiation that is mediated by a specific factor. For 'termination', the preferred term for Pol II was '3'-end formation', since 'termination' as it occurs in bacteria, where there is a specific DNA signal that tells the polymerase to stop, really just doesn't occur. Pol II transcribes past the poly A site and then transcription ceases across a range of the downstream sequence in conjunction with cleaving the poly A site. For Pol I and Pol III, I think those fields may use termination and it may be simpler than in Pol II.

In any case, I'm not currently prepared to classify any of these three phases as always general or always specific. We may even have to do this on a polymerase by polymerase basis.

> Also, in the def of the new term 0032568 it says that in Saccharomyces five
> transcription factors are necessary and sufficient for such basal
> transcription. (Which are the 5? This will help me to do the
> remapping to the new term.)

I would have to assume that they are talking about TFIIA, TFIIB, TFIID, etc. (I'm not completely sure off the top of my head which 5 of the TFII___'s are considered to be the basal factors versus something else; for example, I think TFIIE is considered to be an elongation factor), but in any case I don't think this is referring to specific genes.

> Thanks

Original comment by: krchristie

gocentral commented 17 years ago

Logged In: YES user_id=516865 Originator: YES

I'll continue to think about it.

I should point out though that the changes I suggested wouldn't affect annotations; they will just provide an additional grouping. I don't think I made it clear, but I wasn't requesting additional terms for general and specific transcription for RNA pol I and III, but rather that both RNA pol I and III had the 'general transcription' parent, on the basis that neither of these types of transcription is associated with 'regulatory transcription factors' and all are associated with basal transcription. A synonym of the term 'general transcription' could be 'basal transcription'... this is what I'm trying to capture with a single grouping term.

Val

Original comment by: ValWood

gocentral commented 17 years ago

Logged In: YES user_id=473890 Originator: NO

Let me rephrase: I am not comfortable making connections between Pol I txn or Pol III txn and either general or specific transcription until I've read more (which I'm not going to do until January) and become convinced that this is warranted and also how it should be done.

Anyway, since the terms Midori added:

general transcription from RNA polymerase II promoter GO:0032568
specific transcription from RNA polymerase II promoter GO:0032569

are child terms of 'transcription from RNA polymerase II promoter', I don't see that there currently are higher level 'general txn' or 'specific txn' terms to even use for what you're asking.

It's not that you don't have a good point, one that I'll certainly take into account when I do the reorg, but I don't think it's as simple as giving both RNA pol I and III a parent of 'general transcription'. Especially for Pol III, this might be wrong. For example, txn of 5S genes requires a specific txn factor called TFIIIA, which is required for Pol III txn of 5S genes, but not at any other Pol III genes, so it might be warranted to have specific txn for Pol III. However, at the moment, I don't know exactly how these relationships should be made.

Thus, my preference is not to make any more changes in txn area right now. The area is already a big mess and I think adding more terms quickly now will just make it worse.

Let me also add, this is pretty much my final comment on this item for a while. I've been sick, as in not working at all, for almost a week and am not done with the Evidence Code stuff that I HAVE to send out before I go on maternity leave, so I really don't have time to spend any more time on this discussion before I go out on maternity leave.

-Karen

Original comment by: krchristie

gocentral commented 16 years ago

Logged In: YES user_id=516865 Originator: YES

Other existing terms that are relevant to the proposed specific/general transcription terms:

regulation of gene-specific transcription GO:0032583
G1/S-specific transcription in mitotic cell cycle
G2-specific transcription in mitotic cell cycle
S-phase-specific transcription in mitotic cell cycle

Original comment by: ValWood

gocentral commented 16 years ago

Logged In: YES user_id=516865 Originator: YES

Hi Karen,

Is there any progress with this item? I was wondering whether we could implement the term 'general transcription' as a child of 'transcription, DNA-dependent', with the children:

general transcription from RNA polymerase II promoter
transcription from RNA polymerase III promoter
transcription from RNA polymerase I promoter

and if this could be done independently of the transcription overhaul.

You had some concerns that not all RNA pol I and III transcription was considered general, but I have searched and I can't find any examples where this is not the case (they may also be specific, but this is OK).

If some new research shows that part of RNA polymerase I or III transcription is NOT general this is easy to rectify by creating child terms to mirror the current situation for Polymerase II.

The reason I need this term is that it is now holding up my enrichment analysis for a number of projects.

For instance, I have this note from the Nurse lab: (QUESTION VAL about GO terms: are they all at the same level? In some situations I think we will also need to group together some of these categories, e.g. transcription from RNA Pol III, transcription from RNA Pol I, and transcription initiation all into Transcription, for ease of discussion in the text, and I'm not sure which categories they will go into.)

The problem here is that the analysis only showed enrichment for Pol I, general Pol II and Pol III individually, but no enrichment for 'transcription', which includes 'specific transcription'. This is what I expected, but if there was a 'general' term this would be enriched and would simplify the output considerably.
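
A minimal numerical sketch of this effect, using a one-sided hypergeometric test. Every number below is invented for illustration (genome size, study-set size, and per-term counts), and the grouped row assumes the three gene sets do not overlap; it is not a reconstruction of the actual analysis.

```python
from math import comb

def hypergeom_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the annotation."""
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total

N, n = 5000, 100  # invented genome size and study-set size

# term -> (genes annotated in the genome, of which in the study set); all invented
counts = {
    "Pol I": (30, 3),
    "general Pol II": (40, 4),
    "Pol III": (30, 3),
    # parent term, dominated by specific transcription
    "transcription (all)": (600, 15),
    # hypothetical grouping term: sum of the first three rows (assumes no overlap)
    "general transcription (grouped)": (100, 10),
}

p_values = {term: hypergeom_tail(k, K, n, N) for term, (K, k) in counts.items()}

for term, p in p_values.items():
    print(f"{term}: p = {p:.2e}")
```

Under these made-up counts the three individual terms are only modestly enriched, the broad parent is not enriched at all because the specific-transcription genes dilute it, and the grouped term is much more strongly enriched than any single child.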

This term is also useful for biologically informative GO slims (see recent posting about GO slims). Jurg's lab (who work entirely with transcriptional analysis) have also asked why this distinction is not possible with GO using a single term.

My collaborators are hoping to submit at the end of May, and I need to re-run the analysis before then to show the enrichment to 'general transcription'

I realise you have other priorities, but as this is a relatively simple change, I wondered whether, if you agree, it can be implemented shortly?

Thanks

Val

Original comment by: ValWood

gocentral commented 16 years ago

Logged In: YES user_id=473890 Originator: NO

Hi Val,

At the moment, I'm not really comfortable with giving Pol I or Pol III transcription a relationship to 'general transcription' since I've really only encountered that phrase in the context of Pol II transcription. It feels to me like we're using the phrase 'general transcription' in a way that is different from the way it is normally used in the literature.

-Karen

Original comment by: krchristie

gocentral commented 16 years ago

Logged In: YES user_id=516865 Originator: YES

Hi Karen,

I see your point if 'general transcription' is only ever used in the context of pol II. When I googled for "general transcription from pol III" I got a number of hits, but I am not sure how valid the sources are; perhaps you could assess this better.

If 'general transcription' is not suitable, is there another phrase we could use for the initiation, elongation and termination from pol I, II and III?

There are instances where grouping terms are defined by their component parts, for instance:
GO:0055086 nucleobase, nucleoside and nucleotide metabolic process
where the biological grouping makes sense, but the terminology to make the grouping did not exist. Perhaps we could do something like this?

Val

Original comment by: ValWood

gocentral commented 16 years ago

Logged In: YES user_id=473890 Originator: NO

Hi Val,

The problem is that until I get deeper into the txn reorg, I'm just not sure if this makes sense or not.

sorry,

-Karen

Original comment by: krchristie

gocentral commented 14 years ago

Val,

I'm working on this actively with David Hill now and was rereading some stuff in this item.

I have a question about what you are really trying to do. In the initial posting, you said "I need to be able to distinguish between specific and general transcription at the process level". But in your post of 2008-05-10, you said "If 'general transcription' is not suitable, is there another phrase we could use for the initiation, elongation and termination from pol I, II and III?"

I am now confused by what you want to do. You can't equate "basal" or "general" with "initiation, elongation and termination"; that isn't what basal means. Those three things occur regardless of whether it is basal or activated/specific transcription.

What are you actually trying to distinguish?

-Karen

Original comment by: krchristie

gocentral commented 14 years ago

Hi Karen,

What I was trying to see was a grouping term for everything which is not gene specific transcription.

I thought this was referred to as general transcription, but my terminology might be incorrect.

So, everything which was required for:
i) transcription from Pol I,
ii) transcription from Pol II, or
iii) transcription from Pol III,

but was NOT related only to gene-specific transcription controlled by gene-specific transcription factors (or sometimes by transcriptional regulators).

When you perform enrichment analysis, if your gene set is enriched for these processes it can sometimes be below threshold, or of lower significance than expected, because the annotations are distributed between the 3 terms.

I got around this in the last analysis I performed by grouping these 3 datasets in the results to say that the enriched genes were involved in general transcription from Pol II, and transcription from Pol III and Pol I, by extracting all of the genes which I know are required for transcription and creating an artificial grouping term in order to see the extent of the enrichment.

Essentially I wanted a term to group GO:0032568 general transcription from RNA polymerase II promoter with Pol I and Pol III transcription, to cover all of the aspects of transcription processes which are not condition-specific or phase-specific and not based on specific transcription factor binding.
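
A minimal sketch of building such an artificial grouping term outside the ontology, assuming annotations are held as a term-to-gene-set mapping. The gene names, the `group_terms` helper, and the grouping label are all hypothetical.

```python
# Hypothetical term -> gene-set mapping; gene names are placeholders.
term_to_genes = {
    "general transcription from RNA polymerase II promoter": {"g1", "g2"},
    "transcription from RNA polymerase III promoter": {"g3", "g4"},
    "transcription from RNA polymerase I promoter": {"g4", "g5"},
}

def group_terms(mapping, members, label):
    """Return a copy of `mapping` with an extra term that is the union of `members`."""
    grouped = dict(mapping)
    grouped[label] = set().union(*(mapping[m] for m in members))
    return grouped

LABEL = "general transcription (ad hoc grouping)"
with_group = group_terms(term_to_genes, list(term_to_genes), LABEL)

print(sorted(with_group[LABEL]))
```

Taking the union means a gene shared between two of the member terms (here the placeholder `g4`) is counted once in the grouping, which is what an enrichment test on the grouped term needs.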

It might not be possible to do this, but it would be a very useful grouping term for using GO in analysis. (When I made this request I am not even sure if we had the GO:0032568 term; this helps a lot.)

Val

Original comment by: ValWood

gocentral commented 13 years ago

Please see: http://wiki.geneontology.org/index.php/Transcription

Original comment by: rebeccafoulger

gocentral commented 13 years ago

Original comment by: rebeccafoulger

gocentral commented 13 years ago

Original comment by: rebeccafoulger

gocentral commented 13 years ago

see: http://wiki.geneontology.org/index.php/Fate_of_terms_referring_to_general/nonspecific/basal_versus_specific_transcription

Original comment by: krchristie

gocentral commented 13 years ago

Original comment by: krchristie

gocentral commented 13 years ago

Original comment by: mah11