geneontology / go-ontology

Source ontology files for the Gene Ontology
http://geneontology.org/page/download-ontology
Creative Commons Attribution 4.0 International

RNA-dependent RNA replication #9004

Closed by gocentral 9 years ago

gocentral commented 12 years ago

An RNA replication process that uses RNA as a template to synthesize the new strands. This process occurs in RNA viruses that replicate without a DNA intermediate.

This is a child of GO:0032774 : RNA biosynthetic process

Note: the corresponding molecular function is EC 2.7.7.48, GO:0003968

Reported by: *anonymous

Original Ticket: geneontology/ontology-requests/8793

gocentral commented 12 years ago

Sorry about the www.google.com login, this case is from daniel_haft, now properly logged in.

Original comment by: daniel_haft

gocentral commented 12 years ago

Hi Daniel,

We have the following GO term under RNA biosynthetic process:

transcription, RNA-dependent ; GO:0001172 The cellular synthesis of RNA on a template of RNA.

Does this cover your process OK? If so, I'll add synonyms and a process-function link to EC:2.7.7.48 to make it more searchable.

Thanks, Becky

Original comment by: rebeccafoulger

gocentral commented 12 years ago

PMID:19230160 "...the viral RNA-dependent RNA polymerase ... uses the negative-sense vRNA as a template to synthesize two positive-sense RNA species: mRNA templates for viral protein synthesis, and complementary RNA (cRNA) intermediates from which the RNA polymerase subsequently transcribes more copies of negative-sense, genomic vRNA"

So, it looks to me like

  1. GO:0009299 : mRNA transcription (whose definition says from DNA template) needs a new synonym "mRNA transcription from DNA template", and a new sister term "mRNA transcription from RNA template". The definition for GO:0009299 could be updated to say that "from DNA" is implicit, and that "from RNA" is a sister.

  2. It looks like "transcription, RNA-dependent ; GO:0001172" should have two children. One is "mRNA transcription from RNA template" and the other is "RNA-dependent RNA replication"

  3. RNA-dependent RNA replication (but not its parent term GO:0001172) deserves to be in the GO tree as a replicative process.

Original comment by: daniel_haft
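The parentage proposed in the numbered list above can be sketched as a small graph. This is only an illustration, assuming a plain dictionary of is_a edges: the two "(proposed)" terms are the names suggested in this thread, the replication parent is a placeholder because no specific term was named, and `ancestors` is a hypothetical helper rather than part of any GO tool.

```python
# Hypothetical sketch of the proposed parentage. Entries marked "(proposed)"
# or "(placeholder)" are assumptions for illustration, not existing GO terms.
IS_A = {
    "GO:0001172 transcription, RNA-dependent": [
        "GO:0032774 RNA biosynthetic process",
    ],
    "mRNA transcription from RNA template (proposed)": [
        "GO:0001172 transcription, RNA-dependent",
    ],
    "RNA-dependent RNA replication (proposed)": [
        "GO:0001172 transcription, RNA-dependent",
        "replicative process (placeholder)",
    ],
}


def ancestors(term, is_a=IS_A):
    """Return every term reachable from `term` by following is_a edges."""
    seen = set()
    stack = [term]
    while stack:
        for parent in is_a.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

With this layout the replication term inherits from both branches, while GO:0001172 itself stays out of the replicative branch, which is what item 3 asks for.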

gocentral commented 12 years ago

Original comment by: rebeccafoulger

gocentral commented 12 years ago

Hi,

Here are some comments from the perspective of the mostly completed (DNA-dependent) transcription overhaul.

Daniel, I'm really glad you're chiming in to provide some clarity under the "transcription, RNA-dependent" term. When I obsoleted the previous "transcription, RNA-dependent" term which was defined such that it really meant "reverse transcription" and not production of RNA at all, it was clear to me that the term "transcription, RNA-dependent" should remain broad so that it could cover multiple different types of RNA-dependent transcription, and also clear that I did not have sufficient understanding of the multiplicity of viral life cycles to make any terms that were more granular, nor the time to gain that understanding. Thus the transcription overhaul only dealt with the DNA-dependent transcription terms. I have listed some issues with the viral and symbiont transcription terms that I noticed in these 2 SF items:

positions, defs of host & symbiont transcription terms https://sourceforge.net/tracker/?func=detail&aid=3300742&group_id=36855&atid=440764

position, def of "viral transcription" & its reg terms https://sourceforge.net/tracker/?func=detail&aid=3298090&group_id=36855&atid=440764

-Karen

Using the numbered items from Daniel's 2011-09-19 comment, here are some specific comments on each of those.

  1. Regarding the term "mRNA transcription" (GO:0009299)

I think I should add some historical perspective about the top of the transcription branch. When we looked at the annotations made to the term "transcription", it was clear that it was being misused for annotations when "transcription, DNA-dependent" was intended. Because of the vast number of annotations to the term "transcription" that should have been to "transcription, DNA-dependent" (tens of thousands, see counts on the wiki), the group agreed to merge the term "transcription" into "transcription, DNA-dependent". The common parent of "transcription, DNA-dependent" and "transcription, RNA-dependent" is now "RNA biosynthetic process".

Following on from the fact that it clearly is a problem to have terms with names like "transcription" that group the DNA-dependent and the RNA-dependent processes, since the folks annotating the DNA-dependent process frequently fail to realize that they should be using a more specific term, I have been adding the phrase "DNA-dependent" to all terms under "transcription, DNA-dependent" that do not have some other phrase that is already more specific. For example, "transcription from RNA polymerase II promoter" doesn't need to include the phrase "DNA-dependent" because it is already specific to an RNAP that is DNA-dependent. However, we cannot have a term like "transcription elongation" anymore because this is ambiguous. So, if we are going to keep this term for "mRNA transcription", then the actual term name needs to change to e.g. "mRNA transcription, DNA-dependent" to be consistent with its existing definition. A parallel term for "mRNA transcription, RNA-dependent" would not be a sibling of this term because it would need to have "transcription, RNA-dependent" as a parent, not "transcription, DNA-dependent".

There is another issue with the term "mRNA transcription" though, which is that classification by the "type" of RNA produced doesn't seem to be a very good criterion for the process. For example, for RNAP II, the process for "mRNA transcription" isn't particularly different than that for "snoRNA transcription", it just uses a different promoter. David and I considered obsoleting these types of terms, and some researchers at the transcription meeting I attended in June 2010 were supportive of that approach, but we decided to table that issue and did not address it as part of the transcription overhaul. If we do keep the term, we may want to consider whether/how it should be used for annotations, when it is not well integrated into the transcription cycle process terms (for initiation, promoter clearance, elongation, termination) that are a much better way to represent how a gene product is involved in transcription. Note that AmiGO only lists 31 annotations for "mRNA transcription" and its 4 children combined, and more than half of those are to the RNAP II specific term.

2 & 3. new children for "transcription, RNA-dependent ; GO:0001172" and their parentage

> 2. It looks like "transcription, RNA-dependent ; GO:0001172" should have
> two children.
> One is "mRNA transcription from RNA template"
> and the other is "RNA-dependent RNA replication"
>
> 3. RNA-dependent RNA replication (but not its parent term GO:0001172)
> deserves to be in the GO tree as a replicative process.

Can you provide definitions, or a reference that does, for the 2 child terms you suggest for "transcription, RNA-dependent"? One of the reasons we opted not to make any child terms of "transcription, RNA-dependent" was that I do not have enough background to know what should be represented. We weren't even completely sure how to define "transcription" in the context of RNA-dependent processes.

Regarding your suggestion for a "RNA-dependent RNA replication" term and your subsequent comment that it should be indicated as a replicative process, I have a question. Is this considered a type of "transcription" at all? In the DNA metabolism branch, we have a term for "DNA replication". Perhaps we need to have a term for "RNA replication" under "RNA biosynthetic process".

Original comment by: krchristie

gocentral commented 12 years ago

I see there are two different GO terms, GO:0006410 (obsoleted, had meant reverse transcription, which is a replicative process) and GO:0001172, both for the phrase "transcription, RNA-dependent." This latter one scares me a bit, "the CELLULAR synthesis of RNA on a template of RNA." Does "cellular" specifically mean non-viral? Otherwise, there is some overlap with "viral transcription", but that term makes it unclear if the process is from DNA or RNA, and that specificity is needed somehow.

The clearest thing for me to say is that I am not (yet) a viral expert, nor can the ontology structure be made perfect without huge effort. Like bidding in bridge, the aim is probably to pick the definition "that lies the least," the best compromise of technically correct GO tree inheritance, labor affordability, and expectations of users.

My original request was to get a process term installed to cover replication processes for viruses. When an RNA virus copies its own RNA, the process may be replication, or production of mRNA, or both. So, a single protein could get two different process terms.

I recommend thinking of the central dogma first, chemistry second, host-vs-virus last (because viruses may co-opt cell machinery). Is a process replicative? Then it needs representation in the tree under replication. RNA-dependent RNA replication should be added immediately.

Or is the process using a genetic template to produce an RNA that has a function other than replication (mRNA, regulatory RNA, etc.). This I would call transcription, in a biological process sense. There is a more "molecular function" sense, where DNA to DNA is copying, RNA to RNA is copying, while production of one off the other, a change of alphabet, is "transcription." But that usage does not fit the "biological process" sense, in which germline to germline is replication, but germline into RNA with a non-germline function is transcription, and that is what I recommend for the meaning of transcription in the biological process tree.

I am afraid I cannot invest too much more in this thread - I was looking for a low-cost way to help out and add value, but I'm close to max'd out.

I will, on the other hand, be on the lookout for really cute new viral biological processes, like "cap-snatching."

Thanks,

Dan Haft

Original comment by: daniel_haft

gocentral commented 12 years ago

Hi Daniel,

I'll try to respond to some of the questions you asked.

> GO:0001172, both
> for the phrase "transcription, RNA-dependent." This latter one scares me a
> bit, "the CELLULAR synthesis of RNA on a template of RNA." Does "cellular"
> specifically mean non-viral? Otherwise, there is some overlap with "viral
> transcription", but that term makes it unclear if the process is from DNA
> or RNA, and that specificity is needed somehow.

In the context of GO, "cellular" means it occurs within a cell. It does not make any restriction to using only host machinery, so things that viruses do within a cell are included in "cellular". The contrast to "cellular" in GO is "organismal" because some processes happen between cells rather than within an individual cell. The term "transcription, RNA-dependent" is meant to explicitly cover any process that transcribes RNA from a template of RNA, and to me the fact that it says "on a template of RNA" makes that clear. The sibling process "transcription, DNA-dependent" is for transcription of RNA that occurs from a DNA template. A replication process that involves going through DNA is not making RNA on a template of RNA, and there is already a term for "reverse transcription" (making DNA from RNA) anyway.

> My original request was to get a process term installed to cover
> replication processes for viruses. When an RNA virus copies its own RNA,
> the process may be replication, or production of mRNA, or both. So, a
> single protein could get two different process terms.

Are you aware of the term "viral genome replication" (GO:0019079)? It also has a number of more specific child terms. These seem like they may already cover what you want.

> I recommend thinking of the central dogma first, chemistry second,
> host-vs-virus last (because viruses may co-opt cell machinery). Is a
> process replicative? Then it needs representation in the tree under
> replication. RNA-dependent RNA replication should be added immediately.

I actually have not been finding the central dogma particularly useful with respect to defining GO terms. It is really too simplistic to accurately represent what people actually see. In addition, the research literature is filled with differing usage of the same phrase, such that sometimes GO cannot follow the exact terminology in the literature because the terminology in the literature is ambiguous or is used for multiple meanings. The question I have been using for defining distinct GO terms is: does it happen the same way? If it does, it's the same process. If that process is used in a couple of different larger processes, then we can represent that with relationships between terms.

The question I have with respect to adding a term for "RNA-dependent RNA replication" is how this is different from the term we already have for "transcription, RNA-dependent". They are both production of RNA from a template of RNA. Now, if "RNA-dependent RNA replication" requires some additional steps, since replication of an entire viral genome may require some tricks to replicate the ends that are not otherwise necessary, then it is not exactly the same as "transcription, RNA-dependent", though perhaps "transcription, RNA-dependent" is a component of "RNA-dependent RNA replication".

-Karen

Original comment by: krchristie

gocentral commented 12 years ago

As I said 9/19, two different products from RNA synthesis off of viral RNA. One is message to be translated, which gets capped. The other is template for replication.

Good to know about "viral genome replication". I believe "RNA-dependent RNA replication" will always be a child of it, but not a replacement for it.

The problem I have with the suggestion that "GO:0001172 : transcription, RNA-dependent" exists, and therefore my RNA replication term is not needed, is the hole that is left, where the GO tree has terms for RNA replication through reverse transcription and a DNA intermediate, and for DNA replication, but not for RNA replication without a DNA intermediate. Instead, the term we are supposed to use looks like a term for producing functional RNAs such as messenger RNA.

This will probably have to be my last letter on this subject.

Original comment by: daniel_haft

gocentral commented 12 years ago

Can you suggest any good reviews on this subject?

The question I still have is how the processes of "transcription, RNA-dependent" and "RNA-dependent RNA replication" are similar and different. If some of the steps are conducted in the same way by the same enzymes and gene products, then I would actually want to see a connection between the terms representing these processes in GO, but I do not yet understand either of these processes in sufficient detail.

In your last message, you indicated that the "mRNA" gets capped while the template for replication does not. This might suggest that there is an RNA replication process that is common to both "transcription, RNA-dependent" and "RNA-dependent RNA replication". However, I can also envisage another scenario, where "transcription, RNA-dependent" only transcribes portions of the genome, while "RNA-dependent RNA replication" will need additional or alternate mechanisms to transcribe the entire genome.

-Karen

Original comment by: krchristie

gocentral commented 12 years ago

How about this one? PMID: 12213662

Let's see, if it's making mRNA then it is capped, poly-A-tailed, and REGULATED with early genes and late genes. And it is NOT the whole genome.

Or there is this one, PMID: 15298168: "RNA genomes of these viruses are templates for two distinct RNA synthetic processes: transcription to generate mRNAs and replication of the genome via production of a positive-sense antigenome that acts as template to generate progeny negative-strand genomes. The four virus families within the Mononegavirales all express the information encoded in their genomes by transcription of discrete SUBGENOMIC mRNAs."

I suggest the link between the two is largely the link of MOLECULAR FUNCTION. I think the whole problem of this line of discussion is related to the fact that in some contexts, the word "transcription" is about producing "transcripts" meaning any product of the enzymes, but in other contexts is understood to mean making things from a genome that won't be acting as genome, but rather as some other kind of RNA such as mRNA.

I'm done now. But I will request a GO term for "cap-snatching", the viral process of getting an mRNA cap by cutting it from a host mRNA instead of making its own. It is a strategy some viruses use, while others make the cap. But that will be an entirely separate ticket.

Original comment by: daniel_haft

gocentral commented 12 years ago

Daniel - just to say I'm revising the GO viral terms as we speak, and have already added a term for 'cap-snatching'. I'll make a separate comment on the RNA replication issue.

Original comment by: jl242

gocentral commented 12 years ago

I do see Daniel's point about not wanting to use transcription, RNA-dependent for viral genome replication. In RNA viruses, genome replication and mRNA transcription are highly coupled and probably use the same RNA-dependent RNA Pol complex, but there are almost certainly cases where you'd want to distinguish between the two. For example, in ss (-) RNA viruses transcription is off the (-) strand (i.e. the genome), primed at the leader sequence, to make a (+) sense mRNA. Genome replication then comes later: the (-) sense genome is primed in the opposite direction, at the trailer sequence, to make new genomes. So transcription and genome replication are in opposite directions.

What I'd suggest is that we have a generic 'RNA-dependent RNA replication' as Daniel suggests, with 'transcription, RNA-dependent' as a child. The difficulty is going to be defining them...viral transcription CAN be of the full length of the genome - it then gets translated into a polyprotein which is cleaved. Viral mRNAs also don't always require caps or poly A tails as they have sneaky methods of circumventing these requirements, for example, internal ribosome entry sites within the 5' untranslated region (UTR) of the viral mRNA which bind the host 43S preinitiation complex, circumventing regular cap-dependent translation initiation.

Original comment by: jl242

gocentral commented 12 years ago

I want to be clear, I'm not trying to argue against having a term for "RNA-dependent RNA replication", just trying to understand what the distinction is between "RNA replication" and "RNA transcription". When we touched on this briefly in the transcription overhaul due to the need to deal with the term "transcription" and the incorrectly defined original term for "transcription, RNA-dependent", the distinctions versus similarities of these two processes were particularly hard to discern. It seems that if there are some aspects of each of these 2 processes that are identical, then it would be nice to represent in GO which, if any, parts of the overall replication and mRNA production processes are identical.

I am also concerned about the terms like "mRNA transcription" because, to use RNAP II as an example, there is nothing particularly different about the transcription of an mRNA versus a snoRNA or any other RNA transcribed by RNAP II. David and I felt these terms were problematic but weren't prepared to eliminate them. If the things that distinguish an mRNA from other RNA transcripts are things like the 5'-end processing (type of cap addition) and the 3'-end processing (pA tail or any other 3' processing), then what we're really talking about is "mRNA production" where "transcription" is only one of the parts of that process. So, I'd like us to be cautious about adding more terms under "mRNA transcription" when what we have already doesn't fit well into the structure of the rest of the transcription terms. When I've seen the "mRNA transcription" terms used in annotations for gene products involved in RNAP II transcription of the host genome, it's really not the best term to use and these annotations end up being off in left field away from the bulk of annotations that use the terms that describe in more detail how a given gene product is involved in transcription.

Original comment by: krchristie

gocentral commented 12 years ago

I actually wasn't intending to make an mRNA transcription term, and all those existing viral terms will be moved to a new parent in the overhaul. I'd be wary about calling viral RNAs of any kind mRNAs, because often they don't have caps or poly A tails. Sometimes the transcribed RNA is the whole genome.

Can you not obsolete mRNA transcription and just use the 'transcription from xxx promoter' terms instead? It must be really confusing for curators...

Original comment by: jl242

gocentral commented 12 years ago

Hi Jane,

David and I talked about the term "mRNA transcription" this morning. While we had not felt comfortable obsoleting it previously, we have now come up with an idea that we both like, that does include obsoleting it. The "transcription from xxx promoter" terms are not the appropriate replacement though, as, for example, the "transcription from RNAP II promoter" includes production of mRNAs, snRNAs, and snoRNAs. We are thinking about having terms like "mRNA biosynthetic process" and "mRNA biosynthetic process from RNA polymerase II promoter" so that we can indicate that mRNA biosynthesis from an RNAP II promoter includes both transcription and processing.

One other comment: An mRNA in general is not required to have a cap or pA tail, certainly people talk about mRNA in bacteria where the mRNA processing, if any, is not the same as in eukaryotes. I think the only real distinction for an mRNA is that it is translated. My knowledge of viruses that infect eukaryotes extends far enough to be aware that they vary a lot in how much of their own machinery they carry, so it seems possible that viral mRNA structures may vary a lot between extremes of duplicating the structure of host mRNAs versus doing something rather different, and of course there are also viruses that infect bacteria where the host mRNA structure doesn't include caps or pA tails anyway.

Anyway, if at any point you want any input from me to help make sure the viral transcription terms are structured similarly to the way we've just done the overhaul of the "transcription, DNA-dependent" terms, just let me know.

-Karen

Original comment by: krchristie

gocentral commented 12 years ago

Hi Karen - I like that solution - so did you mean you wouldn't obsolete mRNA transcription, just make it part_of "mRNA biosynthetic process"?

I've been looking at SO and yes, you're absolutely right, mRNAs don't need polyA tails or caps. I'm unsure what to use for viruses now. SO defines mRNA as "...the intermediate molecule between DNA and protein.". The problem is that for +RNA viruses the same 'mRNA' could end up being translated into a protein or being encapsulated as the genome for a new viral particle. It might be better to use a more generic term for viruses.

I might well be in touch Karen!

Original comment by: jl242

gocentral commented 12 years ago

Hi Jane,

Regarding the term "mRNA transcription", I think it will still be better to obsolete it, as for an RNAP that makes more than one "kind" of RNA, e.g. RNAP II makes mRNA, snoRNA, & snRNA or RNAP III makes rRNA and tRNA, there is no difference in the transcription between those different types of molecules. So, having not really worked it out yet, I was thinking more along the lines of obsoleting all the "mRNA transcription", "snoRNA transcription", etc. terms and doing something more like this:

- RNA biosynthetic process (term already exists)
-- is_a mRNA biosynthetic process (new term)
--- is_a mRNA biosynthetic process from RNAP II promoter (new term)
---- has part transcription from RNAP II promoter (term already exists)
---- has part [RNA processing] (not sure exactly what term is true)
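The difference between the is_a and has part lines in this draft can be sketched with a list of labelled edges. This is only an illustration under stated assumptions: the "(new)" terms are the proposals from this comment, the RNA processing part is a placeholder because the exact term is undecided, and `related` is a hypothetical helper.

```python
# Labelled edges taken from the draft hierarchy. "(new)" marks proposed terms;
# "(placeholder)" marks the part whose exact term is still undecided.
RNAP2_MRNA = "mRNA biosynthetic process from RNAP II promoter (new)"

EDGES = [
    ("mRNA biosynthetic process (new)", "is_a", "RNA biosynthetic process"),
    (RNAP2_MRNA, "is_a", "mRNA biosynthetic process (new)"),
    (RNAP2_MRNA, "has_part", "transcription from RNAP II promoter"),
    (RNAP2_MRNA, "has_part", "RNA processing (placeholder)"),
]


def related(term, relation, edges=EDGES):
    """List the targets of `relation` edges that start at `term`."""
    return [obj for subj, rel, obj in edges if subj == term and rel == relation]


# Transcription is recorded as a part of the new process, not as its parent.
parents = related(RNAP2_MRNA, "is_a")
parts = related(RNAP2_MRNA, "has_part")
```

Keeping the relation type on each edge is what lets the new term say that mRNA biosynthesis includes transcription and processing without claiming it is a kind of either.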

We'll need to think this through, particularly with respect to what types of processing always occur. For example, there are two kinds of caps made on RNAP II transcripts, one for mRNA and a different one on snoRNA, but I believe there are some differences for histone mRNAs. Splicing cannot have a has_part relationship to a term for "mRNA biosynthetic process from RNAP II promoter" as not all mRNAs produced from RNAP II promoters contain introns. I think we probably cannot say anything useful about the parts that exist for "mRNA biosynthetic process" generally as they differ so much between different RNAPs, e.g. between euk RNAP II, bacterial RNAP, and various viral processes.

Regarding your comment "that for +RNA viruses the same 'mRNA' could end up being translated into a protein or being encapsulated as the genome for a new viral particle", yes, this is exactly why I wanted to learn about what, if anything, is the difference between "RNA-dependent RNA replication" and "transcription, RNA-dependent" with respect to viruses. It had seemed to me that the use of a given word sometimes seemed dependent on the context that a particular person was studying. It also might be that it is not possible to define the relationship between "RNA-dependent RNA replication" and "transcription, RNA-dependent" generally for viruses, but for each given class of viruses, we may be able to define this.

-Karen

Original comment by: krchristie

gocentral commented 10 years ago

Hi Daniel

I think all of the viral transcription/genome replication terms should now be in place. We do have a term for cap-snatching too: GO:0075526.

Let me know if there is anything missing or that doesn't work for you.

thanks,

Jane

Original comment by: jl242

gocentral commented 10 years ago

Original comment by: jl242

gocentral commented 10 years ago

Original comment by: jl242

gocentral commented 10 years ago

Hi

You can use:

positive stranded viral RNA replication ; GO:0039690

or

negative stranded viral RNA replication ; GO:0039689

Original comment by: jl242
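For annotation scripts, the choice between the two terms suggested above could be sketched as a lookup keyed on genome sense. This is a minimal sketch, not an official GO utility: the "+"/"-" keys and the `replication_term` helper are assumptions made for illustration, and only the two IDs and names come from the comment above.

```python
# The two terms named in the final comment, keyed by genome sense.
# The "+" and "-" keys are an assumed convention for this sketch.
REPLICATION_TERMS = {
    "+": ("GO:0039690", "positive stranded viral RNA replication"),
    "-": ("GO:0039689", "negative stranded viral RNA replication"),
}


def replication_term(genome_sense):
    """Return (GO ID, term name) for a '+' or '-' sense RNA virus genome."""
    try:
        return REPLICATION_TERMS[genome_sense]
    except KeyError:
        raise ValueError(f"unsupported genome sense: {genome_sense!r}") from None
```

Anything other than "+" or "-" raises an error rather than guessing, since the thread names no term for other genome types.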