Open ruixuan-zhang opened 2 years ago
Dear developer,
Good day. Thank you for your development and maintenance of this software.
I was wondering if you could explain the definitions of the different classes of `TisType`? I see in the `README` that `TisType` refers to the position of the TIS relative to the annotated ORF of the transcript.
First, in my results, I got some predictions like `3'UTR`, `5'UTR` and `Extended`.
- Can I understand the class `Extended` this way: if a transcript assembled from Ribo-seq data aligns to the annotated CDS region, is continuous without a frameshift, and extends outside the annotated CDS, it is annotated as `Extended`?
  Answer: Yes. Without a frameshift or stop codon, the result is an extended form of the annotated CDS.
- Do `5'UTR` and `3'UTR` mean that the TIS of a transcript aligns to these untranslated regions and is not assembled into the transcript of the CDS part (or is not in the same frame)?
  Answer: An ORF of the `5'UTR` type may have some overlap with the annotated CDS, but not in the same frame.
Second, I also got some `Internal` and `Internal:CDSFrameOverlap`.
- I see `CDSOverlap` means the ORF overlaps with an annotated CDS in another transcript in the same reading frame. Does `Internal` mean that a predicted ORF
  - is located within an annotated CDS (both ends lie within the annotated one)?
    Answer: Only the TIS position is considered; both ends need not lie within the annotated CDS.
  - is in a different frame?
    Answer: Right.
- Does `Internal:CDSFrameOverlap` mean that a predicted ORF is located within an annotated CDS but in the same frame?
  Answer: The predicted ORF is not in the same frame as the annotated CDS, but it is in the same frame as a CDS in another transcript.
Finally, I am working on a virus genome with a high coding density. What if a predicted ORF starts in the upstream gene's CDS or 3'UTR region and ends in the downstream gene's CDS region in a different frame? What will the `TisType` be? Is it `Novel` or `3'UTR`?
Answer: It is based only on the start position. It should be `5'UTR` if the ORF starts in the upstream region. `Novel` means the transcript has no CDS annotation. So it depends on the annotation of the given transcript.
Thank you very much in advance!!
Ruixuan
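The rules discussed in this post can be summarised in a small sketch. This is not RiboTISH's actual code: the function name `classify_tis`, its parameters, and the 0-based transcript coordinates are assumptions made for illustration, and the `CDSFrameOverlap` suffix (which needs the CDS of other transcripts) is left out.

```python
def classify_tis(tis, cds_start=None, cds_stop=None, has_upstream_stop=False):
    """Illustrative TisType assignment from the TIS position only.

    Hypothetical helper, not part of RiboTISH. Positions are 0-based
    offsets from the transcript 5' end; cds_stop is end-exclusive.
    has_upstream_stop says whether an in-frame stop codon lies between
    the candidate TIS and the annotated start.
    """
    if cds_start is None or cds_stop is None:
        return "Novel"            # transcript has no CDS annotation
    if tis == cds_start:
        return "Annotated"
    in_frame = (tis - cds_start) % 3 == 0
    if tis < cds_start:
        # Upstream start: in frame and uninterrupted means an extended CDS
        if in_frame and not has_upstream_stop:
            return "Extended"
        return "5'UTR"
    if tis < cds_stop:
        # Start inside the annotated CDS: only the frame matters
        return "Truncated" if in_frame else "Internal"
    return "3'UTR"


if __name__ == "__main__":
    for tis in (40, 41, 100, 130, 131, 450):
        print(tis, classify_tis(tis, cds_start=100, cds_stop=400))
```

Only the start position enters the decision, which matches the answers above: where the ORF ends does not change the class.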
Thank you for your prompt reply! Now I understand `TisType` better.
I was wondering if you could help me with one more question: if the TIS of a predicted ORF aligns with the annotated TIS, but the end of the predicted ORF extends beyond the annotated one (for example, in stop codon recoding or stop codon bypass events), what will the `TisType` be?
My preliminary guess is that the `TisType` is `Annotated`, and that the extension at the 3' end can be found by comparing `GenomePos` or `Start`:`Stop` with the CDS region in the previous annotation file, right?
Thank you very much!
The type should be 'Annotated'. Prediction of 3' end extension is not currently supported. This may happen in the case of different stop codons. If so, you can compare the predicted stop position with the annotated one. In addition, the 3' extended CDS region may be identified as a separate 3'UTR ORF if it contains a TIS codon.
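The comparison suggested here could look like the sketch below. The helper name `three_prime_extension` and its arguments are hypothetical; it assumes the predicted and annotated positions are in the same transcript coordinate system, with end-exclusive stop positions.

```python
def three_prime_extension(pred_start, pred_stop, cds_start, cds_stop):
    """Length in nt by which a predicted ORF runs past the annotated stop.

    Hypothetical helper, not part of RiboTISH. All positions are assumed
    to be offsets from the transcript 5' end, stops end-exclusive.
    Returns 0 when the predicted ORF does not extend past the annotation.
    """
    if pred_start != cds_start:
        raise ValueError("ORF does not start at the annotated TIS")
    return max(0, pred_stop - cds_stop)


if __name__ == "__main__":
    # Predicted ORF shares the annotated start but ends 30 nt later
    print(three_prime_extension(100, 430, 100, 400))
```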
Thank you very much for your patient explanation!
Dear Zhang,
Good day. Sorry, I have another question about the meaning of "GenomePos" and "Start & Stop".
In the `README` file, it is written as
Can I understand in a way that
By the way, I want to ask whether `Truncated` represents cases where the predicted start codon is downstream of the annotated start codon and in the same frame (f0)?
I asked this because I got a result like below.
Thank you very much in advance.
Ruixuan
'Truncated' means what you suppose. For your example, could you provide the detailed information, including Transcript, CDS, GenomePos, Start and Stop? Start and Stop are relative to the 5' end of the transcript (usually the start of the 5'UTR) and correspond to the two positions in GenomePos.
A new module, 'transplot', has been added on GitHub but is not formally released yet. You can git clone the repository and try plotting with 'ribotish transplot' and the '--morecds' option.
Yeah, sure
I found my mistake in plotting: I forgot to consider the strand information.
With that corrected, this case makes sense: the start is truncated. GenomePos represents the start codon:stop codon positions predicted by RiboTISH, right?
How can I know where the transcript starts? Does RiboTISH follow the annotation in the GFF file?
Thank you!
Right. The start site is from transcript annotation in gff file.
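Since the confusion came from strand handling, here is a minimal sketch of how a transcript-relative position maps to a genomic coordinate. The function `transcript_to_genome`, the exon representation, and the 0-based half-open coordinates are assumptions for illustration and not RiboTISH's own conventions; check them against the actual output before relying on the numbers.

```python
def transcript_to_genome(pos, exons, strand):
    """Map an offset from the transcript 5' end to a genomic coordinate.

    Hypothetical helper, not part of RiboTISH.
    pos:    0-based offset from the transcript 5' end
    exons:  list of (start, end) genomic intervals, 0-based, end-exclusive
    strand: "+" or "-"
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    # On the minus strand the 5' end is the exon with the highest coordinate
    ordered = sorted(exons, reverse=(strand == "-"))
    for start, end in ordered:
        length = end - start
        if pos < length:
            return start + pos if strand == "+" else end - 1 - pos
        pos -= length
    raise ValueError("position lies beyond the transcript end")


if __name__ == "__main__":
    exons = [(100, 200), (300, 400)]
    print(transcript_to_genome(150, exons, "+"))
    print(transcript_to_genome(150, exons, "-"))
```

On the minus strand the same transcript offset lands on a different genomic base and the coordinates decrease along the ORF, which is the effect that is easy to miss when plotting.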
Thank you very much for your patient explanation!