YangLab / CIRCexplorer2

circular RNA analysis toolset
http://circexplorer2.readthedocs.org/

Reads per gene vs annotation file generated with CIRCexplorer2 #74

Open andre-gabriel-42 opened 7 months ago

andre-gabriel-42 commented 7 months ago

For instance, suppose my reads-per-gene file (obtained with STAR) contains the following entries for these two transcripts:

**ENST00000361204** 0   0   **0**
**ENST00000255784** 842 0   **842**

That is, 0 reads for transcript ENST00000361204 and 842 reads for transcript ENST00000255784.

The file generated during the annotation step contains lines like the following:

chr22   42276719    42290941    circular_RNA/19 0   +   42276719    42276719    0,0,0   4   277,170,169,118 0,4126,12401,14104  19  circRNA SREBF2  **ENST00000361204** 10,11,12,13 chr22:42274127-42276719|chr22:42290941-42293055

chr22   42204878    42206295    circular_RNA/260    0   +   42204878    42204878    0,0,0   3   119,122,85  0,1004,1332 260 circRNA CCDC134 **ENST00000255784** 2,3,4   chr22:42196770-42204878|chr22:42206295-42209267

chr22   42204878    42209826    circular_RNA/81 0   +   42204878    42204878    0,0,0   5   119,122,85,182,72   0,1004,1332,4389,4876   81  circRNA CCDC134 **ENST00000255784** 2,3,4,5,6   chr22:42196770-42204878|chr22:42209826-42221701
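To make the column layout of the two files concrete, here is a minimal Python sketch that reads both and matches them on the transcript ID. It rests on several assumptions: that the annotated file is tab-separated with 18 columns (BED12 fields, then read number, circRNA type, gene name, isoform name, exon index, flanking introns), that the size, offset and index columns are comma-separated lists, and that column 2 of the reads-per-gene file is the count of interest. The sample strings and the helper names `parse_circ` and `parse_counts` are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

# Hypothetical sample data, assuming tab-separated files with
# comma-separated list columns (itemRgb, exon sizes, offsets, index).
ANNOTATED = (
    "chr22\t42276719\t42290941\tcircular_RNA/19\t0\t+\t42276719\t42276719\t0,0,0\t4\t"
    "277,170,169,118\t0,4126,12401,14104\t19\tcircRNA\tSREBF2\tENST00000361204\t"
    "10,11,12,13\tchr22:42274127-42276719|chr22:42290941-42293055\n"
    "chr22\t42204878\t42206295\tcircular_RNA/260\t0\t+\t42204878\t42204878\t0,0,0\t3\t"
    "119,122,85\t0,1004,1332\t260\tcircRNA\tCCDC134\tENST00000255784\t"
    "2,3,4\tchr22:42196770-42204878|chr22:42206295-42209267\n"
    "chr22\t42204878\t42209826\tcircular_RNA/81\t0\t+\t42204878\t42204878\t0,0,0\t5\t"
    "119,122,85,182,72\t0,1004,1332,4389,4876\t81\tcircRNA\tCCDC134\tENST00000255784\t"
    "2,3,4,5,6\tchr22:42196770-42204878|chr22:42209826-42221701\n"
)
READS_PER_GENE = "ENST00000361204\t0\t0\t0\nENST00000255784\t842\t0\t842\n"


def parse_circ(line):
    """Split one annotated line into named fields (assumed 18-column layout)."""
    f = line.rstrip("\n").split("\t")
    start, end = int(f[1]), int(f[2])
    sizes = [int(x) for x in f[10].split(",")]
    offsets = [int(x) for x in f[11].split(",")]
    # Exon coordinates follow the BED12 convention: start + offset, plus size.
    exons = [(start + o, start + o + s) for o, s in zip(offsets, sizes)]
    return {
        "chrom": f[0], "start": start, "end": end, "strand": f[5],
        "junction_reads": int(f[12]),  # reads supporting the back-splice junction
        "gene": f[14], "transcript": f[15], "exons": exons,
    }


def parse_counts(text, column=1):
    """Map ID -> count, taking one count column (assumption: column 2)."""
    counts = {}
    for line in text.splitlines():
        f = line.split("\t")
        counts[f[0]] = int(f[column])
    return counts


circs = [parse_circ(l) for l in ANNOTATED.splitlines()]
counts = parse_counts(READS_PER_GENE)

# Group circRNAs by the transcript they were annotated against.
by_transcript = defaultdict(list)
for c in circs:
    by_transcript[c["transcript"]].append(c)

for tx, items in by_transcript.items():
    for c in items:
        print(tx, f'{c["chrom"]}:{c["start"]}-{c["end"]}',
              "junction reads:", c["junction_reads"],
              "reads-per-gene count:", counts.get(tx))
```

Note that the two numbers printed for each circRNA are not the same kind of count: the value in column 13 of the annotated file is per back-splice junction, while the reads-per-gene value is per gene or transcript ID, so one transcript can legitimately appear on several circRNA lines.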

Questions for you:

1) Transcript ENST00000361204 has 0 reads, yet a circRNA was annotated for it. Why? What does this mean?
2) Transcript ENST00000255784 is annotated twice, as two different circRNAs. How would I know the weight of this transcript? How sure can I be that it is not a third transcript? Should I use BED files, and if so, at which step?
3) After the annotation step, how do you actually cross-reference the circRNAs that were aligned with those that were annotated? How do you extract the exact coordinates from the reads-per-gene file?

Thanks a lot for your help.