bioinfo-biols / CIRIquant

circular RNA quantification tools
https://sourceforge.net/projects/ciri/files/CIRIquant
MIT License

Large DS Scores for Unexpressed Genes #11

DarioS closed this issue 4 years ago

DarioS commented 4 years ago

When I sort the CIRI_DE results from highest to lowest by the DS_score column in R, I see:

> head(C10)
                   circRNA_ID Case_BSJ Case_FSJ Case_Ratio Ctrl_BSJ Ctrl_FSJ Ctrl_Ratio DE_score DS_score
609    chr4:70000863|70053127        0        0          0        2   446681          0        0 13.46153
604    chr4:70036267|70050495        0        0          0        4   717701          0        0 13.12248
14991 chr12:11267411|11308807        0        0          0        2   172556          0        0 12.08316
2091  chr12:11353223|11353684        0        0          0        1    72125          0        0 11.83855
4419  chr12:11267348|11267632        0        0          0        1    66157          0        0 11.65618
6398  chr12:11308221|11308882        0        0          0        1    59624          0        0 11.55784

These circular RNAs are derived from genes that are not expressed in the disease (case) samples and have only 1 or 2 back-splice reads in the healthy (control) condition. Could this be a bug? It doesn't look like a real splicing change.

Similarly, sorting in increasing order of DS_score gives:

> head(C10)
                   circRNA_ID Case_BSJ Case_FSJ Case_Ratio Ctrl_BSJ Ctrl_FSJ Ctrl_Ratio DE_score   DS_score
4431  chr11:65500651|65503964        1    77803          0        0        0          0        0 -11.919432
3470  chr11:65500476|65500650        1    67608          0        0        0          0        0 -11.749692
10008 chr11:65500057|65500862        1    66250          0        0        0          0        0 -11.690436
2466  chr11:65500228|65506018        1    31021          0        0        0          0        0 -10.567198
8923  chr12:52520035|52520332        1    22792          0        0        0          0        0 -10.149253
5579   chr7:23254169|23270175        1    20406          0        0        0          0        0  -9.999413

If the counts in case are larger than in control, shouldn't the score be positive? Here it is negative.

Kevinzjy commented 4 years ago

Yes. When a circRNA is expressed in only one sample, a DS_score cannot be calculated correctly. If you want to see the change in junction ratio, you can filter out those circRNAs first.
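As a rough illustration of that filtering step, here is a minimal R sketch. It assumes that "expressed" means at least one junction read (BSJ or FSJ) in each condition, which is an assumption and not a rule stated by CIRIquant. The helper name `expressed_in_both` is hypothetical, the first two rows are copied from the tables in this issue, and the third row is made up purely so that something survives the filter.

```r
# Two rows copied from the tables in this issue, plus one made-up row
# (hypothetical, DS_score left as NA) with reads in both conditions.
de <- data.frame(
  circRNA_ID = c("chr4:70000863|70053127",
                 "chr11:65500651|65503964",
                 "hypothetical:1|2"),
  Case_BSJ = c(0, 1, 5),
  Case_FSJ = c(0, 77803, 1000),
  Ctrl_BSJ = c(2, 0, 3),
  Ctrl_FSJ = c(446681, 0, 1200),
  DS_score = c(13.46153, -11.919432, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep a circRNA only if junction reads
# (back-splice + forward-splice) were observed in both conditions.
# The "> 0" cutoff is an assumption; a stricter threshold may suit real data.
expressed_in_both <- function(df) {
  case_reads <- df$Case_BSJ + df$Case_FSJ
  ctrl_reads <- df$Ctrl_BSJ + df$Ctrl_FSJ
  df[case_reads > 0 & ctrl_reads > 0, , drop = FALSE]
}

filtered <- expressed_in_both(de)
print(filtered$circRNA_ID)
```

Both rows taken from the issue are dropped because one condition has no reads at all, so only the made-up row remains. Sorting by DS_score after this step avoids ranking circRNAs whose score is not meaningful.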