prisca399 closed this issue 3 years ago
Hi @prisca399, normally we filter out circRNAs with only 1 supporting read before differential expression analysis. There is no particular cut-off for this step. If you want to focus on highly expressed circRNAs, I would suggest using a threshold of 2/5/10 BSJ reads in at least 1 sample, or tuning your own threshold according to the number of circRNAs left after filtering (e.g. top 500/1000).
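For readers who want to try the thresholds suggested above, here is a minimal sketch in plain Python. It is not part of CIRIquant: the function names `filter_by_bsj_reads` and `top_n_by_total`, the dictionary layout (circRNA ID mapped to a list of per-sample BSJ read counts), and the toy counts are all assumptions made for illustration.

```python
def filter_by_bsj_reads(counts, min_reads=2, min_samples=1):
    """Keep circRNAs with at least `min_reads` BSJ reads in at least `min_samples` samples."""
    return {
        circ_id: reads
        for circ_id, reads in counts.items()
        if sum(r >= min_reads for r in reads) >= min_samples
    }


def top_n_by_total(counts, n):
    """Alternative: keep the `n` circRNAs with the highest total BSJ reads across samples."""
    ranked = sorted(counts, key=lambda circ_id: sum(counts[circ_id]), reverse=True)
    return {circ_id: counts[circ_id] for circ_id in ranked[:n]}


# Toy, made-up BSJ counts for three samples (hypothetical IDs and values).
toy_counts = {
    "chr1:100|200": [0, 1, 0],
    "chr2:300|450": [5, 7, 2],
    "chr3:10|90": [1, 1, 1],
    "chrX:500|800": [0, 12, 0],
}

# Threshold of 2 BSJ reads in at least 1 sample.
kept = filter_by_bsj_reads(toy_counts, min_reads=2, min_samples=1)
```

Raising `min_reads` to 5 or 10 corresponds to the stricter thresholds mentioned in the reply, and `top_n_by_total(toy_counts, 500)` would be the "top 500" style of cut-off on a real table.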
Hi @Kevinzjy,
I am interested in filtering the main gtf output such that only circRNAs with a read count >5 will be used for differential expression. I want to ask whether it is also necessary to alter the metadata at the top of the gtf files, which lists the number of mapped reads, bsj reads, etc. If so, can you clarify which aspects should be changed and in what way?
It depends on which tool you use.
With CIRI_DE_replicate: no need to alter the metadata of the GTF output.
With CIRI_DE: I don't think it's a good idea to filter these circRNAs before differential analysis. I would recommend running CIRI_DE with all circRNAs generated from the last step, and filtering the DE results instead. (The calculation of DE_score and DS_score relies heavily on the number of mapped reads and BSJ reads, so changing these numbers would alter the differential expression results in unexpected ways.)
Thank you for the clarification. I have multiple biological replicates consisting of tumor and normal samples, like the ones you demonstrated in your paper. To confirm: you are suggesting that I apply the filter to the circRNA_bsj.csv file and not to the sample.gtf file? And I would filter out circRNAs that have fewer than five supporting reads in total across all samples? I have 24 samples total that I am comparing, 15 tumor and 9 control.
I was able to easily subset the bsj.csv (after running prep_CIRIquant) as you mentioned and as I interpreted above. I found no need to edit the metadata of the gtf file. Thanks!
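The subsetting described above could look roughly like the following sketch, which uses only the Python standard library. It assumes the BSJ matrix is a CSV with a header row, the circRNA ID in the first column, and one column of read counts per sample. That layout, the function name `subset_bsj_matrix`, and the demo rows are assumptions for illustration, so check them against your own circRNA_bsj.csv before relying on this.

```python
import csv
import io


def subset_bsj_matrix(in_handle, out_handle, min_total=5):
    """Copy rows whose summed BSJ reads across all samples are >= min_total.

    Assumes column 0 holds the circRNA ID and every other column is a sample.
    Returns the number of circRNAs kept.
    """
    reader = csv.reader(in_handle)
    writer = csv.writer(out_handle, lineterminator="\n")
    writer.writerow(next(reader))  # pass the header through unchanged
    kept = 0
    for row in reader:
        total = sum(float(value) for value in row[1:])
        if total >= min_total:
            writer.writerow(row)
            kept += 1
    return kept


# Made-up demo matrix with three samples (hypothetical IDs and counts).
demo = io.StringIO(
    "circ_id,tumor_1,tumor_2,normal_1\n"
    "chr1:100|200,1,0,2\n"
    "chr2:300|450,4,3,0\n"
)
out = io.StringIO()
n_kept = subset_bsj_matrix(demo, out, min_total=5)
```

With real files, the two handles would come from `open("circRNA_bsj.csv", newline="")` and `open("circRNA_bsj.filtered.csv", "w", newline="")`, where the output name is only a placeholder. The GTF files and their metadata are left untouched, in line with the advice given for CIRI_DE_replicate.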
Thanks for letting me know.
Hi @Kevinzjy,
Thanks for creating this awesome tool. I have been able to run the CIRIquant command successfully using circRNA prediction results from other tools. I am now interested in differential expression. Would you advise filtering the circRNAs first (i.e. between the quantification and differential expression steps) to get rid of those with low read counts as quantified by CIRIquant? If not, can you explain why? Thanks!
Prisca