bioinfo-biols / CIRIquant

circular RNA quantification tools
https://sourceforge.net/projects/ciri/files/CIRIquant
MIT License

Read count cut-off for circRNA differential expression #21

Closed by prisca399 3 years ago

prisca399 commented 3 years ago

Hi @Kevinzjy,

Thanks for creating this awesome tool. I have been able to run the CIRIquant command successfully using circRNA prediction results from other tools. I am now interested in differential expression. Would you advise filtering the circRNAs first (i.e. between the quantification and differential expression steps) to get rid of those with low read counts as quantified by CIRIquant? If not, can you explain why? Thanks!

Prisca

Kevinzjy commented 3 years ago

Hi @prisca399, we normally filter out circRNAs with only 1 supporting read before differential expression analysis. There's no particular cut-off for this step. If you want to focus on highly expressed circRNAs, I would suggest using a threshold of 2, 5, or 10 BSJ reads in at least 1 sample, or tuning your own threshold according to the number of circRNAs left after filtering (e.g. top 500/1000).
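The two filtering strategies mentioned above can be sketched in a few lines of Python. This is only an illustration, not part of CIRIquant: the function names `filter_by_sample_threshold` and `top_n_by_total` are hypothetical, the counts are made-up toy values, and the input is assumed to be a mapping from circRNA ID to a list of per-sample BSJ read counts.

```python
def filter_by_sample_threshold(counts, min_reads=2, min_samples=1):
    """Keep circRNAs with >= min_reads BSJ reads in at least min_samples samples.

    `counts` maps a circRNA ID to a list of per-sample BSJ read counts
    (an assumed layout chosen for this sketch).
    """
    kept = {}
    for circ_id, per_sample in counts.items():
        n_passing = sum(1 for c in per_sample if c >= min_reads)
        if n_passing >= min_samples:
            kept[circ_id] = per_sample
    return kept


def top_n_by_total(counts, n=500):
    """Keep the n circRNAs with the highest total BSJ reads across samples."""
    ranked = sorted(counts.items(), key=lambda item: sum(item[1]), reverse=True)
    return dict(ranked[:n])


# Toy data (invented for illustration): three samples per circRNA.
toy = {
    "circ_A": [0, 1, 0],
    "circ_B": [5, 0, 2],
    "circ_C": [1, 1, 1],
    "circ_D": [12, 9, 15],
}

highly_expressed = filter_by_sample_threshold(toy, min_reads=2, min_samples=1)
top_two = top_n_by_total(toy, n=2)
```

Raising `min_reads` (2, 5, 10) trades sensitivity for confidence; the top-N variant instead fixes how many circRNAs remain and lets the effective threshold follow from the data.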

prisca399 commented 3 years ago

Hi @Kevinzjy,

I am interested in filtering the main GTF output so that only circRNAs with a read count >5 are used for differential expression. Is it also necessary to alter the metadata at the top of the GTF files, which lists the number of mapped reads, BSJ reads, etc.? If so, can you clarify which fields should be changed and in what way?

Kevinzjy commented 3 years ago

It depends.

prisca399 commented 3 years ago

Thank you for the clarification. I have multiple biological replicates consisting of tumor and normal samples, like the ones you demonstrated in your paper. To confirm: you are suggesting that I perform the filter on the circRNA_bsj.csv file and not on the sample.gtf file? And I would filter out circRNAs that have fewer than five supporting reads in total across all samples? I am comparing 24 samples in total, 15 tumor and 9 control.

prisca399 commented 3 years ago

I was able to easily subset the bsj.csv (after running prep_CIRIquant) as you mentioned and as I interpreted above. I found no need to edit the metadata of the gtf file. Thanks!
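For anyone who wants to do the same, subsetting the BSJ matrix by total reads across samples might look like the sketch below. It uses only the Python standard library. The function name `filter_bsj_matrix` is hypothetical, the CSV contents are toy values, and the layout (a header row, circRNA ID in the first column, one numeric count column per sample) is an assumption that should be checked against your own circRNA_bsj.csv.

```python
import csv
import io


def filter_bsj_matrix(src, dst, min_total=5):
    """Copy rows whose summed BSJ reads across all samples are >= min_total.

    Assumes (check against your file): first row is a header, first column
    is the circRNA ID, remaining columns are numeric per-sample counts.
    Returns the number of circRNAs kept.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow(next(reader))  # copy the header unchanged
    kept = 0
    for row in reader:
        if not row:
            continue  # skip blank lines
        total = sum(float(value) for value in row[1:])
        if total >= min_total:
            writer.writerow(row)
            kept += 1
    return kept


# Toy matrix (invented values) standing in for circRNA_bsj.csv.
toy_csv = (
    "circ_id,tumor_1,tumor_2,normal_1\n"
    "chr1:100|200,0,1,0\n"
    "chr2:300|900,3,2,4\n"
    "chr3:50|75,2,2,1\n"
)

out = io.StringIO()
n_kept = filter_bsj_matrix(io.StringIO(toy_csv), out, min_total=5)
```

With real files, the two `StringIO` objects would be replaced by handles from `open(..., newline="")`. Because only the count matrix is subset, the per-sample GTF files and their header metadata stay untouched.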

Kevinzjy commented 3 years ago

Thanks for letting me know.