bioinfo-biols / CIRIquant

circular RNA quantification tools
https://sourceforge.net/projects/ciri/files/CIRIquant
MIT License

Very low numbers for circular RNAs #34

Open FerallOut opened 2 years ago

FerallOut commented 2 years ago

Hey,

I am working on a dataset from an experiment that is specifically designed to induce circRNAs. But when I look in 'library_info.csv', the number of circular RNA reads is very small.

Sample,Total,Mapped,Circular,Group,Subject
S1,55730760,51498102,5530,T,1
S2,66246882,61594648,6730,T,2
S3,40042314,37079726,3940,T,3
Cl1,57359160,53514152,8206,T,1
Cl2,58229712,53897216,7804,T,2
Cl3,45110986,41606718,5644,T,3

If I look in the 'gene_count_matrix', I get numbers like:

head gene_count_matrix.csv
gene_id,S1,S2,S3,Cl1,Cl2,Cl3
ENSG00000132680|KHDC4,7883,9307,5896,2073,2154,1356
ENSG00000145041|DCAF1,3921,4567,2998,11977,12348,8927

I am not sure what I am misunderstanding in this regard. It seems that I am losing the circular RNAs at some stage, but I am not sure where.

Kevinzjy commented 2 years ago

Hi @FerallOut, what's your sequencing protocol? The percentage of BSJ reads is approximately 0.01% in your data, which is quite a reasonable number for many total RNA-seq libraries without RNase R treatment.
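The 0.01% figure can be checked roughly against the 'library_info.csv' values posted above. The following is a minimal Python sketch, not part of CIRIquant: it assumes the 'Circular' column holds the BSJ read count and uses the 'Mapped' column as the denominator (the thread does not say which denominator was used). The function name `circular_percent` is made up for illustration.

```python
import csv
import io

# Values copied from the library_info.csv posted earlier in this thread.
LIBRARY_INFO = """\
Sample,Total,Mapped,Circular,Group,Subject
S1,55730760,51498102,5530,T,1
S2,66246882,61594648,6730,T,2
S3,40042314,37079726,3940,T,3
Cl1,57359160,53514152,8206,T,1
Cl2,58229712,53897216,7804,T,2
Cl3,45110986,41606718,5644,T,3
"""


def circular_percent(csv_text):
    """Return {sample: circular reads as a percentage of mapped reads}.

    Assumption: 'Circular' is the BSJ read count and 'Mapped' is the
    appropriate denominator; neither is confirmed in the thread.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    return {
        row["Sample"]: 100.0 * int(row["Circular"]) / int(row["Mapped"])
        for row in rows
    }


percentages = circular_percent(LIBRARY_INFO)
for sample, pct in percentages.items():
    print(f"{sample}: {pct:.4f}% of mapped reads are circular")
```

Under these assumptions every sample falls between about 0.01% and 0.02%, which is consistent with the estimate above.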

FerallOut commented 2 years ago

I didn't use RNase R treatment, so I understand that the percentage of circular RNAs should be low.

But then maybe I am misunderstanding how I should interpret the output of CIRIquant. Does 'gene_count_matrix' correspond to the number of reads that map to each gene in general, or to the number of reads that map to circular RNAs within those genes?

Kevinzjy commented 2 years ago

The gene_count_matrix contains the number of reads mapped to each gene, as computed by StringTie.