Xinglab / rmats-turbo


JCEC and JC in "summary.txt" #374

Open luojinqin opened 8 months ago

luojinqin commented 8 months ago

Hello, your software does a great job, and I am very grateful that you developed such a good tool. I am currently using it for alternative splicing analysis, and I have a question about the rMATS results. I tried several sets of data and found that the difference between TotalEventsJC and TotalEventsJCEC in "summary.txt" is very small. In my understanding, JCEC should be much larger than JC, because the number of reads falling on exons should be much greater than the number of reads falling on junctions. Could you please explain the reason for this, or is my understanding incorrect? Thank you. Here are the data from the summary.txt files I obtained by running rMATS on two different sets of data. (two screenshots of summary.txt attached)

EricKutschera commented 8 months ago

TotalEventsJC and TotalEventsJCEC are both reporting the number of events in the final MATS output files. Those MATS output files are filtered to events with at least 1 supporting read for each sample group and 1 supporting read for each isoform unless --statoff was used: https://github.com/Xinglab/rmats-turbo/blob/v4.2.0/rmats.py#L336

TotalEventsJCEC can be higher because TotalEventsJC is based on the output file that only includes junction reads, while TotalEventsJCEC is based on the output file that also includes exon reads. Both totals count events, not reads, so the difference between them is the number of events that required an exon read to pass the filter. If the difference is small, then for most events there was at least 1 junction read to support each sample group and each isoform. A small difference seems reasonable since the filter can be passed with just 2 junction reads.
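To make the counting concrete, here is a minimal sketch of the filter as described above. It is not the code from rmats.py: the function name `passes_filter`, the tuple layout, and the read counts are all made up for illustration, and it assumes the filter is simply "at least 1 read per sample group and at least 1 read per isoform".

```python
# Hypothetical sketch of the event filter described in this thread.
# Not the actual rmats.py implementation; names and counts are invented.

def passes_filter(inc_1, skp_1, inc_2, skp_2):
    """True if the event has >= 1 read in each sample group and for each isoform.

    Each argument is a list of per-replicate read counts:
    inclusion/skipping isoform counts for sample group 1 and group 2.
    """
    group_1 = sum(inc_1) + sum(skp_1)
    group_2 = sum(inc_2) + sum(skp_2)
    inclusion = sum(inc_1) + sum(inc_2)
    skipping = sum(skp_1) + sum(skp_2)
    return group_1 > 0 and group_2 > 0 and inclusion > 0 and skipping > 0


# Each event has one set of counts from junction reads only (JC) and one
# that also includes exon reads (JCEC). Values are illustrative only.
events = {
    "A": {"jc": ([5], [2], [4], [3]), "jcec": ([20], [2], [18], [3])},
    # No inclusion junction reads: needs exon reads to pass.
    "B": {"jc": ([0], [3], [0], [2]), "jcec": ([7], [3], [6], [2])},
    # Passes with just 2 junction reads in total.
    "C": {"jc": ([1], [0], [0], [1]), "jcec": ([1], [0], [0], [1])},
    # No reads at all: filtered out of both files.
    "D": {"jc": ([0], [0], [0], [0]), "jcec": ([0], [0], [0], [0])},
}

total_jc = sum(passes_filter(*e["jc"]) for e in events.values())
total_jcec = sum(passes_filter(*e["jcec"]) for e in events.values())

print(total_jc, total_jcec)
```

In this toy data, event A has many more JCEC reads than JC reads, but it still counts as a single event in both totals. Only event B, which has no inclusion junction reads, contributes to the difference between the two totals.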