umccr / RNAsum

Pipeline for generating RNAseq-based cancer patient reports
https://umccr.github.io/RNAsum/

Inconsistency in "library size" info #165

Open JMarzec opened 2 months ago

JMarzec commented 2 months ago

There seems to be an inconsistency in the patient sample "library size" between the value reported in the report header (at the top, next to the sample name) and the value plotted in the "Library size" violin plot (marked by the blue horizontal line, within the "Input data summary" section). In the attached example (report screenshot) the reported library size is 118.8 M reads, while on the plot it is marked at a value of 30 on the y-axis.

SBJ03172 RNAseq_report
pdiakumis commented 2 months ago

Thanks Jacek! I think those two are calculated/pulled out differently. Let's look at how this was in the old master:

And now:

skanwal commented 2 months ago

The one in the title reflects the total number of reads. This is taken from a Dragen metrics file. The barplot, on the other hand, uses the sum of reads / 1e6 for the calculations.

I think that's fine. The latter is based on RNAseq analysis recommendations, I believe. The value in the header was included so the curation team has an estimate of the total read count.
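
To illustrate why the two numbers can legitimately differ, here is a minimal Python sketch (not RNAsum code). It assumes the plotted value is the sum of per-gene counts divided by 1e6, while the header value is the total read count from a metrics file. The gene names, counts, and function names are all made up for illustration; only the 118.8 M and 30 figures echo the report described above.

```python
# Hypothetical toy data, chosen only so the results echo the numbers in this issue.
# None of these values come from RNAsum or from the actual sample.
gene_counts = {
    "GENE_A": 12_000_000,
    "GENE_B": 10_500_000,
    "GENE_C": 7_500_000,
}

# Assumed header-style value: total reads as reported by a metrics file.
total_reads = 118_800_000


def counted_library_size_millions(counts):
    """Sum of per-gene read counts, scaled to millions (plot-style value)."""
    return sum(counts.values()) / 1e6


def total_reads_millions(n_reads):
    """Total number of reads, scaled to millions (header-style value)."""
    return n_reads / 1e6


plot_value = counted_library_size_millions(gene_counts)
header_value = total_reads_millions(total_reads)

print(f"Header (total reads):      {header_value:.1f} M")
print(f"Plot (sum of gene counts): {plot_value:.1f} M")
```

Under these assumptions the two values measure different things (all reads vs. reads summed over the count matrix), so a gap between them is expected rather than a bug.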