Closed BerrocalRubio333 closed 3 months ago
Hi,
Thanks for your question. For every read, the average quality is calculated by converting phred-scale scores into error probabilities, taking the average, and converting it back to phred-scale. This is different from how other tools do it, but correct, in my opinion. For those per-read averages, you get the overall median or the mean in the NanoStat report. And for the quality cut-offs and the kde plot the per-read averages are used.
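As a rough illustration of the calculation described above, here is a minimal sketch. It is not taken from the NanoPlot source; the function names `mean_read_quality` and `summarize_reads` are hypothetical, and the input is assumed to be a list of integer Phred scores per read.

```python
import math
import statistics


def mean_read_quality(phred_scores):
    """Average the Phred scores of one read via error probabilities.

    Hypothetical helper, not the NanoPlot implementation.
    """
    if not phred_scores:
        raise ValueError("read has no quality scores")
    # Phred score Q corresponds to error probability 10^(-Q/10).
    error_probs = [10 ** (-q / 10) for q in phred_scores]
    mean_error = sum(error_probs) / len(error_probs)
    # Convert the mean error probability back to the Phred scale.
    return -10 * math.log10(mean_error)


def summarize_reads(reads):
    """Return (mean, median) of the per-read average qualities."""
    per_read = [mean_read_quality(r) for r in reads]
    return statistics.mean(per_read), statistics.median(per_read)


# A read with scores 10 and 30 averages to about Q13, not Q20,
# because the low-quality base dominates the error probability.
example = mean_read_quality([10, 30])
```

Averaging the raw scores of that example read would give 20, which overstates its accuracy; that is the reason for going through error probabilities.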
Hope that helps!
Wouter
Hi @wdecoster, is there a way to get the min-max range of reads in the NanoStat output? I only see these stats:
General summary:
Mean read length: 282.7
Mean read quality: 9.9
Median read length: 249.0
Median read quality: 10.6
Number of reads: 1,518,294.0
Read length N50: 254.0
STDEV read length: 137.7
Total bases: 429,214,286.0
Number, percentage and megabases of reads above quality cutoffs
Q5: 1518292 (100.0%) 429.2Mb
Q7: 1516705 (99.9%) 428.8Mb
Q10: 994193 (65.5%) 277.6Mb
Q12: 294118 (19.4%) 80.5Mb
Q15: 12671 (0.8%) 3.7Mb
Top 5 highest mean basecall quality scores and their read lengths
1: 38.0 (1)
2: 36.0 (1)
3: 34.0 (1)
4: 34.0 (1)
5: 33.0 (1)
Top 5 longest reads and their mean basecall quality score
1: 18248 (11.4)
2: 14380 (10.9)
3: 7936 (9.1)
4: 7587 (9.1)
5: 7557 (9.5)
This doesn't seem related to the initial question here. Opening a separate issue would be more appropriate if that is the case. I also don't understand what you are missing from the summary - do you mean you want to know the shortest and longest read in the experiment?
Hi Wouter and all,
I am trying to understand the outputs of NanoPlot, specifically the following:
- In the Stat Summary, when you show quality cut-offs, are these calculated based on the median or the mean quality of a read?
Best.
Miguel