wdecoster / NanoPlot

Plotting scripts for long read sequencing data
http://nanoplot.bioinf.be
MIT License

Request for detailed explanations of numeric values in NanoPlot report #367

Closed KushanManahara closed 4 months ago

KushanManahara commented 4 months ago

I'm a researcher working with nanopore sequencing data, and I have a question regarding the various numeric values reported in the NanoPlot output.

The following is the NanoPlot report for my data:

General summary:         
Mean read length:                 515.5
Mean read quality:                  7.8
Median read length:               295.0
Median read quality:                8.7
Number of reads:               54,972.0
Read length N50:                  849.0
STDEV read length:                614.5
Total bases:               28,339,974.0
Number, percentage and megabases of reads above quality cutoffs
>Q5:    54126 (98.5%) 28.1Mb
>Q7:    47147 (85.8%) 24.4Mb
>Q10:   8017 (14.6%) 1.3Mb
>Q12:   1814 (3.3%) 0.2Mb
>Q15:   402 (0.7%) 0.0Mb
Top 5 highest mean basecall quality scores and their read lengths
1:  28.2 (26)
2:  27.0 (18)
3:  25.8 (5)
4:  25.8 (11)
5:  25.7 (15)
Top 5 longest reads and their mean basecall quality score
1:  18373 (5.9)
2:  8788 (7.7)
3:  6605 (8.3)
4:  6503 (8.9)
5:  6403 (9.1)

I would appreciate it if you could provide detailed explanations for each of these numeric values, including:

  1. The exact formula or equation used to calculate the value.
  2. The underlying data or quality scoring system used as input (e.g., Phred scale for quality scores).
  3. Any specific considerations, assumptions, or edge cases handled during the calculation.
  4. The significance and interpretation of each value in the context of nanopore sequencing data analysis.

Understanding the details behind these numeric values is crucial for accurately interpreting the NanoPlot report and making informed decisions in my research analysis.

I would greatly appreciate it if you could provide a comprehensive explanation for each of the numeric values reported by NanoPlot, either in this issue or by pointing me to relevant documentation.

Thank you in advance for your assistance.

wdecoster commented 4 months ago

You can find all of that in the code
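
For readers who want a starting point before reading the source, here is a minimal sketch of how statistics like the ones in the report are conventionally computed. It is not copied from NanoPlot. The function names (`phred_to_error_prob`, `mean_quality`, `n50`) are hypothetical, and the sketch assumes that quality scores are on the Phred scale (Q = -10 * log10(p)) and are averaged through their error probabilities rather than arithmetically. Please confirm both assumptions against the actual code. The last lines only re-derive two numbers from the report above as a sanity check.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality score to an error probability: p = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def mean_quality(quals):
    """Average Phred scores via their error probabilities (assumed convention).

    A plain arithmetic mean of Q values would overstate quality, because
    Phred scores are logarithmic.
    """
    mean_p = sum(phred_to_error_prob(q) for q in quals) / len(quals)
    return -10 * math.log10(mean_p)


def n50(lengths):
    """Length L such that reads of length >= L contain at least half of all bases."""
    if not lengths:
        raise ValueError("n50 requires at least one read length")
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length


# Sanity check against the report in this issue:
total_bases = 28_339_974
n_reads = 54_972
mean_len = round(total_bases / n_reads, 1)   # 515.5, matches "Mean read length"
pct_q5 = round(100 * 54_126 / n_reads, 1)    # 98.5, matches the ">Q5" percentage
```

Under this convention a few low-quality bases pull the mean down sharply: `mean_quality([10, 30])` is about 13, not 20. The percentages in the quality cutoff lines are simply the number of reads above the cutoff divided by the total number of reads.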

KushanManahara commented 4 months ago

Thank you very much @wdecoster !