wdecoster / NanoPlot

Plotting scripts for long read sequencing data
http://nanoplot.bioinf.be
MIT License

Help with plot descriptions #261

Closed: utritala closed this issue 3 years ago

utritala commented 3 years ago

Hi, why is the dynamic read length histogram plotted for a downsampled number of reads? What are the read lengths normalised by in the "Normalised histogram of read lengths" plot? I am using NanoPlot v1.34.1. Thanks.

wdecoster commented 3 years ago

Downsampling is done to make sure plots are generated quickly, load quickly in your browser, and don't take up too much space on your hard drive or as an attachment. We believe that after downsampling to 10,000 reads the distribution is still representative of the full dataset.
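To illustrate why a random subset of 10,000 reads can still represent the full dataset, here is a minimal sketch using only the Python standard library. It is not NanoPlot's actual code: the `downsample` helper, the seed, and the simulated log-normal read lengths are all assumptions made for the example.

```python
import random
import statistics


def downsample(read_lengths, n=10_000, seed=0):
    """Return a uniform random subsample of at most n read lengths.

    Hypothetical helper for illustration, not part of NanoPlot.
    """
    if len(read_lengths) <= n:
        return list(read_lengths)
    return random.Random(seed).sample(read_lengths, n)


# Simulated read lengths; the log-normal shape is only a stand-in for real data.
rng = random.Random(42)
all_lengths = [int(rng.lognormvariate(8.5, 0.6)) for _ in range(200_000)]

subset = downsample(all_lengths)

# The subset is 20 times smaller, but its median stays close to the full median.
print(len(all_lengths), len(subset))
print(statistics.median(all_lengths), statistics.median(subset))
```

Because the subset is drawn uniformly at random, summary statistics such as the median shift only slightly, while the plot has far fewer points to render.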

Where did you see the Normalised histogram of read lengths? In NanoComp? Or do you mean the 'weighted' histograms?

utritala commented 3 years ago

Thank you so much for your prompt response. Very helpful!

Sorry, yes, I meant Normalised histogram of read lengths in NanoComp.

Thanks again.

wdecoster commented 3 years ago

The normalised histogram in NanoComp corrects for differences in the number of reads between the datasets. If one dataset has ten times as many reads as the other, it becomes harder to see differences in library length unless you normalise.
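As a rough sketch of the idea, assume that normalising means dividing each bin count by the total number of reads in that dataset; the exact scaling NanoComp applies is not stated in this thread. The `histogram` helper, the bin width, and the toy datasets below are hypothetical.

```python
from collections import Counter


def histogram(read_lengths, bin_width=1000, normalise=False):
    """Count reads per length bin, keyed by the bin's start position.

    With normalise=True, each count is divided by the number of reads,
    so the bin values of one dataset sum to 1.
    """
    counts = Counter((length // bin_width) * bin_width for length in read_lengths)
    if not normalise:
        return dict(sorted(counts.items()))
    total = len(read_lengths)
    return {start: count / total for start, count in sorted(counts.items())}


small = [500, 1500, 1500, 2500]  # toy dataset with 4 reads
large = small * 10               # same length distribution, 10x more reads

# Raw counts differ by a factor of ten, which hides the identical shape.
print(histogram(small))
print(histogram(large))

# After normalisation the two histograms are directly comparable.
print(histogram(small, normalise=True))
print(histogram(large, normalise=True))
```

With raw counts, the larger dataset dominates the plot; after normalisation, both datasets show the same fractions per bin, so any remaining difference reflects read length rather than sequencing depth.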

utritala commented 3 years ago

Gotcha! Thanks again. :)

wdecoster commented 3 years ago

Glad to help, let me know if anything else is unclear. I should really update the documentation :-(