s-andrews / SeqMonk

SeqMonk NGS visualisation and analysis tool
GNU General Public License v2.0

Basic histograms for multiple datasets #1

Open: ewels opened 7 years ago

ewels commented 7 years ago

The three histograms are super useful (Probe Value Histogram, Read Length Histogram, Probe Length Histogram), but I often find myself filling my screen with lots of windows trying to compare different datasets.

Could there be an option to plot these histograms for multiple datasets in a single graph?

First issue!

s-andrews commented 7 years ago

We could add this as an option, but the problem is that it doesn't scale well to projects with many datasets. We hit a similar issue with the duplication plots, which we kludged around by allowing some multiplicity, but limiting it so that large projects don't produce stupid output. It would take a bit of work to make this happen, mostly to make the scales on the plot synchronise, but after that it's not too bad.
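To illustrate what synchronising the scales involves, here is a minimal sketch in Python. It is not SeqMonk code (SeqMonk itself is written in Java), and the function name `shared_histograms` and its parameters are hypothetical. It assumes every dataset is non-empty, bins all of them on one common set of edges, and reports fractions rather than raw counts so that datasets of different sizes share a y axis.

```python
from typing import Dict, List, Sequence


def shared_histograms(
    datasets: Dict[str, Sequence[float]], n_bins: int = 20
) -> Dict[str, List[float]]:
    """Bin every dataset on the same edges and return per-bin fractions.

    Hypothetical helper for illustration; assumes each dataset is non-empty.
    """
    # One global range, so every dataset is drawn against the same x axis.
    lo = min(min(values) for values in datasets.values())
    hi = max(max(values) for values in datasets.values())
    # Fall back to a width of 1.0 when all values are identical.
    width = (hi - lo) / n_bins or 1.0

    result: Dict[str, List[float]] = {}
    for name, values in datasets.items():
        counts = [0] * n_bins
        for v in values:
            # Clamp so that the global maximum lands in the last bin.
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
        # Fractions rather than counts keep the y axis comparable
        # between datasets of very different sizes.
        result[name] = [c / len(values) for c in counts]
    return result
```

Each returned list can then be drawn as a bar outline or line on the same axes. The scaling problem remains: with dozens of datasets the overlaid outlines become hard to tell apart.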

This same issue is why we created the cumulative distribution plot and the beanplots - both of which solve the same basic problem as the values histogram in a way which scales much better. Do either of these not do what you want?
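As a rough illustration of why a cumulative plot scales better, the sketch below assumes the cumulative distribution plot is an empirical CDF. It is not taken from SeqMonk, and the function name `ecdf` is hypothetical. Each dataset becomes a single monotonic line running from 0 to 1, so there are no bins to synchronise and the y axis is the same for every dataset.

```python
from bisect import bisect_right
from typing import List, Sequence


def ecdf(values: Sequence[float], grid: Sequence[float]) -> List[float]:
    """Fraction of values less than or equal to each grid point.

    Hypothetical helper for illustration; assumes `values` is non-empty.
    """
    ordered = sorted(values)
    n = len(ordered)
    # bisect_right counts how many sorted values are <= x.
    return [bisect_right(ordered, x) / n for x in grid]
```

Evaluating every dataset on the same `grid` gives lines that can be overlaid directly.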

ewels commented 7 years ago

Yup, I was imagining overlaid lines like the cumulative distribution plot. This and the beanplot work well, but the one I'm filling my screen with currently is the read length plot.

s-andrews commented 7 years ago

So what you really want is a kernel density plot for all of the cases where we currently offer a histogram?
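For reference, a kernel density estimate can be sketched in a few lines of Python. This is an illustrative assumption about what such a plot would compute, not SeqMonk's implementation. The function name `gaussian_kde`, the Gaussian kernel, and the default bandwidth (the normal reference rule of thumb, 1.06 · σ · n^(-1/5)) are all choices made for this example.

```python
import math
from statistics import stdev
from typing import List, Optional, Sequence


def gaussian_kde(
    values: Sequence[float],
    grid: Sequence[float],
    bandwidth: Optional[float] = None,
) -> List[float]:
    """Gaussian kernel density estimate evaluated at each grid point.

    Hypothetical helper for illustration; assumes `values` is non-empty.
    """
    n = len(values)
    if bandwidth is None:
        # Normal reference rule of thumb; needs at least two values.
        spread = stdev(values) if n > 1 else 0.0
        # Fall back to 1.0 when the data has no spread at all.
        bandwidth = 1.06 * spread * n ** (-1 / 5) or 1.0

    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    density = []
    for x in grid:
        # Sum one Gaussian bump per observation, centred on that observation.
        total = sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )
        density.append(norm * total)
    return density
```

Evaluating each dataset on the same grid, for example `gaussian_kde(read_lengths, range(0, 151))` for a hypothetical list `read_lengths`, yields one smooth line per dataset that can be overlaid on shared axes. The cost is O(n × grid points) per dataset, so for large read sets it would make sense to estimate from binned counts instead.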

ewels commented 7 years ago

I was thinking of bar outlines from the same histogram, but yes a density plot would be nicer.

This seems to be another one of those feature requests that I expect to be really simple and that somehow spirals rapidly towards something much more complicated-sounding!

s-andrews commented 7 years ago

Guess what. I'm just working on a very large dataset where it would be really useful to be able to see the distributions of read lengths across all samples.

Someone should write a visualisation to allow that to happen...