CodingKaiser closed 3 years ago
The number of bins is chosen by RTA. On some very early platforms you do get both full resolution and binned data. On later platforms we moved to binned data (~7 bins). On the most recent platforms, e.g. NovaSeq, you only get 3 bins.
We made this change because 1) it saves disk space on the instrument and 2) variant callers don't significantly benefit from additional bins.
That last picture with the green bar overlapping the dark green bar is not showing overlapping bins. The skinny green bar indicates where the Q30 threshold is (more useful if you have more than 3 bins).
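To illustrate why a binned run shows so few bars, here is a minimal sketch in plain Python (it does not use the InterOp API). The bin edges and counts are made up for the example; the real edges are chosen by RTA and differ between platforms. It collapses a full-resolution histogram into 3 bins and computes the fraction at or above Q30, which is what the skinny bar marks.

```python
from bisect import bisect_right

# Hypothetical bin edges, for illustration only; the real edges are set by
# RTA and are not taken from any specific platform.
EXAMPLE_BIN_LOWER_EDGES = [0, 15, 30]


def bin_q_scores(counts_by_q, lower_edges):
    """Collapse a per-Q-score histogram into coarser bins.

    counts_by_q maps each Q-score to its count; lower_edges is a sorted
    list holding the lowest Q-score that belongs to each bin.
    """
    binned = [0] * len(lower_edges)
    for q, count in counts_by_q.items():
        # Index of the last bin whose lower edge is <= q.
        index = bisect_right(lower_edges, q) - 1
        if index < 0:
            raise ValueError(f"Q-score {q} is below the first bin edge")
        binned[index] += count
    return binned


def fraction_at_or_above(counts_by_q, threshold=30):
    """Return the fraction of counts with a Q-score >= threshold."""
    total = sum(counts_by_q.values())
    if total == 0:
        return 0.0
    passing = sum(c for q, c in counts_by_q.items() if q >= threshold)
    return passing / total


# Made-up full-resolution histogram: Q-score -> count.
full_resolution = {2: 5, 12: 10, 20: 15, 30: 30, 37: 40}

print(bin_q_scores(full_resolution, EXAMPLE_BIN_LOWER_EDGES))
print(fraction_at_or_above(full_resolution))
```

Once the instrument has written only binned values, the finer distribution is no longer in the run folder, so the plot cannot show more bars than the number of bins that were recorded.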
This may be buried deep in the documentation somewhere, but how would one go about increasing or decreasing the number of bins displayed in the q-score histogram?
I am going through the Python notebook provided to generate the q-score histogram, and the bar chart it spits out has a nice fine-grain view of the distribution of q-scores.
However, running this on my own sequencing run results in a much coarser view of the distribution, yielding only 3 bins in total, with one bar directly overlapping one of the bins, which is not ideal.
I presume setting some variable in the object returned by
run_metrics.q_metric_set()
should do the trick, but so far I have been unable to find the corresponding setting. Any help would be greatly appreciated!