Closed leoisl closed 1 year ago
Going further with another optimisation (from https://github.com/leoisl/pandora/commit/72cd6a0749b3b232634e05d21c34c9cf4014d875 to https://github.com/leoisl/pandora/commit/1c53eb78a70be75dd35601701e3952d75549daa9), where we sort minimiser hits not by their location in the PRG string (which is quite a heavy object) but by their kmer node id, which corresponds to the order of the node in the minimizer DAG. Algorithm-wise, when we map minimizers from reads to PRGs, we need to sort the hits. This specific sort key (the location in the PRG string) only plays a role in one case: when we map a read minimizer to a graph that has that minimizer duplicated in two or more places. In this case, we used to sort these hits further by their location in the PRG string; they are now sorted by the order of the minimizer in the DAG instead. These two orders are closely related, as minimizers that occur earlier in the PRG string have lower ids in the minimizer DAG. My personal opinion is that this should not change the results much, and the following 4-way results confirm this. The RAM improvement is good: 2.4x less RAM than the previous optimisation (b19d26), allowing us to run roundhound with <10GB. RAM improvements will be detailed in a future post; I am still gathering benchmarks.
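To make the change of sort key concrete, here is a minimal Python sketch (pandora itself is C++). Everything in it is illustrative: the `Hit` class, its field names and the order of the key components are assumptions, not pandora's actual `MinimizerHit` comparator. It only shows that swapping the PRG location for the kmer node id as tie-breaker leaves the order unchanged whenever node ids increase with position in the PRG string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical stand-in for a minimizer hit; not pandora's real class."""
    read_id: int
    read_start: int    # position of the minimizer on the read
    prg_id: int
    prg_start: int     # stand-in for the (heavy) location in the PRG string
    kmer_node_id: int  # id of the node in the minimizer DAG (a single integer)


def key_by_prg_location(hit):
    # Old tie-breaker: location in the PRG string.
    return (hit.read_id, hit.prg_id, hit.read_start, hit.prg_start)


def key_by_node_id(hit):
    # New tie-breaker: kmer node id, so the location need not be stored per hit.
    return (hit.read_id, hit.prg_id, hit.read_start, hit.kmer_node_id)


hits = [
    Hit(read_id=0, read_start=10, prg_id=7, prg_start=120, kmer_node_id=5),
    Hit(read_id=0, read_start=10, prg_id=7, prg_start=40, kmer_node_id=2),  # same read minimizer, duplicated in PRG 7
    Hit(read_id=0, read_start=3, prg_id=7, prg_start=15, kmer_node_id=1),
    Hit(read_id=0, read_start=10, prg_id=2, prg_start=300, kmer_node_id=9),
]

by_location = sorted(hits, key=key_by_prg_location)
by_node_id = sorted(hits, key=key_by_node_id)
```

The tie-breaker only matters for the two hits that share read, PRG and read position. In this toy data the node ids follow the PRG positions, so both sorts return the same order.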
Detailed 4-way results follow, only for filtered data. Here we compare the latest release (0.10.0-alpha.0), the version described in the previous post (b19d26, with lazy loading and read data optimisation), and the version under study (1c53eb, which adds an index optimisation):
The most improved version, 1c53eb, actually slightly improves precision for the Illumina results, both with and without denovo.
The curves for both improved versions, b19d26 and 1c53eb, basically overlap, which indicates that the index optimisation done in 1c53eb does not introduce any bugs:
The previous post shows that the improvements we've made do not introduce bugs into pandora, so we can merge. The merge will consist of 5 PRs (the 1st one is large, the others are small increments):
I've also removed the RAM values from the previous posts, and I am gathering benchmarking data on how much all these improvements reduce RAM. I will update this issue as soon as I have all the benchmarking data.
History of RAM and runtime improvements for the new version of pandora that will be merged in the next PRs. These benchmarks were done running pandora compare with the RH plasmid DB (~1M PRGs) and the ESBL sample SRR16977031:
v0.10.0-alpha.0 (baseline, current release):
RAM usage: 178.1 GB
Runtime: 130 minutes
commit a76df4 (only lazy loading added - this is the version we've been using in RH, unreleased):
RAM usage: 124.5 GB (30% less RAM than baseline)
Runtime: 31.8 minutes (4 times faster than baseline)
commit b19d26 (lazy loading + read info optimisation, unreleased):
RAM usage: 22.1 GB (88% less RAM than baseline)
Runtime: 13 minutes (10 times faster than baseline)
commit 1c53eb (lazy loading + read info optimisation + index optimisation, unreleased):
RAM usage: 9.1 GB (95% less RAM than baseline)
Runtime: 8.35 minutes (15.5 times faster than baseline)
Thus, once all merges are done, we will have a version that requires 95% less RAM than the current release (~20x improvement in RAM usage) and runs 15.5 times faster than the current release.
LSF logs follow:
Pandora benchmarking:
1c53eb (lazy loading + reads optimisation + paths optimisation):
Resource usage summary:
CPU time : 4764.10 sec.
Max Memory : 9345 MB
Average Memory : 7826.94 MB
Total Requested Memory : 80000.00 MB
Delta Memory : 70655.00 MB
Max Swap : -
Max Processes : 4
Max Threads : 20
Run time : 501 sec.
Turnaround time : 511 sec.
b19d26 (lazy loading + reads optimisation):
Resource usage summary:
CPU time : 4771.61 sec.
Max Memory : 22644 MB
Average Memory : 19645.91 MB
Total Requested Memory : 80000.00 MB
Delta Memory : 57356.00 MB
Max Swap : -
Max Processes : 4
Max Threads : 20
Run time : 781 sec.
Turnaround time : 854 sec.
a76df4 (only lazy loading - version we've been using in RH):
Resource usage summary:
CPU time : 13056.05 sec.
Max Memory : 127450 MB
Average Memory : 97789.90 MB
Total Requested Memory : 150000.00 MB
Delta Memory : 22550.00 MB
Max Swap : -
Max Processes : 4
Max Threads : 20
Run time : 1909 sec.
Turnaround time : 1911 sec.
v0.10.0-alpha.0 (baseline):
Resource usage summary:
CPU time : 26317.66 sec.
Max Memory : 182410 MB
Average Memory : 84832.14 MB
Total Requested Memory : 1024000.00 MB
Delta Memory : 841590.00 MB
Max Swap : -
Max Processes : 4
Max Threads : 20
Run time : 7773 sec.
Turnaround time : 7782 sec.
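For anyone wanting to re-derive the headline figures from the LSF summaries above, here is a small Python sketch. The Max Memory and Run time values are copied from the logs. The `runs` dictionary and the `summarise` helper are illustrative names, and converting MB to GB by dividing by 1024 is an assumption, chosen because it reproduces the GB values quoted in the post.

```python
# (max memory in MB, run time in seconds), copied from the LSF summaries above
runs = {
    "v0.10.0-alpha.0": (182410, 7773),
    "a76df4": (127450, 1909),
    "b19d26": (22644, 781),
    "1c53eb": (9345, 501),
}

base_mem_mb, base_time_s = runs["v0.10.0-alpha.0"]


def summarise(max_mem_mb, run_time_s):
    """Express one run relative to the baseline release."""
    return {
        "ram_gb": max_mem_mb / 1024,  # assumption: "GB" in the post means MB / 1024
        "ram_saved_pct": 100 * (1 - max_mem_mb / base_mem_mb),
        "ram_ratio": base_mem_mb / max_mem_mb,
        "minutes": run_time_s / 60,
        "speedup": base_time_s / run_time_s,
    }


for name, (mem_mb, secs) in runs.items():
    s = summarise(mem_mb, secs)
    print(
        f"{name}: {s['ram_gb']:.1f} GB ({s['ram_saved_pct']:.0f}% less RAM), "
        f"{s['minutes']:.1f} min ({s['speedup']:.1f}x faster)"
    )
```

With these inputs, 1c53eb comes out at about 9.1 GB, 95% less RAM (a ratio of roughly 19.5x) and a 15.5x speedup over the baseline, consistent with the summary in the post.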
bloody hell @leoisl
for future readers, RH=roundhound.
FAR OUT 🔥
These results are unbelievable!! Amazing
Closed via #331, #337, #342 and #345
Description
This issue is just a log for the upcoming PR, which will bring 2 major changes to pandora:
And also minor changes:
Results
The major changes should not impact results, as they are just RAM improvements. The multimapping improvement should change the results slightly, hopefully for the better. To check whether any breaking bug was introduced, we ran this version against the most recent prerelease on the 4-way pipeline. In general, the new version is slightly better precision-wise without denovo, and the old version is slightly better precision-wise with denovo. The differences are small, however. RAM improvements are massive and will be detailed in a later post. They will enable pandora to be run with far fewer computational resources, and they will also speed up the next feature, which is running it on the cluster on hundreds of samples, so I think it is worth merging these improvements.
Details
Detailed 4-way results follow
Illumina filtered:
Illumina unfiltered:
Nanopore filtered:
Nanopore unfiltered: