biod / sambamba

Tools for working with SAM/BAM data
http://thebird.nl/blog/D_Dragon.html
GNU General Public License v2.0

sambamba merge using too much memory #470

Closed by Poshi 3 years ago

Poshi commented 3 years ago

In a recent run of sambamba merge (0.7.0) I observed very high memory usage for an operation that should need almost none. According to SLURM accounting, the run took 2572 s and used 6.33 GiB of memory.

That would make sense for a sort, but this was just a merge of already-sorted inputs, so the theoretical minimum memory requirement is one record per input file. There were only 4 input files, and the output file is 182 GiB (compression level 2).
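
To make the "one record per input file" reasoning concrete, here is a minimal sketch of a k-way merge over already-sorted streams. It is not sambamba's implementation (sambamba is written in D, and a real BAM merge also needs I/O buffers and compression threads). The `Record` tuple and the `kway_merge` function are hypothetical names used only for illustration. The point is that the heap never holds more than one pending record per input.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

# Hypothetical stand-in for an alignment record: (reference id, position, read name).
# Tuples compare element by element, which mimics coordinate ordering.
Record = Tuple[int, int, str]


def kway_merge(inputs: List[Iterable[Record]]) -> Iterator[Record]:
    """Merge already-sorted streams, keeping at most one record per input in memory."""
    iterators = [iter(stream) for stream in inputs]

    # Seed the heap with the first record of each non-empty input.
    heap = []
    for idx, it in enumerate(iterators):
        first = next(it, None)
        if first is not None:
            heap.append((first, idx))
    heapq.heapify(heap)

    while heap:
        # Emit the smallest pending record, then refill from the same input.
        record, idx = heapq.heappop(heap)
        yield record
        following = next(iterators[idx], None)
        if following is not None:
            heapq.heappush(heap, (following, idx))


if __name__ == "__main__":
    # Four small sorted inputs, standing in for the four BAM files.
    a = [(0, 10, "r1"), (0, 50, "r4")]
    b = [(0, 20, "r2"), (1, 5, "r6")]
    c = [(0, 30, "r3")]
    d = [(0, 60, "r5"), (1, 7, "r7")]
    for rec in kway_merge([a, b, c, d]):
        print(rec)
```

With 4 inputs the heap holds at most 4 entries at any time, regardless of how large the inputs are. Any memory use beyond that would have to come from buffering, decompression and compression, or similar overhead rather than from the merge logic itself.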

How much memory should I allocate when running a merge? A function of the input size? A fixed quantity? Is the merge tool leaking memory? Or is there some reason why it uses this much memory?

Poshi commented 3 years ago

OK, I will look for a Google group, but this seems to me to be an issue/bug. I'm not asking for support, but for a resolution.