Ecogenomics / BamM

Metagenomics-focused BAM file manipulation
http://ecogenomics.github.io/BamM/
GNU Lesser General Public License v3.0

Sort cpus #26

Closed wwood closed 9 years ago

wwood commented 9 years ago

It turns out that sorting 250 GB BAM files can be very slow, and you really do need lots of CPUs. Previously, samtools sort was run without multithreading.

The most recent patch assumes that bwa and samtools sort are rarely busy at the same time, so it is safe to give the same CPUs to both phases during bamm make (at worst this costs a little time in kernel scheduling).
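As a rough illustration of the CPU reuse described above, here is a minimal sketch that gives the same thread count to both stages of an alignment pipeline. It is not BamM's actual code: the function name `build_make_pipeline` and its parameters are hypothetical, and it assumes a samtools 1.x release where `samtools sort` reads SAM from stdin and accepts `-o`.

```python
import shlex


def build_make_pipeline(reference, reads, out_bam, threads):
    """Build (but do not run) a 'bwa mem | samtools sort' shell pipeline.

    Hypothetical helper for illustration only. The same `threads` value is
    passed to both tools, on the assumption that they are rarely busy at the
    same time.
    """
    if threads < 1:
        raise ValueError("threads must be >= 1")

    # bwa mem: -t sets the number of alignment threads.
    align = ["bwa", "mem", "-t", str(threads), reference, reads]

    # samtools sort: -@ sets the number of sorting/compression threads,
    # -o names the output file, and the trailing '-' reads from stdin.
    sort = ["samtools", "sort", "-@", str(threads), "-o", out_bam, "-"]

    return " | ".join(
        " ".join(shlex.quote(arg) for arg in cmd) for cmd in (align, sort)
    )


print(build_make_pipeline("ref.fa", "reads.fq", "out.bam", 8))
```

Passing the full thread count to both stages oversubscribes the CPUs only when both are active at once, which is the scheduling cost mentioned above.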

Also merged here is the next branch, which brings a new variance coverage mode and some better documentation, among other things.

wwood commented 9 years ago

Just changed bamm make to not use a tempdir, because using one may have resulted in /tmp filling up in some cases. Better for it to just work than to run faster but fail in some circumstances.
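One way to keep sort temporaries out of /tmp, sketched here as an assumption rather than as what the patch actually does, is to point `samtools sort -T` at a prefix next to the output file. The helper name `sort_command` and the `.sorttmp` suffix are hypothetical, and the `-T`/`-o` flags assume a samtools 1.x release.

```python
import os


def sort_command(out_bam, threads):
    """Return a samtools sort argv whose temp files sit beside the output.

    Hypothetical helper for illustration only; it builds the command and
    does not run it.
    """
    out_dir = os.path.dirname(os.path.abspath(out_bam))
    stem = os.path.splitext(os.path.basename(out_bam))[0]

    # Temporary chunks are written as <prefix>.NNNN.bam in the output
    # directory rather than in a system temp directory such as /tmp.
    tmp_prefix = os.path.join(out_dir, stem + ".sorttmp")

    return ["samtools", "sort", "-@", str(threads),
            "-T", tmp_prefix, "-o", out_bam, "-"]


print(sort_command("sample.bam", 4))
```

The output directory has to hold the final BAM anyway, so it is less likely than /tmp to be the first place to run out of space.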

I'll merge in the next day or so unless there is comment.

minillinim commented 9 years ago

No comment! (Sent by email in reply to Ben J Woodcroft's comment of 03/07/2015, 9:33 am.)