gt1 / biobambam2

Tools for early stage alignment file processing

Controlling number of temporary files for bamsormadup #67

Open chapmanb opened 6 years ago

chapmanb commented 6 years ago

German, @ameynert reported an issue using bamsormadup on a deep WGS sample within bcbio. We're running bamcat and bamsormadup on a split set of alignment BAMs to generate a final BAM:

bamcat level=0  `cat WW00263a-sort.list` | bamsormadup threads=7  > /WW00263a-sort.bam

and end up creating 5412 temporary files for merging, which exceeds the hard open file handle limit on the machine. This is the full traceback for the run:

https://gist.github.com/ameynert/e3192cc07ef3010b9dea030a3d0c9461
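For anyone comparing against their own machine, here is a minimal sketch of checking whether the open file limits can accommodate that many temporary files. The `fits_limit` helper and the reserve of 16 descriptors are made up for illustration and are not part of biobambam2. It also assumes the merge needs roughly one open handle per temporary file, which the failure suggests but which is not confirmed here.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of biobambam2): succeeds when `needed` open
# files fit under `limit`, keeping `reserve` descriptors spare for
# stdin/stdout/stderr and the input/output streams (16 is an assumed margin).
fits_limit() {
    local needed=$1
    local limit=$2
    local reserve=${3:-16}
    # `ulimit -n` prints "unlimited" when no cap is set.
    if [ "$limit" = "unlimited" ]; then
        return 0
    fi
    [ $((needed + reserve)) -le "$limit" ]
}

needed=5412               # temporary file count reported for this run
soft=$(ulimit -S -n)      # current soft limit on open files
hard=$(ulimit -H -n)      # ceiling the soft limit can be raised to without root

echo "needed=$needed soft=$soft hard=$hard"
if fits_limit "$needed" "$soft"; then
    echo "soft limit is already sufficient"
elif fits_limit "$needed" "$hard"; then
    echo "raising the soft limit would be enough: ulimit -S -n $hard"
else
    echo "hard limit too low: fewer temporary files or an admin change is needed"
fi
```

In our case the last branch applies, since the hard limit itself is below the number of temporary files.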

Is it possible to control temporary file creation (writing fewer, bigger files) to help work around this, or do you have any other tips or ideas to try?

Thanks as always for your great work on biobambam.