gt1 / biobambam2

Tools for early stage alignment file processing

Any Option for bamsort in Parallel? #59

Open spikeliu opened 6 years ago

spikeliu commented 6 years ago

For the "bamsort" command, I cannot find an option for using multiple threads in the sorting step (I don't think inputthreads and outputthreads serve this purpose). Looking at the source code, I can see an option key "sortthreads" that doesn't show up in the --help output. Can I use it, or is there a reason it is hidden? Maybe it is not ready to be used because it can cause errors?

mmokrejs commented 6 years ago

It works quite nicely. See https://github.com/ablab/spades/issues/67#issuecomment-359267438 for three shell lines and example performance on LustreFS. It is best to have a ramdisk on the machine, write the sorted BAM file to it, and then move the final BAM with its index to the real storage filesystem.

gt1 commented 6 years ago

The sortthreads option should work. I will add it to the documentation in the next version.

mmokrejs commented 6 years ago

So how does sortthreads relate to inputthreads and outputthreads if I use all three on the command line? In what ratio should I distribute the available cores among these three?

gt1 commented 6 years ago

@mmokrejs: the issue with bamsort is that it does not use pooled threading throughout the program. The input, output and sort threads may all run at the same time. You can use a tool like cpuset to limit the number of real cores used by the program and set all three options to that number of threads. If you want a program that uses exactly a given number of threads for processing at any time, please check bamsormadup; it was designed for this.
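
A minimal sketch of the approach described above, under stated assumptions: it uses taskset from util-linux as a stand-in for cpuset (a swapped-in tool, not the one named in the thread), and in.bam / sorted.bam are placeholder file names. It only prints the command unless RUN=1 is set and both tools are installed.

```shell
#!/bin/sh
# Sketch: confine bamsort to a fixed set of cores and give each of the
# three thread options that same number of threads.
# Assumptions (not from the thread): taskset replaces cpuset;
# in.bam and sorted.bam are placeholder file names.

cores=4                      # number of real cores to allow (hypothetical value)
last=$((cores - 1))
cpu_list="0-$last"           # CPU list passed to taskset, here cores 0 through 3

cmd="taskset -c $cpu_list bamsort inputthreads=$cores outputthreads=$cores sortthreads=$cores"

# Show what would be executed; bamsort reads stdin and writes stdout here.
echo "$cmd < in.bam > sorted.bam"

# Execute only on explicit request and only if both tools are available.
if [ "${RUN:-0}" = 1 ] && command -v taskset >/dev/null && command -v bamsort >/dev/null; then
    $cmd < in.bam > sorted.bam
fi
```

Because the process is pinned to the listed cores, the three groups of threads compete for those cores rather than spreading across the whole machine, even when they are all active at once.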

mmokrejs commented 6 years ago

@gt1 Are you saying that if I run bamsort sortthreads=$phys_cores inputthreads=$phys_cores outputthreads=$phys_cores, I will end up with a load of 300?

Should I divide the number of available cores by 3 to ensure the load stays at or below 100?

But isn't one decompression thread and one compression thread enough? So bamsort sortthreads=$((phys_cores-2)) inputthreads=1 outputthreads=1 ?

gt1 commented 6 years ago

@mmokrejs This could happen, although it is rather unlikely. In my experience, assuming you do not set level=0 for uncompressed output, the output compression is rather compute heavy, so you might want to spend more threads there.
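
One way to act on this without pinning cores is to split a fixed thread budget unevenly, favoring output compression. The sketch below is illustrative only: the budget of 8 and the split (half to output, one to input, the rest to sorting) are assumptions, not ratios recommended in this thread, and in.bam / sorted.bam are placeholder file names. It prints the resulting command rather than running it.

```shell
#!/bin/sh
# Sketch: split a thread budget across bamsort's three thread options.
# The split below is an illustrative assumption, not a measured optimum.

budget=8                                   # total threads to hand out (hypothetical)

outputthreads=$((budget / 2))              # output compression is reported as compute heavy
[ "$outputthreads" -ge 1 ] || outputthreads=1

inputthreads=1                             # assume one decompression thread suffices

sortthreads=$((budget - outputthreads - inputthreads))
[ "$sortthreads" -ge 1 ] || sortthreads=1  # budgets below 3 still get one thread per option

echo "bamsort inputthreads=$inputthreads outputthreads=$outputthreads sortthreads=$sortthreads < in.bam > sorted.bam"
```

Since the three groups of threads are unlikely to be fully busy at the same moment, a split like this tends to be conservative; timing a representative file is the only reliable way to pick the numbers.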