FelixKrueger / Bismark

A tool to map bisulfite converted sequence reads and determine cytosine methylation states
http://felixkrueger.github.io/Bismark/
GNU General Public License v3.0

Throttle total memory and CPU usage for large WGBS datasets #634

Closed abearab closed 8 months ago

abearab commented 8 months ago

Hi @FelixKrueger – I'm trying to limit memory and CPU usage so that my labmates can still use our shared server while my jobs run at a reasonable speed. I have two questions / concerns:

FelixKrueger commented 8 months ago

This should be described in the --help text, there is an example at the bottom:

Sets the number of parallel instances of Bismark to be run concurrently.

This forks the Bismark alignment step very early on so that each individual Spawn of Bismark processes only every n-th sequence (n being set by --parallel). Once all processes have completed, the individual BAM files, mapping reports, unmapped or ambiguous FastQ files are merged into single files in very much the same way as they would have been generated running Bismark conventionally with only a single instance.
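The round-robin split described above can be sketched as follows. This is an illustrative model of "each instance processes only every n-th sequence", not Bismark's actual internals; the function name is made up for the example.

```python
def reads_for_instance(reads, instance, n):
    """Hypothetical sketch: return the subset of reads that parallel
    instance `instance` (0-based) would handle out of `n` instances,
    i.e. reads instance, instance+n, instance+2n, ..."""
    return [read for idx, read in enumerate(reads) if idx % n == instance]

reads = [f"read_{i}" for i in range(10)]
n = 4  # as with --parallel 4
splits = [reads_for_instance(reads, i, n) for i in range(n)]

# Every read is handled by exactly one instance, so merging the per-instance
# outputs afterwards recovers the same result as a single conventional run.
assert sorted(r for split in splits for r in split) == sorted(reads)
```
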

If system resources are plentiful this is a viable option to speed up the alignment process (we observed a near linear speed increase for up to --parallel 8 tested). However, please note that a typical Bismark run will use several cores already (Bismark itself, 2 or 4 threads of Bowtie2/HISAT2, Samtools, gzip etc...) and ~10-16GB of memory depending on the choice of aligner and genome. WARNING: Bismark Parallel is resource hungry! Each value of --parallel specified will effectively lead to a linear increase in compute and memory requirements, so --parallel 4 for e.g. the GRCm38 mouse genome will probably use ~20 cores and eat ~40GB of RAM, but at the same time reduce the alignment time to ~25-30%. You have been warned.
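The linear scaling warned about above is easy to budget for with a back-of-the-envelope estimate. The per-instance constants below (~5 cores and ~10 GB for a mouse-sized genome) are assumptions derived from the rough figures quoted in the help text, not numbers Bismark reports:

```python
# Assumed per-instance footprint (Bismark + aligner threads + samtools + gzip);
# actual usage depends on the aligner and genome, per the help text above.
CORES_PER_INSTANCE = 5
GB_RAM_PER_INSTANCE = 10

def estimate(parallel):
    """Estimate total cores and GB of RAM for a given --parallel value,
    assuming the linear scaling described in the Bismark help text."""
    return parallel * CORES_PER_INSTANCE, parallel * GB_RAM_PER_INSTANCE

cores, ram = estimate(4)
print(f"--parallel 4 -> ~{cores} cores, ~{ram} GB RAM")  # ~20 cores, ~40 GB RAM
```

Checking such an estimate against the server's free cores and memory before launching is a simple way to avoid starving other users on a shared machine.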

Does this answer your questions?

abearab commented 8 months ago

WARNING: Bismark Parallel is resource hungry!

I have definitely run into this. Yes, --parallel 8 helped to avoid that, as suggested.

Also, I used another server to finish all my samples asap, and it seems to have crashed in the middle of the run, ending up with a BAM file reporting 50% mapping efficiency. I re-mapped the same sample on the main server I have access to (everything has been working fine there) and it ended up with 80% mapping efficiency. I thought it might be worth considering that a crashed instance should stop the job, rather than the run continuing to the very end and merging the partial outputs into an incorrect final file. I can imagine this might be hard to fix, but I just wanted to share my experience here.
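One way a wrapper pipeline could guard against the failure mode described above is to compare the per-instance mapping efficiencies before merging and refuse to merge if they disagree wildly. This is only a sketch: the report format is an assumption loosely modeled on Bismark's "Mapping efficiency:" report line, and the spread threshold is arbitrary; adapt both to your actual files.

```python
import re

def mapping_efficiency(report_text):
    """Extract mapping efficiency (percent, as float) from a report string.
    The line format is assumed, not guaranteed by Bismark."""
    m = re.search(r"Mapping efficiency:\s*([\d.]+)\s*%", report_text)
    if m is None:
        # No efficiency line at all - treat the instance as crashed.
        raise ValueError("no mapping efficiency found in report")
    return float(m.group(1))

def safe_to_merge(reports, max_spread=10.0):
    """Refuse to merge when any instance's efficiency strays far from the
    rest, which can indicate a crashed or truncated child run."""
    effs = [mapping_efficiency(r) for r in reports]
    return max(effs) - min(effs) <= max_spread

healthy = ["Mapping efficiency:\t80.2%", "Mapping efficiency:\t79.8%"]
crashed = ["Mapping efficiency:\t80.2%", "Mapping efficiency:\t50.1%"]
print(safe_to_merge(healthy))  # True
print(safe_to_merge(crashed))  # False
```

A check like this (or `samtools quickcheck` on each per-instance BAM) run before the merge step would surface the 50% vs 80% discrepancy instead of silently producing a merged file.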

Regardless of that, I'm almost done with the set of samples I'm processing, thanks for the feedback here :)

(closing this issue here)