Open rhpvorderman opened 3 years ago
I encountered the same problem. The memory consumption is still too large for indexing large genomes. I hope it can be resolved.
@rhpvorderman @WANGchuang715 we are looking into this; hopefully we will have a solution soon.
Bug report: I tried bwa-mem2 on my HPC and the memory usage is huge; there must be some bug in this app.

bwa-mem2 version: 2.0pre2

index:
  command: bwa-mem2 index my.fa
  memory usage: ~450 GB
  binary: bwa-mem2.avx2

mem:
  command: bwa-mem2 mem -t 8 my.fa f.fq r.fq
  memory usage: ~300 GB and still not done loading the reference
  binary: bwa-mem2.avx2
You used an older release. Please try with the latest release v2.1 and check. The index size is reduced in the latest release. We are looking into reducing the memory requirement during indexing.
I am bitten by this issue too. I cannot index my 18 Gbp genome assembly on a 512 GB RAM node.
Big +1 for this feature :) bwa-mem2 is awesome -- would love to be able to use it for my use case, but not enough RAM at the moment.
Met this issue too, with version 2.2.1:
binary seq ticks = 361627464193
Allocation of 92.04 GB for suffix_array failed. Current Allocation = 103.54 GB
Hi!
First of all, thanks for the latest bwa-mem2 2.1 release. It works great. The reduced memory usage is fantastic. It allowed me to run benchmarks locally, as alignment on 8 threads for hg38 + alt + decoy sequences used only 19 GB. This also means that performance is less susceptible to Non-Uniform Memory Access (NUMA), which is a problem on multi-socket servers. Less memory means better performance! The sequence used can be found here and was 3.1 GB.
The indexing step, however, still takes about 80 GB, which means it was not possible to run locally. Since I have access to a compute cluster, this was not a problem for me: I could build the index there and transfer it to my local (32 GB RAM) machine afterwards. However, this is not possible for people who do not have access to a compute cluster.
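For anyone in the same situation, the build-remotely-then-copy workaround can be sketched roughly as below. The hostname and paths are placeholders, not from this thread; the file suffixes are what bwa-mem2 v2.x typically writes next to the FASTA, all of which need to travel together with the FASTA itself (e.g. via scp or rsync):

```shell
# On the high-memory cluster node:
#   bwa-mem2 index my.fa
#
# bwa-mem2 v2.x writes its index as several files alongside the FASTA.
# List the files that need to be copied to the local machine:
suffixes="0123 amb ann bwt.2bit.64 pac"
for s in $suffixes; do
  echo "my.fa.$s"   # transfer each of these, plus my.fa itself
done
# Then align locally against the prebuilt index:
#   bwa-mem2 mem -t 8 my.fa f.fq r.fq > aln.sam
```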
Would there be some low-hanging fruit to reduce the 28 × <size_of_reference> requirement? The indexing runtime was very good, just 50 minutes. But since this is a step that is only run once, I think a lot of people would be happy with a doubling of the runtime if the memory could be halved. It would make bwa-mem2 more accessible for institutions that do their compute workloads on workstation-class PCs instead of a compute cluster.
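For planning purposes, that 28 × rule of thumb can be checked with a quick calculation. The multiplier is the figure quoted in this thread (not an official constant), and 3.1 GB is the hg38 + alt + decoy FASTA size mentioned above, which lands close to the ~80 GB observed during indexing:

```shell
# Estimate bwa-mem2 indexing RAM from reference FASTA size,
# using the ~28x multiplier quoted in this issue thread.
ref_gb=3.1   # size of the reference FASTA in GB
awk -v r="$ref_gb" 'BEGIN { printf "estimated indexing RAM: ~%.0f GB\n", r * 28 }'
```

The same arithmetic explains the failure on the 18 Gbp assembly above: 18 GB × 28 ≈ 504 GB, right at the limit of a 512 GB node once the OS and other processes are accounted for.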