smarco / WFA2-lib

WFA-lib: Wavefront alignment algorithm library v2

Memory usage for ultra long read alignment #65

Closed Daniel-Liu-c0deb0t closed 1 year ago

Daniel-Liu-c0deb0t commented 1 year ago

@RagnarGrootKoerkamp and I have observed the memory usage of BiWFA increasing significantly, to more than 10 GB, when aligning ONT reads longer than 500 kbp with linear and affine costs (e.g., sub = 1, open = 1, extend = 1). We are currently using Rust bindings here (using cmake to build with the flags recommended by the WFA2-lib readme), and we have observed this memory usage issue on both the latest main branch and a commit from around 1 year ago. We have reproduced this on both macOS and Linux.

The Rust wrapper is quite thin, so we think it's unlikely that it's the cause of the issue. Any idea what's wrong?

RagnarGrootKoerkamp commented 1 year ago

Note: These are the same reads as the ONT UL dataset in figure 2 of the BiWFA paper, so this is likely a regression introduced in the code at some point. With unit costs (sub=extend=1, open=0), memory usage is very small, as expected.

RagnarGrootKoerkamp commented 1 year ago

I ran it through heaptrack, and unsurprisingly, all the memory comes from wavefront_allocate via mm_allocator_segment_new.

[image: heaptrack profile]

smarco commented 1 year ago

Hi,

Thanks for the report, guys. I will have a look. Note that, if you see the "unialign" functions on the profile, it means that you are using regular WFA. The BiWFA uses all the "bialign" modules.

In any case, I will have a look.

Thanks,

Daniel-Liu-c0deb0t commented 1 year ago

That prompted me to take another look at the bindings, and I see the issue: https://github.com/pairwise-alignment/rust-wfa2/blob/main/src/aligner.rs#L321. Some of the parameters (memory model and alignment scope) are being overwritten by this default() statement, so the aligner falls back to the default settings instead of the ones that were requested.
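
For anyone hitting something similar, here is a minimal sketch of this kind of bug. It does not reproduce the actual rust-wfa2 code: `Attributes`, `MemoryModel`, `AlignmentScope`, both `set_penalties_*` functions, and the default values are all hypothetical stand-ins. The point is only that rebuilding a whole attribute struct from `default()` while setting one field silently resets every other field that was configured earlier.

```rust
// Hypothetical stand-ins for an aligner's attribute struct.
// These are NOT the real rust-wfa2 or WFA2-lib types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum MemoryModel {
    High,     // placeholder default: regular (unidirectional) alignment
    Ultralow, // placeholder for the bidirectional, low-memory mode
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum AlignmentScope {
    ScoreOnly,
    FullAlignment,
}

#[derive(Debug, Clone, PartialEq)]
struct Attributes {
    memory_model: MemoryModel,
    alignment_scope: AlignmentScope,
    gap_opening: i32,
}

impl Default for Attributes {
    fn default() -> Self {
        // Placeholder defaults chosen for the sketch only.
        Attributes {
            memory_model: MemoryModel::High,
            alignment_scope: AlignmentScope::FullAlignment,
            gap_opening: 6,
        }
    }
}

// Buggy pattern: the struct is rebuilt from default(), so the memory model
// and alignment scope chosen earlier by the caller are lost.
fn set_penalties_buggy(attrs: &mut Attributes, gap_opening: i32) {
    *attrs = Attributes {
        gap_opening,
        ..Attributes::default()
    };
}

// Fixed pattern: only the field that is meant to change is touched.
fn set_penalties_fixed(attrs: &mut Attributes, gap_opening: i32) {
    attrs.gap_opening = gap_opening;
}
```

With the buggy version, a caller that asked for `MemoryModel::Ultralow` ends up back on `MemoryModel::High` after setting the penalties. That would be consistent with seeing regular WFA functions in the profile even though the low-memory mode was requested.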

Daniel-Liu-c0deb0t commented 1 year ago

Yeah that seems to be it. Sorry for bothering you!

smarco commented 1 year ago

Anytime, Daniel. Let me know how it goes.