jguhlin / minimap2-rs

Rust bindings to minimap2 library
Other
63 stars 13 forks source link

Memory leaks #40

Closed Adoni5 closed 1 year ago

Adoni5 commented 1 year ago

Hey @jguhlin

Thanks again for all your work on this! We use it for https://github.com/Adoni5/mappy-rs/. The reason we wrote mappy-rs in the first place was get multi threaded performance in a long running application - namely readfish. With the difference in output on ONTs PromethION, we couldn't keep up just using mappy. Unfortunately I didn't know enough C/C++ to implement this threading as a extension to mappy, and I wanted the opportunity to practice my rust, rather than just multithread in python (perhaps a mistake 😭).

The problem

We noticed creeping memory usage when mapping repeatedly using the same Aligner instance. These graphs are made using memray. The test script I used was https://github.com/Adoni5/mappy-rs/blob/main/tests/memory_test.py. original_mappy_memory

This was true of not running multithreaded as well - this was with a single thread. single_threaded_mappy_rs

After rooting around in PyO3, I became convinced that the problem lies in minimap-rs after running the same multi threaded file, but not actually calling out to minimap2::aligner::map, but instead just returning a Mapping (as defined by mappy-rs). This showed flat memory usage - noop_multi_thread.

You also see this leakage when using minimappers2 in the same way.

I thought the problem might lie in the threadbuffer - as discussed in this minimap2 issue - https://github.com/lh3/minimap2/issues/855. This turned out to help, but we still leaked a lot of memory. The change I made to renew threadbuffer was effectively copied from https://github.com/nanoporetech/bonito/blob/b7074d7c2ae2d781db99c40ba911ee9a671206d7/bonito/aligner.py#L19.

However I eventually used valgrind to test and found leaks in the regs, regs.p and mm_gen_cs. An example (after fixing the first two)

β”‚==772505== 1,551,104 bytes in 6,059 blocks are definitely lost in loss record 418 of 431                                                                                                                                                                                                                            β”‚
β”‚==772505==    at 0x484DA20: realloc (vg_replace_malloc.c:1649)                                                                                                                                                                                                                                                      β”‚
β”‚==772505==    by 0x58AF473: str_enlarge (format.c:16)                                                                                                                                                                                                                                                               β”‚
β”‚==772505==    by 0x58AF473: mm_sprintf_lite (format.c:44)                                                                                                                                                                                                                                                           β”‚
β”‚==772505==    by 0x58AFC93: write_cs_core (format.c:167)                                                                                                                                                                                                                                                            β”‚
β”‚==772505==    by 0x58AFC93: write_cs_or_MD.part.0 (format.c:248)                                                                                                                                                                                                                                                    β”‚
β”‚==772505==    by 0x58AFF07: write_cs_or_MD (format.c:227)                                                                                                                                                                                                                                                           β”‚
β”‚==772505==    by 0x58AFF07: mm_gen_cs_or_MD (format.c:259)                                                                                                                                                                                                                                                          β”‚
β”‚==772505==    by 0x58AFF58: mm_gen_cs (format.c:267)                                                                                                                                                                                                                                                                β”‚
β”‚==772505==    by 0x5899FD8: std::thread::local::LocalKey<T>::with (lib.rs:857)                                                                                                                                                                                                                                      β”‚
β”‚==772505==    by 0x589965B: minimap2::Aligner::map (lib.rs:732)                                                                                                                                                                                                                                                     β”‚
β”‚==772505==    by 0x5829FA4: mappy_rs::Aligner::enable_threading::{{closure}} (lib.rs:706)                                                                                                                                                                                                                           β”‚
β”‚==772505==    by 0x582A958: std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:134)                                                                                                                                                                                                             β”‚
β”‚==772505==    by 0x586E9F2: std::thread::Builder::spawn_unchecked_::{{closure}}::{{closure}} (mod.rs:560)                                                                                                                                                                                                           β”‚
β”‚==772505==    by 0x5879802: <core::panic::unwind_safe::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once (unwind_safe.rs:271)                                                                                                                                                                      β”‚
β”‚==772505==    by 0x585A904: std::panicking::try::do_call (panicking.rs:487)                                                                                                                                                                                                                                         β”‚
β”‚==772505==

By adding calls to free here, as they do in https://github.com/lh3/minimap2/blob/ace990c381c647d6cf8fae7a4941a7b56fb67ae7/python/mappy.pyx#L212-L216, I now no longer see memory leaks!

I'll include a bit more detail after tidying up mappy-rs ( I no longer pass my own CI 😭 ), and will make any changes to this if it fails yours.

Fake minimap2

I changed this script a fair bit to test this without any messing around in PyO3. I think it's all fairly well commented, but if it's broken anything I'm happy to drop these changes, they were purely for testing purposes.

NB I have only tested these changes on Linux 5.19.0-45-generic #46~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Wed Jun 7 15:06:04 UTC 20 x86_64 x86_64 x86_64 GNU/Linux. I have no Idea how/if this will run on aarch64, or how this will play with simd etc.

codecov[bot] commented 1 year ago

Codecov Report

Merging #40 (47bd630) into main (8e4b127) will increase coverage by 0.61%. The diff coverage is 50.00%.

:exclamation: Current head 47bd630 differs from pull request most recent head c0cb0f0. Consider uploading reports for the commit c0cb0f0 to get more accurate results

@@            Coverage Diff             @@
##             main      #40      +/-   ##
==========================================
+ Coverage   49.25%   49.86%   +0.61%     
==========================================
  Files           3        3              
  Lines         735      742       +7     
==========================================
+ Hits          362      370       +8     
+ Misses        373      372       -1     
Impacted Files Coverage Ξ”
src/lib.rs 75.20% <50.00%> (+0.74%) :arrow_up:
jguhlin commented 1 year ago

Hey, awesome work! I'm pretty good at Rust, but my C/C++ and FFI skills are lacking, and this is my first project touching both. I've long suspected there will be one (or more) leaks somewhere.

Thanks for mappy-rs as well, I'm using medaka on a very large genome, and it dies after a few weeks with no output. I'm planning to switch out mappy with yours to get multithreading (to speed up the process and hopefully get to the point of crashing a bit quicker).

I'll plan to get this tested and push out in the next day or two. Cheers :)

Adoni5 commented 1 year ago
but my C/C++ and FFI skills are lacking, and this is my first project touching both

Me too! We never noticed before because I was developing on machines with a couple of hundred Gigabytes of spare RAM. The minimap2-rs and minimap2-sys crates are awesome work, but with not having Rust take care of memory management, it's more complicated. I would love to just directly port minimap2 to Rust but ain't nobody got time for that πŸ˜‚

jguhlin commented 1 year ago

My first (and worst) attempt was to use a transpiler to convert it to rust, but I couldn't follow it all. My current new dream is to decouple the pthreads library so that it will compile on Windows and WASM (I have a pan-genome browser I'm working on; spare-time project, but it would be cool to have mm2 built-in).