Open wjones127 opened 9 months ago
Just did some quick performance tests, and it seems like overall jemalloc would be an improvement in our current benchmark suite.
In `Cargo.toml`:

```toml
# Allocators
mimalloc = { version = "0.1.39", optional = true }
jemallocator = { version = "0.5.4", optional = true, features = ["disable_initial_exec_tls"] }
snmalloc-rs = { version = "0.3.4", optional = true, features = ["local_dynamic_tls"] }
```

In `python/src/lib.rs`:

```rust
// Set global allocator
#[cfg(feature = "mimalloc")]
#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

#[cfg(feature = "snmalloc-rs")]
#[global_allocator]
static GLOBAL: snmalloc_rs::SnMalloc = snmalloc_rs::SnMalloc;

#[cfg(feature = "jemallocator")]
#[global_allocator]
static GLOBAL: jemallocator::Jemalloc = jemallocator::Jemalloc;
```
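One caveat with this feature-gated setup (my addition, not from the original): Rust permits only one `#[global_allocator]`, so enabling two of these features at once fails with a confusing duplicate-definition error. A small sketch that turns that into a readable compile-time message, using the same feature names as the `cfg` attributes above:

```rust
// Reject mutually exclusive allocator features up front with a clear
// message instead of a duplicate #[global_allocator] error.
#[cfg(all(feature = "mimalloc", feature = "jemallocator"))]
compile_error!("features `mimalloc` and `jemallocator` are mutually exclusive");
#[cfg(all(feature = "mimalloc", feature = "snmalloc-rs"))]
compile_error!("features `mimalloc` and `snmalloc-rs` are mutually exclusive");
#[cfg(all(feature = "jemallocator", feature = "snmalloc-rs"))]
compile_error!("features `jemallocator` and `snmalloc-rs` are mutually exclusive");

fn main() {
    // With at most one allocator feature enabled, this compiles and runs normally.
    println!("allocator feature guard OK");
}
```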
On macOS, I find jemalloc is good in almost all situations, except for some regressions in writes:
On Linux, it's also generally good, though there are some regressions for scans that might be worth looking into:
Overall, I'm inclined to wait until we understand why the regressions in Linux scans occur.
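One knob that might help narrow down the scan regression (an assumption on my part, not something measured here) is jemalloc's run-time tuning. Note that the `jemallocator` crate builds jemalloc with a `_rjem_` symbol prefix, so the prefixed environment variable is usually the one it reads; `benchmark.py` below is a placeholder, not a real script in the repo:

```shell
# Hypothetical investigation step: disable decay-based purging and move
# purging to a background thread, to see whether page purging explains
# the Linux scan regression.
_RJEM_MALLOC_CONF="background_thread:true,dirty_decay_ms:0" python benchmark.py
```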
We do a lot of `take` and `compact_batches`, which involves moving data to new buffers. It might be worth seeing if a memory pool like jemalloc would help here. This would specifically be for Python; I don't think we should make this a dependency for the Rust crate.

The tool https://github.com/koute/bytehound might come in handy.
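For a sense of why those operations stress the allocator: compacting batches boils down to copying scattered chunks into a freshly allocated contiguous buffer, so allocation throughput on short-lived buffers dominates. A minimal sketch of that access pattern (plain `std`; the function name and sizes are made up for illustration, not the actual implementation):

```rust
use std::time::Instant;

// Simulate a compact_batches-style workload: gather chunks into a newly
// allocated contiguous buffer. The allocator's throughput on these
// short-lived allocations is what swapping in jemalloc would change.
fn compact(chunks: &[Vec<u8>]) -> Vec<u8> {
    let total: usize = chunks.iter().map(|c| c.len()).sum();
    let mut out = Vec::with_capacity(total); // one fresh buffer per batch
    for c in chunks {
        out.extend_from_slice(c);
    }
    out
}

fn main() {
    // Hypothetical sizes, just to exercise the allocator.
    let chunks: Vec<Vec<u8>> = (0..1024).map(|i| vec![i as u8; 4096]).collect();
    let start = Instant::now();
    let mut bytes = 0usize;
    for _ in 0..100 {
        bytes += compact(&chunks).len();
    }
    println!("compacted {} bytes in {:?}", bytes, start.elapsed());
}
```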