lancedb / lance

Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, with more integrations coming.
https://lancedb.github.io/lance/
Apache License 2.0
3.65k stars 195 forks

perf: investigate using jemalloc #1372

Open wjones127 opened 9 months ago

wjones127 commented 9 months ago

We do a lot of `take` and `compact_batches`, both of which involve moving data to new buffers. It might be worth seeing if an alternative allocator like jemalloc would help here. This would specifically be for Python; I don't think we should make this a dependency for the Rust crate.

The tool https://github.com/koute/bytehound might come in handy.
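Before swapping allocators, it can help to confirm how much allocation a copy-heavy operation actually performs. The following is a minimal sketch using only the Rust standard library: it wraps the system allocator with counters and measures a closure. The names `CountingAlloc`, `measure`, and `take_rows` are hypothetical and are not part of Lance; `take_rows` is only a stand-in for a gather-style copy, not the real `take` implementation.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper around the system allocator that counts
/// allocation calls and requested bytes. Not part of Lance.
pub struct CountingAlloc;

static ALLOC_CALLS: AtomicUsize = AtomicUsize::new(0);
static ALLOC_BYTES: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOC_CALLS.fetch_add(1, Ordering::Relaxed);
        ALLOC_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
        // Delegate the actual allocation to the system allocator.
        unsafe { System.alloc(layout) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Runs `f` and returns its result together with the number of
/// allocation calls and bytes requested while it ran. Counters are
/// process-wide, so other threads can inflate the numbers.
pub fn measure<T>(f: impl FnOnce() -> T) -> (T, usize, usize) {
    let calls_before = ALLOC_CALLS.load(Ordering::Relaxed);
    let bytes_before = ALLOC_BYTES.load(Ordering::Relaxed);
    let out = f();
    let calls = ALLOC_CALLS.load(Ordering::Relaxed) - calls_before;
    let bytes = ALLOC_BYTES.load(Ordering::Relaxed) - bytes_before;
    (out, calls, bytes)
}

/// Hypothetical stand-in for a take-like operation: gathers the
/// selected rows into a freshly allocated buffer.
pub fn take_rows(values: &[u64], indices: &[usize]) -> Vec<u64> {
    let mut out = Vec::with_capacity(indices.len());
    for &i in indices {
        out.push(values[i]);
    }
    out
}
```

Calling `measure(|| take_rows(&values, &indices))` returns the gathered rows plus the allocation counts for that call. This only shows how much is being allocated; a profiler such as bytehound is still needed to see where the allocations come from.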

wjones127 commented 8 months ago

Just did some quick performance tests, and it seems like overall jemalloc would be an improvement in our current benchmark suite.

Changes for benchmark

In `Cargo.toml`:

```toml
# Allocators
mimalloc = { version = "0.1.39", optional = true }
jemallocator = { version = "0.5.4", optional = true, features = ["disable_initial_exec_tls"] }
snmalloc-rs = { version = "0.3.4", optional = true, features = ["local_dynamic_tls"] }
```

In `python/src/lib.rs`:

```rust
// Set global allocator
#[cfg(feature = "mimalloc")]
#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

#[cfg(feature = "snmalloc-rs")]
#[global_allocator]
static GLOBAL: snmalloc_rs::SnMalloc = snmalloc_rs::SnMalloc;

#[cfg(feature = "jemallocator")]
#[global_allocator]
static GLOBAL: jemallocator::Jemalloc = jemallocator::Jemalloc;
```

On macOS, I find jemalloc is good in almost all situations, except for some regressions in writes:

Screenshot 2023-11-11 at 11 46 57 AM

On Linux, it's also generally good, though there are some regressions for scans, which might be worth looking into:

Screenshot 2023-11-11 at 11 59 52 AM

https://docs.google.com/spreadsheets/d/1hfPkVX0bkZTcWm1jSBdxWYl5zo24F2YfPqh98NYYY0w/edit#gid=1933822679

Overall, I'm inclined to wait until we understand why the regressions on Linux scans occur.