PSeitz / lz4_flex

Fastest pure Rust implementation of LZ4 compression/decompression.
MIT License

Shrink vec after block compression #98

Closed CosmicHorrorDev closed 1 year ago

CosmicHorrorDev commented 1 year ago

I was a bit surprised that my reported memory usage was higher, not lower, after compressing a bunch of highly compressible in-memory data. It took a bit of digging to realize that the lz4_flex::compress family of functions allocates an output buffer sized for the maximum possible compressed size.

demo

const MIB: f32 = 1_024.0 * 1_024.0;

fn main() {
    let input = vec![0; 100 * 1_024 * 1_024];
    let compressed = lz4_flex::compress(&input);

    // Prints: len 0.39 MiB - cap 110.00 MiB
    // 😱
    println!(
        "len {:.2} MiB - cap {:.2} MiB",
        compressed.len() as f32 / MIB,
        compressed.capacity() as f32 / MIB
    );
}

This change just shrinks the vec down to its final length before returning it.
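
For anyone on a release without this change, the same effect is available at the call site. A minimal sketch (the helper name here is just for illustration):

fn compress_and_trim(input: &[u8]) -> Vec<u8> {
    // Same call as in the demo above, then drop the unused worst-case
    // capacity by reallocating the Vec down to its actual compressed length.
    let mut compressed = lz4_flex::compress(input);
    compressed.shrink_to_fit();
    compressed
}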

It's a bit unfortunate that compressing allocates such a large buffer upfront. I get that on Linux (and macOS, I think) this won't really allocate all the extra memory, since zeroed pages aren't committed until they're actually touched. On Windows, though, I believe this will allocate everything upfront (although I haven't tested it). It's not that big of a deal for me, since I just switched to using the LZ4 Frame Format instead (which works great!)
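
For reference, the frame-format path I switched to looks roughly like this (a sketch assuming the crate's default `frame` feature; FrameEncoder wraps any io::Write, so the output Vec only grows as compressed bytes are produced):

use std::io::Write;

fn compress_frame(input: &[u8]) -> Vec<u8> {
    // FrameEncoder streams compressed frames into the inner writer.
    let mut encoder = lz4_flex::frame::FrameEncoder::new(Vec::new());
    // Writing into a Vec cannot fail, so expect() is fine for this sketch.
    encoder.write_all(input).expect("write to Vec");
    // finish() flushes the end-of-frame marker and returns the inner Vec.
    encoder.finish().expect("finish frame")
}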

codecov[bot] commented 1 year ago

Codecov Report

Merging #98 (8a2bc71) into main (be2ea76) will not change coverage. The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main      #98   +/-   ##
=======================================
  Coverage   89.39%   89.39%           
=======================================
  Files          11       11           
  Lines        2179     2179           
=======================================
  Hits         1948     1948           
  Misses        231      231           
Impacted Files           Coverage Δ
src/block/compress.rs    98.11% <100.00%> (ø)


PSeitz commented 1 year ago

The block format is not really the right choice for data that large; maybe that should be clearer in the docs. The global compress function should be removed, or replaced so that it uses the frame format.

The allocation upfront is done for performance reasons; without it, compression would be significantly slower.
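
For context, the block compressor sizes that buffer with the block-format worst-case bound, something like the sketch below (assuming get_maximum_output_size, the crate's documented bound helper; exact numbers vary by version):

fn bound_demo() {
    let input_len = 100 * 1024 * 1024;
    // Worst-case output size for the LZ4 block format: slightly larger
    // than the input, to cover incompressible data.
    let bound = lz4_flex::block::get_maximum_output_size(input_len);
    // Allocating this once upfront lets the hot compression loop write
    // without any capacity checks or reallocation.
    println!("input {} bytes -> worst case {} bytes", input_len, bound);
}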

CosmicHorrorDev commented 1 year ago

The block format is not really the right choice for data that large; maybe that should be clearer in the docs.

Ahhh, that would explain my confusion 😅

I was using it for in-memory data that would regularly get into the tens of MiBs, until I checked a memory profile.

The global compress function should be removed, or replaced so that it uses the frame format.

This was part of my confusion as well. I was using them because they were more convenient, and I figured the block format was the preferred one since it lives in the crate root.


Thanks for all the feedback! I've got some more API considerations, but I'll open a new issue tomorrow since I'm off to bed.