dropbox / rust-brotli

Brotli compressor and decompressor written in rust that optionally avoids the stdlib
https://dropbox.tech/infrastructure/-broccoli--syncing-faster-by-syncing-less
BSD 3-Clause "New" or "Revised" License

Multithreaded compression #96

Open Smotrov opened 1 year ago

Smotrov commented 1 year ago

The announcement states that "Multithreaded compression allows multiple threads to operate in unison on a single file." However, I have not been able to find any documentation or examples that illustrate this feature. Could someone share a code snippet showing how to use it? I suspect the entry point is compress_multi, but its last parameter, alloc_per_thread, is unclear to me.

danielrh commented 4 months ago

You should pass one allocator per thread so that each thread can allocate on its own. If your allocator is thread-safe, you can share the same one across threads.
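The "one allocator per thread" idea can be illustrated without brotli at all. The following is a minimal stdlib-only sketch, not the rust-brotli API: `ScratchAlloc`, `rle_chunk`, and `encode_parallel` are hypothetical names, and a trivial run-length encoder stands in for real compression. Each worker thread receives its own chunk of the input and its own scratch state, so no locking is needed.

```rust
use std::thread;

/// Hypothetical per-thread scratch state (not part of rust-brotli).
/// Each worker owns one, so threads never contend on shared memory.
#[derive(Default)]
struct ScratchAlloc {
    buf: Vec<u8>,
}

/// Stand-in for real compression: run-length encode `chunk`
/// as (count, byte) pairs, with runs capped at 255.
fn rle_chunk(chunk: &[u8], alloc: &mut ScratchAlloc) -> Vec<u8> {
    alloc.buf.clear();
    let mut i = 0;
    while i < chunk.len() {
        let byte = chunk[i];
        let mut run = 1usize;
        while i + run < chunk.len() && chunk[i + run] == byte && run < 255 {
            run += 1;
        }
        alloc.buf.push(run as u8);
        alloc.buf.push(byte);
        i += run;
    }
    alloc.buf.clone()
}

/// Split `input` into at most one chunk per allocator, encode the chunks
/// in parallel, and concatenate the results in input order.
fn encode_parallel(input: &[u8], allocs: &mut [ScratchAlloc]) -> Vec<u8> {
    if input.is_empty() || allocs.is_empty() {
        return Vec::new();
    }
    // Ceiling division, so the number of chunks never exceeds allocs.len().
    let chunk_len = (input.len() + allocs.len() - 1) / allocs.len();
    thread::scope(|s| {
        // Spawn every worker before joining any of them.
        let handles: Vec<_> = input
            .chunks(chunk_len)
            .zip(allocs.iter_mut())
            .map(|(chunk, alloc)| s.spawn(move || rle_chunk(chunk, alloc)))
            .collect();
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Calling `encode_parallel(b"aaabbbcc", &mut allocs)` with two allocators encodes `aaab` and `bbcc` on separate threads. The length of the allocator slice determines how many workers run in this sketch. A thread-safe allocator could instead be shared by all workers, at the cost of synchronization.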

Smotrov commented 4 months ago

Thank you @danielrh!

I'm not sure whether the approach below is correct. I also don't have a clear understanding of the purpose of SliceWrapper or of the second parameter of SendAlloc::new.

I also have 2 questions:

use brotli::enc::multithreading::compress_multi;
use brotli::enc::threading::SendAlloc;
use brotli::enc::backward_references::BrotliEncoderParams;
use brotli::enc::{StandardAlloc, Owned};
use brotli::enc::UnionHasher;

// Adjust number of threads
const CPU_CORES: usize = 16;

struct MySliceWrapper(Vec<u8>);

impl brotli::enc::SliceWrapper<u8> for MySliceWrapper {
    fn slice(&self) -> &[u8] {
        &self.0
    }
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let params = BrotliEncoderParams::default();
    let data = MySliceWrapper(b"Example data to compress".to_vec());
    let mut owned_input = Owned::new(data);
    let mut output = vec![0u8; 10000]; // Adjust the size according to your needs

    let alloc = StandardAlloc::default(); // Adjust based on your allocation strategy

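    // One SendAlloc per worker thread, as recommended above.
    let mut alloc_per_thread = Vec::with_capacity(CPU_CORES);
    for _ in 0..CPU_CORES {
        alloc_per_thread.push(SendAlloc::new(alloc, UnionHasher::Uninit));
    }

    let compressed_len = compress_multi(&params, &mut owned_input, &mut output, &mut alloc_per_thread).unwrap();
    output.truncate(compressed_len); // Keep only the bytes that were actually written
    println!("Compressed bytes: {}", compressed_len);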

    Ok(())
}
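The purpose of SliceWrapper is left open in this thread. Judging only from the single `slice(&self) -> &[u8]` method implemented above, one plausible reading is that it lets an API stay generic over whatever owns the input bytes, so that the owner can be moved to another thread. The sketch below illustrates that pattern with a hypothetical local trait `ByteView`; it is an assumption about the design, not brotli's own trait or documentation.

```rust
use std::sync::Arc;
use std::thread;

/// Hypothetical local trait (not brotli's SliceWrapper):
/// anything that can lend out its bytes as a slice.
trait ByteView {
    fn slice(&self) -> &[u8];
}

/// Input owned by a plain Vec.
struct VecInput(Vec<u8>);

impl ByteView for VecInput {
    fn slice(&self) -> &[u8] {
        &self.0
    }
}

/// Input shared through a reference-counted buffer.
struct SharedInput(Arc<[u8]>);

impl ByteView for SharedInput {
    fn slice(&self) -> &[u8] {
        &self.0
    }
}

/// Moves the owner into a worker thread and reads the bytes there.
/// The `Send + Sync + 'static` bounds are what make the move legal.
fn input_len<W: ByteView + Send + Sync + 'static>(input: W) -> usize {
    thread::spawn(move || input.slice().len())
        .join()
        .expect("worker thread panicked")
}
```

With this shape, `input_len(VecInput(data))` and `input_len(SharedInput(shared))` both work without the function knowing how the bytes are stored.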