Run RLE before distributing work on threads

torfmaster / ribzip2

A bzip2 implementation in pure Rust.

MIT License

16 stars 11 forks source link

Run RLE before distributing work on threads #17

Closed NorbertGarfield closed 2 years ago

NorbertGarfield commented 2 years ago

Solves https://github.com/torfmaster/ribzip2/issues/4

Calculate RLE data in the main thread
Send calculated data to generate_block_data
Move rle::rle from block_encoder to stream

torfmaster commented 2 years ago

Hi, thank you for your PR. Inside the issue there is one more challenge, that you could take: Now, that RLE is performed before the distribution on threads, we can perform RLE on a chunk of bytes that is larger than 720.000, if the outcome is not larger than 900.000 bytes. This can be incorporated, for example, by refactoring RLE into a reader which reads from the input reader until 900.000 of outcome are achieved (or just fill a buffer, if a reader is overkill). Are you interested in doing this change as well so that the issue can be closed?

NorbertGarfield commented 2 years ago

Are you interested in doing this change as well so that the issue can be closed?

Yes, by all means. Should I amend the changes to this PR? Or do I need to open another one?

torfmaster commented 2 years ago

Great! You can add your changes to the PR.

NorbertGarfield commented 2 years ago

I might be over-engineering this, but my best shot is:

Split RLE subroutine into two independent steps (forward loop and augmentation).
Take the number of bytes equal to the difference between RLE limit and current RLE data (consider safety margin).
Calculate RLE forward loop, get the size of augmentation, store the state.
If total RLE size is less than the limit, do one more iteration considering saved state. If RLE size is more than the limit -- halt.

I could not find the policy regarding re-bases, so I re-based my branch to the latest HEAD of the main branch. Let me know if that's fine.

torfmaster commented 2 years ago

I just wanted to express how grateful I am about your PR. At least in my benchmarks the compression rate (using k-means-clustering for Huffman) the compression rate is now close to the original implementation!