estuary / flow


dekaf: Swap to `lz4_flex` from `lz4` #1653

Closed jshearer closed 1 month ago

jshearer commented 1 month ago

While investigating the cause of the LZ4 compression issues with franz-go (https://github.com/estuary/flow/pull/1651#issuecomment-2369322939), I found lz4_flex, a pure-Rust LZ4 implementation that appears to be safer and faster than the lz4/lz4-sys crates that kafka-protocol is using.

Now that https://github.com/tychedelia/kafka-protocol-rs/pull/81 allows us to use our own compression, and since lz4's configuration of block checksums is broken (fix here: https://github.com/10XGenomics/lz4-rs/pull/52), I thought it would be a good time to swap to lz4_flex.

I've confirmed that this change also fixes the issue with block checksums, though mainly because lz4_flex leaves block checksums off by default. Unfortunately, turning them on reproduces the issue.
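
For anyone debugging this further, it can help to check whether a given compressed payload actually has block checksums enabled. Below is a minimal sketch that reads the flag straight out of the frame header, going by my reading of the LZ4 frame format (4-byte little-endian magic `0x184D2204`, then the FLG byte with the block checksum flag in bit 4). `block_checksums_enabled` is a hypothetical helper for illustration; it is not part of lz4_flex, lz4, or kafka-protocol, and it ignores legacy and skippable frames.

```rust
/// Magic number that starts every (non-legacy) LZ4 frame, stored little-endian.
const LZ4_FRAME_MAGIC: u32 = 0x184D_2204;

/// Bit 4 of the FLG byte: when set, each block is followed by a 4-byte checksum.
const FLG_BLOCK_CHECKSUM: u8 = 1 << 4;

/// Hypothetical debugging helper (not a library API).
///
/// Returns `Some(true)` if the frame header enables block checksums,
/// `Some(false)` if it does not, and `None` if the input is too short
/// or does not start with the LZ4 frame magic number.
pub fn block_checksums_enabled(frame: &[u8]) -> Option<bool> {
    // We need the 4 magic bytes plus the FLG byte that follows them.
    if frame.len() < 5 {
        return None;
    }
    let magic = u32::from_le_bytes([frame[0], frame[1], frame[2], frame[3]]);
    if magic != LZ4_FRAME_MAGIC {
        return None;
    }
    let flg = frame[4];
    Some(flg & FLG_BLOCK_CHECKSUM != 0)
}
```

Running this over the bytes produced by each encoder (and over what franz-go sends) should show which side sets the flag, without needing either library to decode the payload successfully.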

