Open broccoliSpicy opened 1 week ago
> In binary + miniblock, I think this copy can be avoided if we align each chunk to 4 bytes.
We already align each chunk to 8 bytes but there was a bug preventing this from working correctly.
> In fastlanes bitpacking, because of the use of SIMD instructions, the alignment requirement is stronger
How much stronger? 64 byte alignment for a 4KiB chunk seems pretty extensive (up to 6% of the block is wasted space) but I suppose it is manageable. Still, if we want to require this then I'd prefer changing the MiniBlockCompressor trait to allow compressors to state how much alignment they need. This way compressors that don't need such strict requirements don't have to pay.
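To make the proposal concrete, here is a minimal sketch of a trait through which each compressor declares its own alignment. The trait name `ChunkAlignment`, the method `required_alignment`, the two structs, and the helper `padding_for` are all hypothetical and are not the actual lance-encoding API. The default of 8 bytes and the 64-byte value for fastlanes are taken from the numbers floated in this thread, not from a confirmed requirement.

```rust
/// Hypothetical stand-in for the alignment part of a mini-block compressor
/// trait. This is NOT the real lance-encoding `MiniBlockCompressor` definition.
pub trait ChunkAlignment {
    /// Alignment in bytes (a power of two) that each chunk must start at.
    /// The default of 8 mirrors the existing chunk alignment mentioned above.
    fn required_alignment(&self) -> usize {
        8
    }
}

/// Hypothetical compressor that is happy with the default alignment.
pub struct BinaryCompressor;

/// Hypothetical compressor that asks for stricter alignment
/// (64 bytes is an assumption based on the figure discussed in this thread).
pub struct FastlanesBitpackCompressor;

impl ChunkAlignment for BinaryCompressor {}

impl ChunkAlignment for FastlanesBitpackCompressor {
    fn required_alignment(&self) -> usize {
        64
    }
}

/// Number of padding bytes to append at `offset` so that the next chunk
/// starts at a multiple of the compressor's required alignment.
pub fn padding_for(offset: usize, compressor: &dyn ChunkAlignment) -> usize {
    let align = compressor.required_alignment();
    assert!(align.is_power_of_two());
    (align - (offset % align)) % align
}
```

With this shape, a chunk ending at offset 100 would need 4 bytes of padding for the default compressor and 28 bytes for the 64-byte one, so only the compressor that asks for strict alignment pays for it.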
> (up to 6% of the block is wasted space)
I did the math wrong. 64/4096 is 1/64th so more like 1-2%.
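The corrected arithmetic can be checked with a short sketch. The function names are hypothetical. It assumes a 4096-byte chunk and that aligning to `align` bytes wastes at most `align - 1` bytes of padding.

```rust
/// Worst-case number of padding bytes needed to reach the next multiple of
/// `align`: one byte short of a full alignment unit.
pub fn worst_case_padding(align: usize) -> usize {
    align - 1
}

/// Padding expressed as a percentage of the chunk size.
pub fn overhead_percent(padding: usize, chunk_size: usize) -> f64 {
    100.0 * padding as f64 / chunk_size as f64
}
```

For example, `overhead_percent(64, 4096)` evaluates to 1.5625, and the true worst case, `overhead_percent(worst_case_padding(64), 4096)`, is slightly lower because at most 63 bytes are ever needed.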
https://github.com/lancedb/lance/blob/c237bcb9318d30cf382aecd56b673aae85b2c555/rust/lance-encoding/src/encodings/physical/bitpack_fastlanes.rs#L1727
Instantiating a copy of the data using .to_vec doesn't guarantee that the data starts at a 64-byte-aligned position either. We may be able to get rid of this copy with the current page layout padding; if so, changes to the bitpack mini-block compression logic are needed.
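The point about .to_vec can be illustrated with a small sketch: a `Vec<u8>` is only guaranteed the alignment of `u8` (1 byte), so any stricter alignment has to be checked or arranged explicitly. The helper names below are hypothetical and are not part of lance.

```rust
/// Returns true if `bytes` starts at an address that is a multiple of `align`.
pub fn starts_aligned(bytes: &[u8], align: usize) -> bool {
    assert!(align.is_power_of_two());
    (bytes.as_ptr() as usize) % align == 0
}

/// Returns the part of `bytes` that begins at the first `align`-aligned
/// address, skipping at most `align - 1` leading bytes. If the buffer is too
/// short to contain an aligned address, the result is empty.
pub fn aligned_suffix(bytes: &[u8], align: usize) -> &[u8] {
    assert!(align.is_power_of_two());
    let addr = bytes.as_ptr() as usize;
    let skip = (align - addr % align) % align;
    &bytes[skip.min(bytes.len())..]
}
```

A decoder could call `starts_aligned(chunk, 64)` on the buffer read from disk and fall back to a copy only when the check fails, assuming the page and chunk padding already place chunk starts on aligned file offsets and the I/O buffer itself is aligned.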
PR #3101 added alignment in the page layout and chunk layout, but PR #3099 still needs to copy the raw data read from disk before it can start decoding; see the code here.
For fastlanes bitpacking, there is also a copy: https://github.com/lancedb/lance/blob/c237bcb9318d30cf382aecd56b673aae85b2c555/rust/lance-encoding/src/encodings/physical/bitpack_fastlanes.rs#L1727
In binary + miniblock, I think this copy can be avoided if we align each chunk to 4 bytes. In fastlanes bitpacking, because of the use of SIMD instructions, the alignment requirement is stronger (reference here); we may also need to change the compression logic to allow it.
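As a rough illustration of why 4-byte alignment would let the binary + miniblock path skip the copy, the sketch below reinterprets a byte buffer as `u32` values in place when it is suitably aligned, and copies only otherwise. The function name `view_as_u32` is hypothetical, and the sketch assumes the values are stored in native byte order; it is not the decoding code from either PR.

```rust
use std::borrow::Cow;

/// Views `bytes` as `u32` words. Borrows without copying when the buffer is
/// 4-byte aligned and its length is a multiple of 4; otherwise copies the
/// complete 4-byte groups into a new vector (trailing bytes are dropped).
pub fn view_as_u32(bytes: &[u8]) -> Cow<'_, [u32]> {
    // SAFETY: every bit pattern is a valid `u32`, so reinterpreting the
    // aligned middle portion of the byte slice is sound.
    let (prefix, words, suffix) = unsafe { bytes.align_to::<u32>() };
    if prefix.is_empty() && suffix.is_empty() {
        // Zero-copy path: the whole buffer is aligned.
        Cow::Borrowed(words)
    } else {
        // Fallback path: decode each 4-byte group into an owned vector.
        Cow::Owned(
            bytes
                .chunks_exact(4)
                .map(|c| u32::from_ne_bytes([c[0], c[1], c[2], c[3]]))
                .collect(),
        )
    }
}
```

If every chunk starts on a 4-byte boundary, the borrowed branch is the one that is normally taken, which is the copy this issue is trying to remove. SIMD kernels such as fastlanes would need the same kind of check with a larger alignment.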