apache / arrow-rs

Official Rust implementation of Apache Arrow
https://arrow.apache.org/
Apache License 2.0

Fix Buffer::bit_slice losing length with byte-aligned offsets #6707

Closed itsjunetime closed 5 days ago

itsjunetime commented 1 week ago

Which issue does this PR close?

A part of #3478; necessary for #6690, which closes the aforementioned issue.

Rationale for this change

If bit_slice is called with a given length, the returned buffer should have the specified length.
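To illustrate the intended behavior, here is a minimal sketch over a plain byte slice. It is not the arrow-rs implementation: `bit_slice_aligned` is a hypothetical stand-in that only handles byte-aligned offsets, and it assumes the length is given in bits and rounded up to whole bytes.

```rust
/// Hypothetical stand-in for a byte-aligned bit slice (not the arrow-rs API).
/// Returns the bytes covering `bit_len` bits starting at `bit_offset`.
fn bit_slice_aligned(bytes: &[u8], bit_offset: usize, bit_len: usize) -> &[u8] {
    // This sketch only covers the byte-aligned case discussed in this PR.
    assert!(bit_offset % 8 == 0, "offset must be a multiple of 8 bits");

    let start = bit_offset / 8;
    // Round the bit length up to whole bytes so a partial trailing byte is kept.
    let byte_len = bit_len / 8 + usize::from(bit_len % 8 != 0);

    // Slicing with an explicit end keeps the requested length;
    // `&bytes[start..]` would instead return everything to the end.
    &bytes[start..start + byte_len]
}
```

For example, with six input bytes, an offset of 8 bits and a length of 16 bits yields exactly two bytes, rather than the five bytes that remain after the offset.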

What changes are included in this PR?

The bit_slice function itself is updated with the fix, along with a unit test to ensure the fix works. I've ensured the test fails without my changes.

I also lowered the 'minimum overhead' value in an encoding test, since that value went down due to this change.

This also updates the MSRV of this crate in Cargo.toml to 1.75, as that was (and still is, with this PR) the effective MSRV of these crates (as found by cargo-msrv). This was necessary because I wanted to use usize::div_ceil, which was stabilized in 1.73, but clippy complained that I couldn't use it while the crate's MSRV was 1.62.

Are there any user-facing changes?

Yes, a bug fix that could change user behavior.

itsjunetime commented 1 week ago

Ah, it looks like cargo msrv find tries to find the minimum available version that would work for all the crates in your workspace, including e.g. the parquet bin target and its clap dependency. I'll fix that and use bit_util::ceil instead.
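For context, ceiling division without usize::div_ceil can be written in a way that compiles on older toolchains. The sketch below uses a hypothetical `ceil_div` helper; it plays the same role as bit_util::ceil here, but it is not claimed to match that function's actual body.

```rust
/// Hypothetical helper: integer division rounded up, without `usize::div_ceil`.
/// Panics if `divisor` is zero, like ordinary integer division.
fn ceil_div(value: usize, divisor: usize) -> usize {
    // Adding 1 only when there is a remainder avoids the overflow risk
    // of the common `(value + divisor - 1) / divisor` form.
    value / divisor + usize::from(value % divisor != 0)
}
```

With a divisor of 8, this maps a bit count to the number of bytes needed to hold it: 9 bits need 2 bytes, and 16 bits also need 2.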

itsjunetime commented 5 days ago

I've removed the MSRV changes. I initially thought that was going to be a simple change, but it ended up branching out to a lot of other things. I'm going to open a separate issue and PR to fix that specifically.