jorgecarleitao / arrow2

Transmute-free Rust library to work with the Arrow format
Apache License 2.0
1.06k stars 222 forks source link

Off by 1 in slice of OffsetsBuffer #1531

Open kylebarron opened 1 year ago

kylebarron commented 1 year ago

The implementations of slice on OffsetsBuffer calls slice on the underlying Buffer directly:

https://github.com/jorgecarleitao/arrow2/blob/2ecd3e823f63884ca77b146a8cd8fcdea9f328fd/src/offset.rs#L443-L459

But this is incorrect because then a slice with length 1 returns an OffsetsBuffer with only one value in the buffer.

I'd argue that either the documentation could be better that this is slicing the raw buffer and not the offsets that a slice of values of that length would have. Or the code could add the +1 itself.

ritchie46 commented 1 year ago

I don't think this is incorrect? It slices the buffer? It doesn't do any semantic handling.

kylebarron commented 1 year ago

I found it confusing because at least in a couple places, the API does do semantic handling instead of raw buffer operations. E.g. with_capacity reserves space for n + 1 elements, so the input number refers to the number of values, not the number of offsets.

In any case, it's probably too disruptive to change the API now, so a docstring improvement would probably be best