Closed austin362667 closed 1 week ago
I think we probably want to get some broader thoughts on this, whilst I agree unsigned is likely more appropriate in the Rust ecosystem, lots of systems, arrow included, prefer signed quantities for compatibility reasons (see StringArray, ListArray, etc...)
Either way, this isn't a bug and would be a breaking change
Specifically the Arrow spec uses i32
/i64
typically for lengths for compatibility with Java (which does not support unsigned integers 🤷 )
So I think maybe we can close this issue?
Describe the bug
The string array functions
length()
andbit_length()
should not returnlength
as signed integer.To Reproduce
See https://github.com/apache/arrow-rs/blob/5ad621fd97032796f3da2d02948e65513de8c891/arrow-string/src/length.rs#L47 and https://github.com/apache/arrow-rs/blob/5ad621fd97032796f3da2d02948e65513de8c891/arrow-string/src/length.rs#L111
Expected behavior
Should return unsigned integer, since there are no negative length in our context.
Additional context
Thanks for @alamb bringing this up in the discussion on https://github.com/apache/arrow-rs/pull/6671#discussion_r1826570314
Note: The actual Arrow spec uses integer types.