apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

GH-44651: [Python] Allow from_buffers to work with StringView on Python #44701

Closed: raulcd closed this pull request 1 week ago

raulcd commented 1 week ago

Rationale for this change

Currently, from_buffers does not work with StringView in Python because we validate the number of buffers passed against num_buffers. That value only accounts for the mandatory buffers; it ignores the variadic_spec that can be present for both string_view and binary_view, so arrays of those types can legitimately carry more buffers than num_buffers.

What changes are included in this PR?

Take into account whether the type has a variadic_spec for its non-mandatory buffers. When it does, treat num_buffers as a lower bound and only check that at least that many buffers are provided.
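The check described above can be sketched in plain Python. This is only an illustration of the rule, not the code from this PR: the function name `check_buffer_count` and its parameters are hypothetical, and the error messages are made up for the example.

```python
def check_buffer_count(num_buffers: int, has_variadic_buffers: bool, n_given: int) -> None:
    """Hypothetical helper: raise ValueError if n_given buffers cannot fit the layout.

    num_buffers          -- number of mandatory buffers for the type
    has_variadic_buffers -- True if the layout allows extra (variadic) buffers
    n_given              -- number of buffers the caller passed in
    """
    if has_variadic_buffers:
        # Variadic layouts (e.g. string_view, binary_view):
        # num_buffers is only a lower bound.
        if n_given < num_buffers:
            raise ValueError(
                f"Type expects at least {num_buffers} buffers, got {n_given}"
            )
    elif n_given != num_buffers:
        # Fixed layouts: the count must match exactly.
        raise ValueError(f"Type expects {num_buffers} buffers, got {n_given}")


# A view-like layout with 2 mandatory buffers plus 3 extra data buffers passes.
check_buffer_count(num_buffers=2, has_variadic_buffers=True, n_given=5)

# A fixed layout with 2 buffers must receive exactly 2.
check_buffer_count(num_buffers=2, has_variadic_buffers=False, n_given=2)
```

Before this change, the behaviour corresponded to always taking the exact-match branch, which is why view arrays with extra data buffers were rejected.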

Are these changes tested?

Yes, I've added a couple of tests.

Are there any user-facing changes?

We are exposing a new method on the Python DataType, has_variadic_buffers, which tells us whether the number of buffers expected is only lower-bounded by num_buffers.

conbench-apache-arrow[bot] commented 1 week ago

After merging your PR, Conbench analyzed the 3 benchmarking runs that have been run so far on merge-commit aa26f28a64b7638f01756d78a2ea8fbddceafc65.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 27 possible false positives for unstable benchmarks that are known to sometimes produce them.