apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

[Python] Add row indices in parquet fragment to include/exclude #43897

Open ion-elgreco opened 2 months ago

ion-elgreco commented 2 months ago

Describe the enhancement requested

I would like to be able to specify upfront which row indices to include or exclude when reading a Parquet fragment.

We already have this for row_group, but it would be useful to have the same control at the level of individual row indices.
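
To make the request concrete, here is a minimal sketch of the bookkeeping such a feature implies: translating global row indices into per-row-group local offsets, so that only the row groups that contain requested rows need to be read. The helper names `split_row_indices` and `invert_row_indices` are hypothetical and are not part of pyarrow; the row-group sizes are assumed to be known from the file metadata.

```python
from bisect import bisect_right
from itertools import accumulate


def split_row_indices(row_indices, row_group_sizes):
    """Map global row indices to {row_group_id: [local row offsets]}.

    Hypothetical helper, not a pyarrow API. `row_group_sizes` is the
    number of rows in each row group, in file order.
    """
    # ends[i] is the exclusive global end offset of row group i.
    ends = list(accumulate(row_group_sizes))
    total = ends[-1] if ends else 0
    plan = {}
    for idx in sorted(set(row_indices)):
        if not 0 <= idx < total:
            raise IndexError(f"row index {idx} out of range for {total} rows")
        # First row group whose end offset is strictly greater than idx.
        group = bisect_right(ends, idx)
        start = ends[group] - row_group_sizes[group]
        plan.setdefault(group, []).append(idx - start)
    return plan


def invert_row_indices(excluded, num_rows):
    """Turn an exclude list into the equivalent include list (hypothetical)."""
    excluded = set(excluded)
    return [i for i in range(num_rows) if i not in excluded]


if __name__ == "__main__":
    sizes = [3, 2, 4]  # three row groups, 9 rows in total
    print(split_row_indices([0, 4, 8, 3], sizes))
    print(split_row_indices(invert_row_indices([1, 2, 5, 6, 7], 9), sizes))
```

As a workaround today, a plan like this could be combined with existing pyarrow calls: take the sizes from `ParquetFile.metadata.row_group(i).num_rows`, read each needed row group with `ParquetFile.read_row_group(i)`, and select rows with `Table.take(local_offsets)`. That still decodes whole row groups, which is the cost the requested feature would presumably avoid.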

Component(s)

Python

ion-elgreco commented 2 months ago

Related: https://github.com/apache/arrow/issues/35301