Open alamb opened 4 days ago
Can I work on this issue? Kindly assign it to me!
Hi @myeunee, you can just comment "take" and it will be automatically assigned to you.
Also in general, feel free to work on any issue -- https://datafusion.apache.org/contributor-guide/index.html#finding-and-creating-issues-to-work-on 🚀
Is your feature request related to a problem or challenge?
Part of https://github.com/apache/datafusion/issues/10922
We are adding APIs to efficiently convert the data stored in Parquet's "PageIndex" into ArrayRefs -- which will make it significantly easier to use this information for pruning and other tasks.
Describe the solution you'd like
Add support to StatisticsConverter::min_page_statistics and StatisticsConverter::max_page_statistics for the types above:
https://github.com/apache/datafusion/blob/a923c659cf932f6369f2d5257e5b99128b67091a/datafusion/core/src/datasource/physical_plan/parquet/statistics.rs#L637-L656
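As a rough illustration of what page-level min/max extraction produces, here is a minimal sketch that uses plain Rust stand-ins rather than the real parquet or arrow types. `PageStats`, the free functions `min_page_statistics` and `max_page_statistics`, and `can_skip_page` are all hypothetical names for this example and are not DataFusion's `StatisticsConverter` methods. The only point is that each data page contributes one nullable entry to the output column, which can then be used for pruning.

```rust
/// Hypothetical stand-in for one data page's entry in a Parquet page index.
/// `None` models a page whose min/max is not recorded (for example, an
/// all-null page).
#[derive(Debug, Clone, Copy, PartialEq)]
struct PageStats {
    min: Option<i64>,
    max: Option<i64>,
}

/// Collect the per-page minimums, one nullable value per page, in page order.
/// A `Vec<Option<i64>>` stands in for a nullable arrow array here.
fn min_page_statistics(pages: &[PageStats]) -> Vec<Option<i64>> {
    pages.iter().map(|p| p.min).collect()
}

/// Collect the per-page maximums, one nullable value per page, in page order.
fn max_page_statistics(pages: &[PageStats]) -> Vec<Option<i64>> {
    pages.iter().map(|p| p.max).collect()
}

/// Pruning check for the predicate `col > threshold`: a page can be skipped
/// only when its max is known and is not greater than the threshold.
/// Pages with unknown statistics must always be read.
fn can_skip_page(max: Option<i64>, threshold: i64) -> bool {
    matches!(max, Some(m) if m <= threshold)
}
```

Keeping unknown statistics as `None` instead of a sentinel value is what lets the pruning check stay conservative: a page with no recorded max is never skipped.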
Describe alternatives you've considered
You can follow the model from @Weijun-H in https://github.com/apache/datafusion/pull/10931
Check::Both, following the model of test_int64: https://github.com/apache/datafusion/blob/a923c659cf932f6369f2d5257e5b99128b67091a/datafusion/core/tests/parquet/arrow_statistics.rs#L506-L529
get_datapage_statistics: https://github.com/apache/datafusion/blob/459afbb3a180d31e7cdefffb46f033069aa47408/datafusion/core/src/datasource/physical_plan/parquet/statistics.rs#L624 (follow the model of the row counts, https://github.com/apache/datafusion/blob/2f4347647172f6997448b2e24d322b50c856f3a0/datafusion/core/src/datasource/physical_plan/parquet/statistics.rs#L90)
Typically the change to the test looks like
Additional context
No response