Open asfimport opened 2 years ago
Ivan Sadikov: I will update the description later, and I would like to open a PR to fix the issue. I think we just need to check whether the column set is empty when checking paths in the ColumnIndexFilter, but I need to confirm this.
Discovered in Spark: when an empty projection is read from a Parquet file with filter pushdown enabled (typically a filter followed by a count), parquet-mr returns the wrong number of rows if the column index is enabled. When the column index feature is disabled, the result is correct.
This happens due to the following:
As a result, the library reports an incorrect number of records.
I will provide the full repro later.
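Until the full repro is available, the suspected failure mode and the proposed empty-column-set check can be illustrated with a toy model. This is only a sketch, not parquet-mr code: `Page`, `filtered_row_count`, and the `guard_empty` flag are hypothetical names, and the assumption that a filter column absent from the projection is treated as a missing (all-null) column has not been confirmed against the library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Page:
    """One data page of a column: its row count and min/max statistics."""
    rows: int
    min_value: int
    max_value: int


def filtered_row_count(column_index, filter_column, lower_bound, projected, guard_empty):
    """Count rows in pages that may satisfy `filter_column > lower_bound`.

    Toy model only. ASSUMPTION: a filter column that is not in the projected
    column set is treated as missing (all nulls), so the predicate matches
    no rows. `guard_empty` switches on the hypothetical fix.
    """
    in_projection = filter_column in projected
    if guard_empty and not projected:
        # Hypothetical fix: an empty projection (e.g. a plain count) should
        # not make every filter column look like a missing column.
        in_projection = True
    if not in_projection:
        # Assumed behavior for a missing column: null > lower_bound is never true.
        return 0
    # Page-level pruning: keep every page whose max could exceed the bound.
    pages = column_index[filter_column]
    return sum(p.rows for p in pages if p.max_value > lower_bound)


# Three pages of 100 rows each for a hypothetical column "id".
index = {"id": [Page(100, 0, 99), Page(100, 100, 199), Page(100, 200, 299)]}

# Empty projection, as in filter + count.
without_guard = filtered_row_count(index, "id", 150, set(), guard_empty=False)
with_guard = filtered_row_count(index, "id", 150, set(), guard_empty=True)
print(without_guard, with_guard)
```

In this model, the unguarded call disagrees with the result obtained when "id" is actually projected, while the guarded call agrees with it. Whether the real ColumnIndexFilter fails in this way is exactly what still needs to be confirmed.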
Reporter: Ivan Sadikov
Note: This issue was originally created as PARQUET-2170. Please see the migration documentation for further details.