The fix isn't quite as simple as removing the final use of the `validate_schema` keyword argument. When identifying which columns to read from the parquet file, it was also necessary to check which are classified as columns rather than indexes. I have also simplified the code a bit, as it no longer needs a separate load of the metadata before creating the `ParquetDataset`.
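For illustration, the column/index distinction can be sketched as below. This is only a sketch under an assumption: that the distinction is read from the JSON metadata pandas stores in the parquet schema under the `pandas` key, whose `index_columns` entry lists the index fields. The helper name `split_columns_and_indexes` and the sample metadata are hypothetical and are not the code in this PR.

```python
import json


def split_columns_and_indexes(schema_names, pandas_metadata_json):
    """Split schema field names into data columns and index fields.

    Hypothetical helper for illustration; not taken from the PR.
    """
    meta = json.loads(pandas_metadata_json)
    # "index_columns" holds either field names (str) for indexes stored as
    # real fields, or dict descriptors (e.g. a RangeIndex) with no stored field.
    index_fields = {
        entry for entry in meta.get("index_columns", []) if isinstance(entry, str)
    }
    columns = [name for name in schema_names if name not in index_fields]
    indexes = [name for name in schema_names if name in index_fields]
    return columns, indexes


# Hand-written sample in the shape pandas writes; not read from a real file.
sample_metadata = json.dumps(
    {
        "index_columns": ["__index_level_0__"],
        "columns": [
            {"name": "x", "field_name": "x"},
            {"name": "y", "field_name": "y"},
            {"name": None, "field_name": "__index_level_0__"},
        ],
    }
)
schema_names = ["x", "y", "__index_level_0__"]
columns, indexes = split_columns_and_indexes(schema_names, sample_metadata)
print(columns)
print(indexes)
```

Only the names in `columns` would then be candidates for the list of columns to read; a default `RangeIndex` is described by a dict entry and has no stored field to exclude.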
This fix works for `pyarrow >= 5` (released July 2021). I will try a separate PR to support earlier `pyarrow` versions, but those changes will be wider-ranging: even before this PR, a number of places in the code did not support `pyarrow < 5`.
Fixes #109.
Test suite passes using the latest `pyarrow == 11.0.0`.