holoviz / spatialpandas

Pandas extension arrays for spatial/geometric operations
BSD 2-Clause "Simplified" License

Fix pyarrow and dask parquet issues #92

Closed. ianthomas23 closed 2 years ago

ianthomas23 commented 2 years ago

This is the final set of fixes identified in issue #86. CI works locally for me now.

The changes all relate to API changes and deprecations in pyarrow between 5.0.0 and 8.0.0. Some changes to our code deal with these directly; others follow from modifications dask made to deal with the same pyarrow changes.

I have created a new script, _create_testdata.py, that creates test parquet files. These are stored in the new tests/test_data directory and are checked as part of pytest. The last time the CI definitely worked was July 2021, with pyarrow==5.0.0 and dask==2021.7.2 (and the same version of distributed). Files created with those versions are successfully read with up-to-date pyarrow==8.0.0 and dask==2022.7.1. Similarly, test parquet files created with the up-to-date pyarrow and dask are successfully read with pyarrow==5.0.0 and dask==2021.7.2.
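For illustration only, the kind of check described above could look roughly like the sketch below. It is not the code in this PR: the helper names `discover_parquet_files` and `read_all` are hypothetical, and the reader is passed in as a plain callable so the sketch does not assume any particular pyarrow or dask API.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def discover_parquet_files(test_data_dir: Path) -> List[Path]:
    """Return the top-level *.parquet entries in test_data_dir, sorted by name.

    Hypothetical helper: the real layout of tests/test_data may differ.
    """
    return sorted(test_data_dir.glob("*.parquet"))


def read_all(paths: Iterable[Path], reader: Callable[[Path], object]) -> Dict[str, str]:
    """Try to read every path and map each file name to "ok" or an error message."""
    results: Dict[str, str] = {}
    for path in paths:
        try:
            reader(path)
            results[path.name] = "ok"
        except Exception as exc:
            # Record the failure and continue, so one bad file does not hide others.
            results[path.name] = f"failed: {exc}"
    return results
```

In a pytest test, `reader` might be `pandas.read_parquet` or the project's own parquet reader, and the test would assert that every value in the returned mapping is `"ok"`. Running the same check under both the old and the new pyarrow/dask versions gives the two-way compatibility described above.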

ianthomas23 commented 2 years ago

CI is passing on GitHub Actions on Ubuntu using Python 3.8. Expanding the test matrix to cover Python 3.7 to 3.10 and all 3 major platforms.