geopandas / pyogrio

Vectorized vector I/O using OGR
https://pyogrio.readthedocs.io
MIT License
260 stars 22 forks source link

TST: drop older GDAL, require shapely for tests, consolidate arrow test parametrize #286

Closed brendan-ward closed 9 months ago

brendan-ward commented 10 months ago

This raises the minimum supported version of GDAL to 3.5.* (note: no compatibility change, just dropping old versions from CI).

This updates the CI matrix to use newer Docker images from ghcr.io (current home of GDAL images).

This adds _compat.py to check the state of optional dependencies and GDAL features like Arrow API or GEOS. This will give us a place in later PRs to set variables to use Arrow by default.

This drops shim to use PyGEOS; otherwise the conditional presence of it was cluttering up tests esp. those around #285. While GeoPandas support should continue to work with PyGEOS instead of Shapely 2.0, we no longer test that specifically and no longer list that as a compatible setup. The tests assume that if GeoPandas is present that Shapely >= 2.0 is present.

This adds a new parametrize @use_arrow which will provide a boolean value of at least False and will include True if arrow support is enabled. This should make it easier to mark additional tests to test with and without arrow support.

This adds new pytest marks to make skipping tests a little easier:

This drops direct tests that use .json extension without specifying driver, because recent versions of GDAL added another driver for that extension and we raise errors when there are multiple. Since this comes from GDAL rather than an intentional breaking change on our end, I didn't note this in the changelog (should we?).

theroggy commented 10 months ago

I wonder, regarding the following remark in #285 :

The issue is that OGR will give different results if GEOS is or is not compiled into GDAL. Using shapely afterward ensures precise results.

Is there now a test environment available in the CI where gdal is compiled without geos?

brendan-ward commented 10 months ago

Is there now a test environment available in the CI where gdal is compiled without geos?

Not that I am aware of; I think we'd have to compile GDAL from source to get that setup. For #285, I opted to use the same approach as for bbox, which means it is limited by GDAL rather than an operation we implement on our end (using shapely). We can sidestep the issue a bit by making sure that GDAL in our wheels is always compiled using GEOS.

theroggy commented 10 months ago

Is there now a test environment available in the CI where gdal is compiled without geos?

Not that I am aware of; I think we'd have to compile GDAL from source to get that setup. For #285, I opted to use the same approach as for bbox, which means it is limited by GDAL rather than an operation we implement on our end (using shapely). We can sidestep the issue a bit by making sure that GDAL in our wheels is always compiled using GEOS.

Ah, OK, I didn't "read" the PR yet. If GDAL is being used for the filtering and it is not planned to implement a fallback (for now) if GDAL isn't compiled with geos it is not as important to have such a test environment.

If a testing environment with GDAL without geos support would be wanted, the alpine-small probably is an easy candidate as mentioned in this comment: https://github.com/geopandas/pyogrio/issues/277#issuecomment-1702612350

brendan-ward commented 9 months ago

I added 3.4 back in as minimum supported version; the one issue I wanted out of the way - detecting GEOS via GDAL - was limited to < 3.4, so this should be fine. Just to stay safe from version-specific behavior, it is probably a good idea for us to continue testing all versions between our minimum and latest, and they are reasonably fast.