GenericMappingTools / pygmt

A Python interface for the Generic Mapping Tools.
https://www.pygmt.org
BSD 3-Clause "New" or "Revised" License
747 stars 216 forks source link

GeoPandas: Migrate the engine from fiona to pyogrio #3231

Closed seisman closed 4 months ago

seisman commented 4 months ago

GeoPandas v1.0.0-alpha1 has been released (https://geopandas.org/en/latest/docs/changelog.html#version-1-0-0-alpha1-apr-13-2024). The biggest change is that the default I/O engine is changed from Fiona to pyogrio.

So, when geopandas v1.0.0 is released, PyGMT will fail to work, because:

  1. Fiona is no longer a required dependency of geopandas, so it's not installed by default
  2. The pyogrio engine doesn't support scheme parameter so some codes will break.

I think we should start the migration now following the instructions.

  1. Add pyogrio as a optional dependency since geopandas 0.x doesn't install it by default
  2. Add engine="pyogrio" to use the pyogrio
  3. Cleanup the codes and avoid using fiona

Thoughts @GenericMappingTools/pygmt-maintainers especially @weiji14?

seisman commented 4 months ago

We also have some mysterious failures with Fiona as mentioned in https://github.com/GenericMappingTools/pygmt/issues/3180#issuecomment-2081801690. Migrating from Fiona to pyogrio also help us avoid such failures.

weiji14 commented 4 months ago

Not sure if we can migrate completely away from fiona yet, at least in terms of writing out OGR_GMT files. Pyogrio doesn't list OGR_GMT under https://pyogrio.readthedocs.io/en/latest/supported_formats.html#full-read-and-write-support, though it seems to depend on what GDAL was compiled with. We might just need to test things out and see first.

seisman commented 4 months ago

OGR_GMT is built-in by default in GDAL (https://gdal.org/drivers/vector/gmt.html), so we should be safe unless some package distributors decide to disable OGR_GMT explicitly.

weiji14 commented 4 months ago

Ok, opened a proof of concept PR at #3237 to switch from fiona to pyogrio. The way it works is that we get a GeoJSON representation from __geo_interface__, which pyogrio reads into a geopandas.GeoDataFrame, and that gets output to an OGR_GMT file.

Downside is that this means geopandas would become a stronger optional dependency. Before, someone could install just fiona without geopandas and plot shapely objects, but now, doing so via pyogrio would require geopandas.

seisman commented 4 months ago

Maybe we can try geopandas first, then pyogrio, then fiona? But we may still see fiona errors reported in https://github.com/GenericMappingTools/pygmt/issues/3180#issuecomment-2081801690.