Open hsteptoe opened 4 months ago
Thanks @hsteptoe, I do think this is interesting. We were all watching SciPy 2024 last week, and noted the prevalence of GeoPandas there. AFAICT there are some solutions out there for importing GeoPandas data into xarray, notably geocube, which is mentioned in the xarray docs and ought to be usable via ncdata. But not, I think, in the reverse direction (i.e. writing xarray data to GeoPandas)?
Just been chatting to @hsteptoe offline. He works with some software that insists on GeoPandas format. I think there are enough other geo-referenced tabular formats, mostly relying on polygon information it seems, that it's a space worth investigating.
I'm wondering about a callable utility that would add Shapely polygon information to a given `Cube` as an `AuxCoord` or `AncillaryVariable`. The existing `iris.pandas` interface could be programmed to detect this and handle it accordingly?
It would presumably also be possible to construct a grid from a series of polygons. This would be required for the reverse interoperability, and I know there are other use cases for this (@gcsima brought me one this year).
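To make the reverse direction concrete, here is a minimal sketch of rebuilding a 2-D array from polygons. It assumes the simplest possible case: every polygon is an axis-aligned rectangle and together they tile a complete rectilinear grid. The helper name `rectangles_to_grid` and the `value` column are hypothetical, not part of Iris or GeoPandas, and real use cases with irregular polygons would need something more general.

```python
import geopandas as gpd
import numpy as np
from shapely.geometry import box


def rectangles_to_grid(gdf, column):
    """Hypothetical helper: rebuild a 2-D array from rectangular cell polygons.

    Assumes axis-aligned rectangles that tile a complete rectilinear grid.
    Returns the array plus the lower x and y edge of each column and row.
    """
    # GeoSeries.bounds is a DataFrame with columns minx, miny, maxx, maxy.
    cell_bounds = gdf.geometry.bounds
    min_x = cell_bounds["minx"].to_numpy()
    min_y = cell_bounds["miny"].to_numpy()

    # The sorted unique lower edges define the grid axes.
    x_lower = np.unique(min_x)
    y_lower = np.unique(min_y)
    if len(x_lower) * len(y_lower) != len(gdf):
        raise ValueError("polygons do not form a complete rectilinear grid")

    # Place each row's value at the grid position of its lower-left corner.
    cols = np.searchsorted(x_lower, min_x)
    rows = np.searchsorted(y_lower, min_y)
    grid = np.full((len(y_lower), len(x_lower)), np.nan)
    grid[rows, cols] = gdf[column].to_numpy()
    return grid, x_lower, y_lower


# Toy 2 x 2 grid of unit squares (illustrative data only).
cells = [box(0, 0, 1, 1), box(1, 0, 2, 1), box(0, 1, 1, 2), box(1, 1, 2, 2)]
toy = gpd.GeoDataFrame({"value": [1.0, 2.0, 3.0, 4.0]}, geometry=cells)
grid, x_lower, y_lower = rectangles_to_grid(toy, "value")
```

The resulting array and edges could then be used to build coordinates and a cube, but how that should look in Iris is exactly the open design question here.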
@hsteptoe might have some spare cycles to look into this, certainly earlier than the Iris maintainers could get to it.
I think my instinct is to add an `iris.geopandas` module, mainly to respect the `pandas` vs `geopandas` difference. A `pandas.DataFrame` isn't automatically recognised as a `geopandas.GeoDataFrame` if it has a column of geometry information, so I don't think users should expect `iris.pandas` to do this.

`geopandas` also doesn't have a native method to take a `pandas.DataFrame` to a `geopandas.GeoDataFrame`, so we might as well write code to do `iris.Cube` <-> `geopandas.GeoDataFrame`, rather than go via a `pandas.DataFrame` as an intermediate step.
My API suggestion would be something like this (equivalent to `iris.pandas`):

>>> import iris
>>> from iris.geopandas import as_geo_data_frame
>>> import geopandas as gpd
>>> cube = iris.load_cube(path)
>>> gdf = as_geo_data_frame(cube)
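
For illustration only, here is a rough sketch of what the core of such a conversion might do for a simple 2-D latitude/longitude grid: one row per cell, with a rectangular polygon built from the cell bounds. It works on plain arrays so it runs without Iris. The helper name `grid_to_geo_data_frame` and its parameters are hypothetical, and nothing here is an agreed design.

```python
import geopandas as gpd
import numpy as np
from shapely.geometry import box


def grid_to_geo_data_frame(data, lon_bounds, lat_bounds, name="data", crs=None):
    """Hypothetical helper: one GeoDataFrame row per grid cell.

    data:       2-D array, shape (n_lat, n_lon)
    lon_bounds: array, shape (n_lon, 2), lower/upper edge of each column
    lat_bounds: array, shape (n_lat, 2), lower/upper edge of each row
    crs:        optional CRS passed straight through to GeoDataFrame
    """
    data = np.asarray(data)
    lon_bounds = np.asarray(lon_bounds)
    lat_bounds = np.asarray(lat_bounds)
    if data.shape != (len(lat_bounds), len(lon_bounds)):
        raise ValueError("data shape must be (n_lat, n_lon)")

    values, geometries = [], []
    for j, (south, north) in enumerate(lat_bounds):
        for i, (west, east) in enumerate(lon_bounds):
            # shapely.geometry.box takes (minx, miny, maxx, maxy).
            geometries.append(box(west, south, east, north))
            values.append(data[j, i])

    return gpd.GeoDataFrame({name: values}, geometry=geometries, crs=crs)


# Toy 2 x 3 grid standing in for a cube's data and coordinate bounds.
toy_data = np.arange(6.0).reshape(2, 3)
toy_lon_bounds = np.array([[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]])
toy_lat_bounds = np.array([[50.0, 51.0], [51.0, 52.0]])
gdf = grid_to_geo_data_frame(toy_data, toy_lon_bounds, toy_lat_bounds)
```

For a real cube, the inputs would presumably come from `cube.data`, `cube.coord("longitude").bounds` and `cube.coord("latitude").bounds` (after `guess_bounds()` where bounds are missing), assuming a 2-D cube ordered (latitude, longitude). Handling extra dimensions, coordinate systems and metadata is left open.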
> I think my instinct is to add an `iris.geopandas` module
I could see the case for not having the existing routines 'magically' do two different things, but I'd still rather see any new routines put into `iris.pandas`, since they are such related concepts.
OK, so `from iris.pandas import as_geo_data_frame` would be a reasonable compromise?
> OK, so `from iris.pandas import as_geo_data_frame` would be a reasonable compromise?
Yes, that's the kind of thing I meant.
- [ ] Need to work out why there seems to be a dependency conflict with `geopandas` and `pyvista`, `vtk` and `geovista`.
https://github.com/SciTools/iris/issues/5517#issuecomment-1771315944
✨ Feature Request
Build an interface for translating between Iris cubes and GeoPandas dataframes.
Motivation
GeoPandas is quickly becoming a key package for working with geospatial data in Python.

We have an `Iris <-> pandas` interface, but should this be extended to GeoPandas?

In principle we could do `Iris <-> pandas <-> GeoPandas`... but we could also make this more user-friendly.

Is this within scope of what Iris should do? Thoughts?