Hello and thank you for the amazing package that computes (exact) zonal statistics in a fast and reliable way. I am using Python with geopandas and xarray/rioxarray together with exactextract and have found an issue in your code.
geopandas uses pyproj, which defaults to WKT2 (WKT2_2019).
rioxarray uses rasterio, which defaults to WKT1 (WKT1_GDAL).
This results in discrepancies even when both datasets have the same CRS (e.g., EPSG:4326). As a result, the CRS comparison gives a warning, even though the underlying coordinate systems are identical ("Spatial reference system of input features does not exactly match raster.")
This behavior can be reproduced with the following example:
import geopandas as gpd
import rioxarray
import xarray as xr
import numpy as np
from pyproj import CRS
from exactextract.exact_extract import crs_matches
from exactextract.feature import GeoPandasFeatureSource
from exactextract.raster import XArrayRasterSource
# Define the CRS as EPSG:4326
crs = CRS.from_epsg(4326)
# Create a simple GeoDataFrame
gdf = gpd.GeoDataFrame(geometry=[], crs=crs)
# Create a simple xarray Dataset
data = np.random.rand(5, 5)
lat = np.linspace(-90, 90, 5) # y-dimension (latitude)
lon = np.linspace(-180, 180, 5) # x-dimension (longitude)
# Create the Dataset
ds = xr.Dataset(
{
"band1": (["y", "x"], data)
},
coords={
"y": lat,
"x": lon,
},
)
ds = ds.rio.write_crs(crs) # Write the CRS to the dataset
# Print WKT representations
print("GeoDataFrame CRS (WKT):")
print(gdf.crs.to_wkt()) # WKT2
# GEOGCRS["WGS 84",ENSEMBLE["World Geodetic System 1984 ensemble",MEMBER["World Geodetic System 1984 (Transit)"],MEMBER["World Geodetic System 1984 (G730)"],MEMBER["World Geodetic System 1984 (G873)"],MEMBER["World Geodetic System 1984 (G1150)"],MEMBER["World Geodetic System 1984 (G1674)"],MEMBER["World Geodetic System 1984 (G1762)"],MEMBER["World Geodetic System 1984 (G2139)"],ELLIPSOID["WGS 84",6378137,298.257223563,LENGTHUNIT["metre",1]],ENSEMBLEACCURACY[2.0]],PRIMEM["Greenwich",0,ANGLEUNIT["degree",0.0174532925199433]],CS[ellipsoidal,2],AXIS["geodetic latitude (Lat)",north,ORDER[1],ANGLEUNIT["degree",0.0174532925199433]],AXIS["geodetic longitude (Lon)",east,ORDER[2],ANGLEUNIT["degree",0.0174532925199433]],USAGE[SCOPE["Horizontal component of 3D system."],AREA["World."],BBOX[-90,-180,90,180]],ID["EPSG",4326]]
print("\nDataset CRS (WKT):")
print(ds.rio.crs.to_wkt()) # WKT1
# GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]
# Attempt to compare WKT strings
if gdf.crs.to_wkt() == ds.rio.crs.to_wkt():
print("\nCRS match.")
else:
print("\nCRS mismatch.")
# Attempt to compare CRS objects as exactextract does
if not crs_matches(GeoPandasFeatureSource(gdf), XArrayRasterSource(ds)):
print("\nCRS mismatch (exactextract).")
else:
print("\nCRS match (exactextract).")
So even though the crs of the GeoDataFrame and the Dataset were created from the same pyproj.CRS object, exactextracts gives a warning that the is a CRS mismatch.
As a possible solution I would propose to just compare EPSG codes instead of WKT representations.
If you have the GDAL Python bindings available, it will use those to check CRS equality. I guess another fallback could be added to use pyproj to do the same.
Hello and thank you for the amazing package that computes (exact) zonal statistics in a fast and reliable way. I am using Python with geopandas and xarray/rioxarray together with exactextract and have found an issue in your code.
The CRS comparison: https://github.com/isciences/exactextract/blob/cd047bb12c073b640332dc8582c0f0c72e173e0b/python/src/exactextract/exact_extract.py#L298-L323 relies on comparing the WKT representation of the CRS. However,
geopandas
andrioxarray
handle WKT strings differently:EPSG:4326
). As a result, the CRS comparison gives a warning, even though the underlying coordinate systems are identical ("Spatial reference system of input features does not exactly match raster."
)This behavior can be reproduced with the following example:
So even though the crs of the
GeoDataFrame
and theDataset
were created from the samepyproj.CRS
object, exactextracts gives a warning that the is a CRS mismatch.As a possible solution I would propose to just compare EPSG codes instead of WKT representations.