Deltares / xugrid

Xarray and unstructured grids
https://deltares.github.io/xugrid/
MIT License
64 stars 8 forks source link

crs is not consistent in Ugrid1d, UgridDataset, from/to_geodataframe, to_netcdf/open_dataset #138

Closed xldeltares closed 1 year ago

xldeltares commented 1 year ago

Description crs is not consistent in the following:

How to reproduce Given:

import xugrid as xu
import geopandas as gpd
from shapely.geometry import LineString, MultiLineString, Point

def lines_gdf() -> gpd.GeoDataFrame:
    points = [Point(-122.3, 47.6), Point(-122.2, 47.5), Point(-122.1, 47.6)]
    lines = [LineString([points[0], points[1]]), LineString([points[1], points[2]])]
    return gpd.GeoDataFrame(geometry=lines, crs=4326)

When creating a Ugrid1d from geodataframe, crs from the geodataframe is not used in Ugrid1d. crs needs to be set again in order for crs to be correctly assigned to Ugrid1d.

grid = xu.Ugrid1d.from_geodataframe(lines_gdf())
grid.crs # returns None
grid.set_crs(4326)
grid.crs # returns crs

Given that now the Ugrid1d does have crs, when creating a UgridDataset from the ugrid1d, the crs is not passed on again. crs needs to be set again in order for crs to be correctly assigned.

uds = xu.UgridDataset(grid.to_dataset(optional_attributes = True))
uds.ugrid.crs # returns {'network1d': None}
uds.ugrid.set_crs(4326)
uds.ugrid.crs # returns crs

Now that I do have crs, when calling to_geodataframe(), the crs is again not passed on. crs needs to be set again in order for crs to be correctly assigned.

gdf = uds.ugrid.to_geodataframe()
gdf.crs # returns None
gdf.set_crs(4326)
gdf.crs # returns crs

Finally, if I write the uds with correct crs to a netcdf file then read it back using open_dataset the crs is lost again. crs needs to be set again in order for crs to be correctly assigned.

uds.to_netcdf("temp.nc")
uds = xu.open_dataset("temp.nc")
uds.ugrid.crs # return None
uds.ugrid.set_crs(4326)
uds.ugrid.crs # return crs

Expected behavior The crs is harmonized among all objects. Otherwise, the user needs to set crs repeatedly.

Huite commented 1 year ago

The inconsistency between geodataframe <-> ugrid shouldn't happen because it's using the same python object. This should be easy to fix.

Going from and to netCDF is more complicated, since there are many ways in how people write CRS information into their netCDF files. I think the CF conventions prescribe writing a grid mapping variable, but I'm pretty sure I have seen other files where there's just an "epsg_code" floating around somewhere.

At any rate, a reasonable fraction seems to contain a grid mapping variable, so we should add some logic to parse and write it. It could get a little tricky since it wouldn't be unreasonable to have the thing refer to ordinary x and y variables of a structured grid... but in general, it seems like a reasonable assumption that if you want to create UgridDataset out of something that xy plane is represented by an unstructured grid, and that the grid mapping refers to that part.

Huite commented 1 year ago

142 also reports the geodataframe conversion issue. This should be fixed now.

With regards to netCDF, I'm closing this in favour of #42, which has a bit more background.