SciTools / iris

A powerful, format-agnostic, and community-driven Python package for analysing and visualising Earth science data
https://scitools-iris.readthedocs.io/en/stable/
BSD 3-Clause "New" or "Revised" License
635 stars 283 forks source link

`_transform_coord_system` geometry is always assume to be `None` #6126

Open hsteptoe opened 2 months ago

hsteptoe commented 2 months ago

🐛 Bug Report

In _shapefiles._transform_coord_system the geometry_system var is always None which results in masking geometries always being assume to be DEFAULT_CS which is a lon/lat CRS.

https://github.com/SciTools/iris/blob/b8f554f7b8899f2ff6754857d59fc48944ef8e9b/lib/iris/_shapefiles.py#L94 👇 https://github.com/SciTools/iris/blob/b8f554f7b8899f2ff6754857d59fc48944ef8e9b/lib/iris/_shapefiles.py#L119

Even if the cube and masking geometry have the same CRS and are in a non-lat/lon CRS (e.g. OSGB) the masking result will wrongly assign the masking geometry a lon/lat CRS. The function doesn't fail, but results in an incorrect masking result.

Expected behaviour

These functions need better handling of non-lat/lon CRSs by retrieving the true CRS (where it exists).

Environment

ESadek-MO commented 2 months ago

@SciTools/peloton.

Can we ask what specifically you want to achieve with this? To change this would require significant high level changes because shapely doesn't support coordinate systems.

hsteptoe commented 2 months ago

@SciTools/peloton.

Can we ask what specifically you want to achieve with this? To change this would require significant high level changes because shapely doesn't support coordinate systems.

At present, trying to mask a cube with an OSGB CRS with a shapefile geometry that also has a OSGB CRS fails to give sensible results because the _transform_coord_system forces the shapefile to assume a lat/lon CRS. If the geometry_system kwarg was better exposed to the masking function this error can be avoided. I'm currently experimenting with a fix...

acchamber commented 2 months ago

Thanks for pointing this out Hamish - some thoughts on why I implemented it how it is (which doesn't mean it shouldn't be fixed!)

As you can guess from the existence of a keyword, I did play with including this option. Shapely doesn't include any metadata when loading a shape - it's just a 2D plane (or 3D if it's a 3D shape). So to test this properly we'd need shapefiles in the right coord system to test this - and when I wrote this mostly thinking of RCM rotated pole use cases, I didn't have any to hand for testing and didn't see a pressing need. If a user did have a shapefile in a odd coord system, then just manually transforming it using cartopy to the default coord system would be a workaround.

I didn't think about the OSGB use case, where you are right this makes substantual sense. There's a large amount of geographic data provided on OSGB grids (we provide climate data on it for a reason) so you can have both your cube and shapefile on the same coord system and Iris refuse to cooperate, which is frustrating as a user.

I'm not sure what fix you are thinking of implementing, but one idea might be a "override" keyword (default False) on mask_cube_from_shapefile where it assumes the Cube and Shapefile are on the same grid already and doesn't try any transformations. That gets ride of much of the testing complexity of doing grid-to-grid conversions between non-standard projections while letting users mask their cubes when they know it should just work.

hsteptoe commented 2 months ago

Yes, it might be that a pragmatic solution like an 'override' keywords is best... but equally exposing the geometry_system is actually pretty helpful. There is nothing wrong with doing a transform, but we need an adjustment so the masking process isn't actively detrimental. Assuming a default CRS might be fine where the units of the cube are in degrees.

I'm increasingly viewing geopandas as a better way to handle shapefiles as it handles loading and attaches CRS data to its geodataframes that facilitates easier translation between different CRSs, but this would be a more substantial change and an extra dependency (which is currently not resolvable https://github.com/SciTools/iris/issues/6047#issuecomment-2231109447)