pytroll / pyresample

Geospatial image resampling in Python
http://pyresample.readthedocs.org
GNU Lesser General Public License v3.0
344 stars 95 forks source link

Bug where boundary coordinates wrap around 180 degrees when geostationary AreaDefinition extends beyond 180 degrees #530

Open BENR0 opened 1 year ago

BENR0 commented 1 year ago

When the extent of geostationary AreaDefinitions extend beyond 180 degrees boundary coordinates wrap around 180 degrees and derived shapely.Polygons are wrong (this was discovered using the boundary method but due to bug #529 I am using get_geostationary_bounding_box_in_lonlats here instead).

from shapely.geometry import Polygon
from satpy.resample import get_area_def
from pyresample.geometry import get_geostationary_bounding_box_in_lonlats

goes_west = get_area_def("goes_west_abi_p_1km")

x, y = get_geostationary_bounding_box_in_lonlats(geos_west)
coords = list(zip(x, y))
print(coords)
Polygon(coords)

boundary_points_and_polygon boundary_bokeh_plot

Problem description

In this case the top left corner of the boundary gets wrapped around 180 degrees and therefore the resulting polygon is wrong. It can't be used for geo selecting points and when displayed the polygon get's further simplified because I think polygon lines are not allowed to cross themselves.

The reason for this is that Proj by default wraps coordinates during transformation. This can be changed by adding +over to the proj4 CRS string. In Pyresample WKT CRS is used as should be and as far as I could find out this flag can not be directly set in WKT CRS. Instead if a CRS is initialized with +over the WKT representation gets a REMARK section with the proj4 string which seems to be respected when doing transformations with Proj.

goes_west_copy = goes_west.copy(projection=goes_west.crs.to_proj4() + " +over")

This leads to the expected output below.

Expected Output

boudary_points_and_polygon_expected boundary_bokeh_expected_output

Versions of Python, package at hand and relevant dependencies

Python: 3.10.8 Pyresample: v1.27.1

djhoese commented 1 year ago

There is an old PROJ mailing list thread started by me where I discuss this +over with one of the PROJ developers. My understanding is that +over should not be used and may be considered deprecated. Instead, it was recommended to me to use +pm=180 or some other equivalent to basically define the projection with the prime meridian not at 0 but instead at 180 degrees. This means your data/polygon values have a contiguous -20 to 20 degrees (pm=180) rather than 160 to -160 (pm=0).

So the longitudes are in the range -180 to 360 technically? Basically going from ~170 to ~220 to keep the pixel offset always positively increasing. Or wait, who is doing the wrapping here? What are the longitudes produced by the AreaDefinition? What projection are you using for mapping/drawing?

BENR0 commented 1 year ago

I don't now if this bug is also relevant to the getting full lat/lon arrays. The reason for the bug is that for the boundary calculation in lons/lats the bounding box is calculated in projection coords (geostationary) and then "reprojected" using Proj (https://github.com/pytroll/pyresample/blob/07382ccac9116620c3a9caa6a47e8dadf64c565d/pyresample/geometry.py#L2823) which per default wraps coordinates if they cross the 180 degree line which is the case for the GOES West AreaDefinition.

I also got the feeling that the usage of +over was not really encouraged but it was the only way I found to get to the "correct" results. In any case its not only a matter of mapping/drawing (this was only to make the problem clear) because if you look at the coordinates and the shapely polygon it is obvious that the boundary is wrong. So if I would do a spatial selection/sjoin I would get wrong results.

While the pm=180 does not mess up the geometry of the polygon the coordinates are shifted and downstream tasks like mapping/selecting are rather difficult and counterintuitive. I think most of the time the recommendation is to split the polygons at the dateline (https://towardsdatascience.com/around-the-world-in-80-lines-crossing-the-antimeridian-with-python-and-shapely-c87c9b6e1513). I don't know what the best way to go here is. I guess this also depends on the usage of the boundary function.

While a different discussion and maybe special to my usage; but I think it would make sense if boundary would return a shapely polygon instead of coordinate arrays. That would make it easier to work with (e.g. plotting/ geopandas).

djhoese commented 1 year ago

Some of the work by @ghiggi handles these cases by internally having a Boundary object that contains two shapely polygons for the dateline cases. I don't remember where that implementation lives between future boundary/spherical polygon classes and current classes.

I think I'd prefer if the lon/lat coordinate generation for the bounds did a wrap check manually (especially since bounding coordinates should not have very many elements). So if max(lon) - min(lon) is greater than 355 and we're not touching the north or south pole, then let's add 360 to the < 0 coordinates. I would at least expect that to work better than what is currently produced and doesn't depend on possibly deprecated PROJ flags. I'm not sure if this would break @ghiggi's checks in the boundary code.

What do you think?

ghiggi commented 1 year ago

I am a bit tired right now, but I think the issue here @BENR0 is something else. I guess your Polygon vertices are defined clockwise, while shapely expects to be defined as counterclockwise. Can you try to plot with counterclockwise shapely polygon vertices (with i.e. [, ::-1]) and use the the cartopy Geodetic CRS?

ghiggi commented 1 year ago

@djhoese in the future spherical polygon interface I found a way to not need two polygons in the dateline cases :D I just need to find the time to prepare the PR and implement the test units ...

BENR0 commented 1 year ago

@djhoese I agree. The PROJ flags solution, while working or at least giving the results I wanted, seems a little fishy. A manual check would be better I think. From what I grasp I am not sure if it s possible to get away with a single polygon instead of two if it would be crossing the date line but I am very curious what you've come up with @ghiggi.

@ghiggi the counterclockwise solution won't work, at least for the "distorted" shapely polygon display, because the one coordinate that is over the dateline and thus positive is due to the inverse projection using Proj and shapely does not know anything about the projection/wrapping.

Nevertheless I found out something interesting; the geo plot you see above is made from a geopandas dataframe using hvplot using bokeh which is wrong obviously. I now also did a plot just with cartopy and in that case the polygon looks right. Interestingly this is the case in the clockwise and counterclockwise case even though there is some weird artefact with the background map in the clockwise case. But still cartopy obviously knows how to figure out the dateline crossing.

Spatial join also seem to be correct in the clockwise and counterclockwise cases

So I guess this is a bug with hvplot/bokeh somehow which mislead me.

I don't know if this is still a bug then? I guess it would make sense to change the default to counterclockwise since this is also what shapely expects?

ghiggi commented 1 year ago

All pyresample polygon spherical routines expect clockwise-ordered vertices ... and would be an ultra-pain to change. To what you refer with spatial join? Because the vertices orders defines what is the outside and inside of a polygon. And cartopy expects counterclockwise vertices. You don't see the weird behavior with clockwise vertices because you just plot the border of the polygon ... but try to fill it and you will see the problem :smile:

Did you try to use Geoviews for the plots? You would still rely on bokeh but you could specify the cartopy geodetic crs and solve the issue I guess ....