pytroll / pyresample

Geospatial image resampling in Python
http://pyresample.readthedocs.org
GNU Lesser General Public License v3.0
344 stars 95 forks source link

Index Error when calling `boundary` with non full disk geos ara #529

Open BENR0 opened 1 year ago

BENR0 commented 1 year ago

When trying to get the boundary of a non full disk geostationaryAreaDefinition currently an IndexError will be thrown.

from satpy.resample import get_area_def

goes_east = get_area_def("goes_east_abi_c_1km")

goes_east.boundary()

Expected Output

The boundary lons/lats should be returned as in the full disk case.

Actual Result, Traceback if applicable

IndexError                                Traceback (most recent call last)
Cell In[3], line 5
      1 from satpy.resample import get_area_def
      3 goes_east = get_area_def("goes_east_abi_c_1km")
----> 5 goes_east.boundary()

File /dev/pytroll/pyresample/pyresample/geometry.py:1614, in AreaDefinition.boundary(self, frequency, force_clockwise)
   1612 from pyresample.boundary import AreaBoundary
   1613 if self.is_geostationary:
-> 1614     lon_sides, lat_sides = self._get_geo_boundary_sides(frequency=frequency)
   1615 else:
   1616     lon_sides, lat_sides = self.get_bbox_lonlats(frequency=frequency,
   1617                                                  force_clockwise=force_clockwise)

File /dev/pytroll/pyresample/pyresample/geometry.py:1585, in AreaDefinition._get_geo_boundary_sides(self, frequency)
   1580 # Retrieve dummy sides for GEO (side1 and side3 always of length 2)
   1581 side02_step = int(frequency / 2) - 1
   1582 lon_sides = [lons[slice(0, side02_step + 1)],
   1583              lons[slice(side02_step, side02_step + 1 + 1)],
   1584              lons[slice(side02_step + 1, side02_step * 2 + 1 + 1)],
-> 1585              np.append(lons[side02_step * 2 + 1], lons[0])
   1586              ]
   1587 lat_sides = [lats[slice(0, side02_step + 1)],
   1588              lats[slice(side02_step, side02_step + 1 + 1)],
   1589              lats[slice(side02_step + 1, side02_step * 2 + 1 + 1)],
   1590              np.append(lats[side02_step * 2 + 1], lats[0])
   1591              ]
   1592 return lon_sides, lat_sides

IndexError: index 49 is out of bounds for axis 0 with size 7

Versions of Python, package at hand and relevant dependencies

Python: 3.10.8 Pyresample: v1.27.1

djhoese commented 1 year ago

Oh no, what terrible timing to bring up this issue. @mraspaud just started his month long holiday and he has the most familiarity with this. It is possible some of this might get cleaned up with:

https://github.com/pytroll/pyresample/pull/526

since there was confusion caused by the naming of the frequency/number of pixels when generating the lon/lat arrays.

The frequency should have been set to 50 by default if I'm reading the source code correctly. So that's side02_step = 50 / 2 - 1 = 24. So then the index here would be 24 * 2 + 1 = 49, so that makes sense. The geostationary methods for generating the lons/lats seem to use np.linspace to generate the original bounding arrays so then the only reason for this to get cut down to 7 (as the error message says) is if this intersection produces that small result:

https://github.com/pytroll/pyresample/blob/07382ccac9116620c3a9caa6a47e8dadf64c565d/pyresample/geometry.py#L2785-L2793

So is it possible this intersection logic is not handling the anti-meridian or something? Oh but your example here is GOES East...that should be fine, right? You/we might need to add some prints to figure out why the geos boundary is so small (7 elements).

djhoese commented 1 year ago

Oh or is this a shapely or numpy change that is causing this?

BENR0 commented 1 year ago

I saw #526 but did not read through it yet. From the code I was a little confused since frequency did not really match up with the number of points I got which I guess is the reason for the mentioned PR. I don't think this is due to numpy or shapely since get_geostationary_bounding_box_in_lonlats works fine. Additionally for a full disk AreaDefinition there is no IndexError.

I think the reason is that for a non full disk AreaDefinition the lat/lon arrays are calculated with the given frequency for the full disk but are later "cut" down due to the intersection but at the same time theboundary method still uses the "full" frequency/number of points for the slicing.

djhoese commented 1 year ago

This might have to be a @mraspaud answer (and @pnuu if he's available). Basically I see two possible solutions:

  1. Do a min on the index used in the indexing operation shown in the original traceback. So it figures out the indexing it should used based on the frequency, but goes no further than the number of points in the array...is this possible? Maybe not because we don't know how to divide the polygon's coordinates into 4 sides which is what we're trying to do, right? Or...do we do min/max to get points that are on each half of the mid-lines (one horizontal midline, one vertical) to determine which points are left side, top side, etc.
  2. Update the geostationary bounding box functions to not only use the number of points as a starting point for the full disk geostationary bounding polygon, but also make sure the subset results return that number of points.
BENR0 commented 1 year ago

I think option 2. would be the cleaner one but might involve more refactoring. My first guess for just solving the problem was more on the line of option 1. if I understand it correctly.