pytroll / pyresample

Geospatial image resampling in Python
http://pyresample.readthedocs.org
GNU Lesser General Public License v3.0
344 stars 95 forks source link

Fix: data_reduce with Gaussian reduced grids #521

Closed AlexandreMary closed 1 year ago

AlexandreMary commented 1 year ago

Resampling from Gaussian reduced grids (seen as a 2D masked array with missing values at the end of each row) was failing, trying to reduce data. This correction (using np.round instead of round) escapes the reduction in this case, and the resampling then works fine (albeit slower than optimal maybe). Open to any better option.

djhoese commented 1 year ago

Thanks for putting a fix together. Just to make sure I understand, it isn't that this broke in some way, it is that this never worked at all, right?

For your data, the problem is the edge longitudes are masked and/or NaNs, right? So the final angle_sum is probably masked or NaN? What is the end result you're hoping for? Should it do any data reduction? If so, how does that work with these masked values? I suppose the invalid lon/lat check near the top of the function needs to be updated to handle masked arrays in which case your data would get returned with no reduction, right?

And you say it was failing, were you getting an error? Could you paste it here so others will find this PR when googling the error message?

AlexandreMary commented 1 year ago

Yes, I think it never worked. Indeed the edge lons/lats are Masked, ending up in angle_sum being MaskedConstant and the resulting error: TypeError: type MaskedConstant doesn't define __round__ method

From what I understand of this data reduction, indeed, it would require further development to implement an actual reduction in this case, which is probably not a priority. So indeed, I'm happy with no data reduction (performance-wise).

An update of the detection of illegal lons/lats could also be a way of working around the issue, indeed.

Thanks

AlexandreMary commented 1 year ago

Unless you can suggest me another way of dealing with Reduced Gauss grids (such as IFS or Arpege NWP models')...

coveralls commented 1 year ago

Coverage Status

coverage: 93.846%. remained the same when pulling 12333552dd0c13f361e316ba40cb6cb24048b83b on AlexandreMary:fix_gauss_grids into 2b94b704365507a3ac9b2de9e9a25e5414792b73 on pytroll:main.

codecov[bot] commented 1 year ago

Codecov Report

Merging #521 (1233355) into main (2b94b70) will not change coverage. The diff coverage is 100.00%.

:exclamation: Current head 1233355 differs from pull request most recent head b8c9087. Consider uploading reports for the commit b8c9087 to get more accurate results

@@           Coverage Diff           @@
##             main     #521   +/-   ##
=======================================
  Coverage   94.32%   94.32%           
=======================================
  Files          79       79           
  Lines       12944    12944           
=======================================
  Hits        12209    12209           
  Misses        735      735           
Flag Coverage Δ
unittests 94.32% <100.00%> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pyresample/data_reduce.py 94.44% <100.00%> (ø)
djhoese commented 1 year ago

Unless you can suggest me another way of dealing with Reduced Gauss grids (such as IFS or Arpege NWP models')...

I understand about half of what you said so probably not.

I think I'd prefer instead of this np.round change to change it to be more explicit and check for masked values or NaNs. You could/should maybe do a .filled(np.nan) call on the lon/lat side checks before any() is called. At least the possibility of a masked array needs to be handled.

AlexandreMary commented 1 year ago

OK, I removed the np.round towards a detection of the lons_side* being actually masked arrays. More clean, indeed.

AlexandreMary commented 1 year ago

OK, I tried to add a test (in test_data_reduce) but can't figure out how it works exactly, valid_index is 1D and bigger than the first dimension (nlats):

    def test_area_con_reducedgauss_reduce(self):
        nlats = 2000
        max_nlons = 2 * nlats
        lats_resolution = 180 / (nlats + 1)
        lons_resolution = 360 / max_nlons
        lons = np.ma.masked_all((nlats, max_nlons))
        lats = np.ma.masked_all((nlats, max_nlons))
        for i in range(nlats):
            nlons = max(8, 2 * i)
            lats[i, :] = -90 + (i + 1) * lats_resolution
            for j in range(nlons):
                lons[i, j] = j * lons_resolution
        reduced_gauss_grid_def = geometry.GridDefinition(lons, lats)
        data = np.fromfunction(lambda y, x: (y + x), (nlats, max_nlons))
        lons = np.fromfunction(
            lambda y, x: -180 + (360.0 / 100) * x, (100, 100))
        lats = np.fromfunction(
            lambda y, x: -90 + (180.0 / 100) * y, (100, 100))
        grid_lons, grid_lats = reduced_gauss_grid_def.get_boundary_lonlats()
        valid_index = get_valid_index_from_lonlat_boundaries(grid_lons, grid_lats,
                                                             lons, lats, 7000)
        data = data[valid_index]
        cross_sum = data.sum()
        expected = 20685125.0
        self.assertAlmostEqual(cross_sum, expected)

But after all does it make sense to make a test on a case we know is not implemented for data reduction ? More radically, we could also raise an error in case of masked grid, and so use reduce_data=False in the calling kd_tree functions (get_neighbour_info).

djhoese commented 1 year ago

Sorry for the slow reply. I was on a vacation. What do you mean a case that is not implemented?

For the 1D issue and how it works...I'm honestly not sure as I haven't used this function before or at least not directly. It may be that it is a flattened version of the array.

AlexandreMary commented 1 year ago

I meant that the data reduction in the case of a masked grid is not implemented (as we end up with masked constants or NaN in the angle_sum). A cleaner fix would be to implement this data reduction for masked grids, but that is probably not a priority, for me nor for you.

So in the end, I can live without the fix, calling get_neighbour_info(..., data_reduction=False) and we can close this PR.

djhoese commented 1 year ago

I'm tempted to say this is such a simple fix that we should include it. You somehow stumbled on it so it might come up again later. Having a test for it ensures (or at least it should) that the functionality is retained between refactors.

djhoese commented 1 year ago

If you have no desire to continue the fix for this and will just workaround it then I'm OK closing it. I have no use for it myself so I can't pretend I'll volunteer to work on or maintain it. We can reopen this if you'd like.

AlexandreMary commented 1 year ago

OK for me. Thanks