pytroll / pyresample

Geospatial image resampling in Python
http://pyresample.readthedocs.org
GNU Lesser General Public License v3.0
346 stars 94 forks source link

EWA resampling in 1.27 slows down four times than 1.26.1 #517

Closed yukaribbba closed 1 year ago

yukaribbba commented 1 year ago

Code Sample, a minimal, complete, and verifiable piece of code

Some regular satpy codes here

area_id = "abc"
projection = some proj4 strings
area_def = create_area_def(area_id=area_id, projection=projection, resolution=250)
scn = scn.resample(area_def, resampler="ewa", weight_delta_max=10, weight_distance_max=1)

Problem description

To produce a 250m rayleigh-corrected true color scene from MODIS L1B: 1.26.1: 83.054s (with CPU almost fully loaded) 1.27: 321.692s (CPU usage around 20% all the time)

Versions of Python, package at hand and relevant dependencies

python: 3.11.3 satpy: 0.42.3-dev dask: 2023.5.0

yukaribbba commented 1 year ago

I see. I agree that we'd better not do that round for precision reasons. So what about the quality?

djhoese commented 1 year ago

What do you mean quality? I think rounding the stuff in the dynamic area definition to a certain number of decimal points is fine. Especially since it will have the explicit purpose of preserving what the user asked for.

djhoese commented 1 year ago

Hm even when I do less math to get the extent by doing:

            left_x = corners[0] - x_resolution / 2
            bot_y = corners[1] - y_resolution / 2
            area_extent = (left_x,
                           bot_y,
                           left_x + width * x_resolution,
                           bot_y + height * y_resolution)

inside the DynamicAreaDefinition I still get a little floating point precision error at the end.

yukaribbba commented 1 year ago

Well actually I was talking about the image quality there...

djhoese commented 1 year ago

Yeah, sorry, I wasn't confusing rounding and quality, I just wasn't sure what your concern was with quality. I assumed that this new version of the code where both resolutions (x and y) end up really really close to the requested resolution meant the resulting image was equivalent to P2G v2.3.

yukaribbba commented 1 year ago

I assumed that this new version of the code where both resolutions (x and y) end up really really close to the requested resolution meant the resulting image was equivalent to P2G v2.3

Yeah I was also hoping that but.... This code: image P2G 2.3: image

Looks like the resolution and the quality are two things here.

djhoese commented 1 year ago

What instrument is this? How did you generate the bad quality one? Same resampling algorithm? What is the geolocation information in the geotiff (resolution, projection, size, etc)?

yukaribbba commented 1 year ago

Oh sorry it's the same modis scene as above. The code/command lines are also same here.

djhoese commented 1 year ago

The default true color for MODIS in satpy is not sharpened and does not match what P2G 3.0 does sharpening-wise. Therefore I would expect reduced quality.

yukaribbba commented 1 year ago

Hmmm that's a P2G 2.3 image. Does 2.3 also do the sharpening?

yukaribbba commented 1 year ago

So I got another test on the true color but with RatioSharpenCompositor. The code and scene remain same.

RatioSharpenCompositor image

GenericCompositor image

P2G 2.3 image

Whether it's sharpened or not, it looks blurred compared with the 2.3 version.

djhoese commented 1 year ago

What does your sharpened YAML definition look like? What files are you providing to Satpy and Polar2Grid?

yukaribbba commented 1 year ago

Whoops....I got wrong YAML file which I tested for the neutral_resolution...Sorry for the mess... After sharpening it does look like 2.3.

yukaribbba commented 1 year ago

Back to that round topic. Can we expand the area just a little larger instead of rounding it? Like:

# extent calculated by DynamicArea
min_x = -1977413.430
min_y = 2143765.270
max_x = 849533.688
max_y = 4596403.063
res = 250

if (max_x - max_y) % res != 0:
    min_x = round(min_x)
    max_x = round(max_x)
    rem = (max_x - min_x) % res
    add = 2 * res - rem
    if add % 2 == 0:
        min_x = min_x - add / 2
        max_x = max_x + add / 2
    else:
        min_x = min_x - (add + 1) / 2
        max_x = max_x + (add - (add + 1) / 2)
print(min_x, max_x)
-1977565.0 849685.0

This way we can find the extent values that are just evenly divisible by the resolution, without hurting any precisions.

djhoese commented 1 year ago

I'm not sure I understand the math here (mainly the last if/else block). I'm also not sure how safe doing round(min_x) is for degrees projections or any other units that might come up (kilometers). What about a "rounding" the extents to the nearest multiple of resolution using floor and ceil:

min_x = -1977413.430
min_y = 2143765.270
max_x = 849533.688
max_y = 4596403.063
res = 250

import math

math.floor(min_x / res) * res
# -1977500

math.ceil(max_x / res) * res
# 849750
djhoese commented 1 year ago

Or I guess we could even do half or quarter of res to not do entire pixel shifts of the extent.

yukaribbba commented 1 year ago

Ok I got a little fix here.

# extent calculated by DynamicArea
min_x = -1977413.430
min_y = 2143765.270
max_x = 849533.688
max_y = 4596403.063
res = 250

if (max_x - min_x) % res != 0:
    from decimal import Decimal
    min_x = Decimal(str(min_x))
    max_x = Decimal(str(max_x))
    res = Decimal(str(res))
    # this is the gaps between the ideal extent and the reality
    rem = (max_x - min_x) % res
    # we'll expand the whole scene by 2 pixels in width, so we need to add some values
    add = 2 * res - rem
    # split the add value into 2 parts, one for min_x, another for max_x
    min_x = min_x - add / 2
    max_x = max_x + add / 2
    print(min_x, max_x)

-1977564.871 849685.129

It's also fit for decimal degrees.

min_x = -20.023674
min_y = -4.375472
max_x = 70.547499
max_y = 25.520567
res = 0.00225

-20.0249625 70.5487875
djhoese commented 1 year ago

You might have to adjust your test case. Your rem ends up being 0.000. I'm guessing Decimal is needed to try to preserve precision? Your if statement uses x and y which I'm guessing is a typo? I'm also thinking we may not need to do the if statement at all and just always apply this rounding (if it isn't needed the math will work that out). Another thought is, does it have to be 2 pixels in add?

I like the cleanliness of my ceil/floor solution, but yeah it could be surprising to some users (maybe?). Oh actually...I think my solution is the same as yours but instead of splitting add in half it splits it to make the extents "clean" numbers aligned with the resolution. I feel like not having any decimal portion is maybe the best way to "guarantee" no floating point issues.

yukaribbba commented 1 year ago

That's a typo. I just edited it. Decimal' s precission could be very high at least I tried print(Decimal('3.1415926535684573278456894357832784785478478666666666666667')) and got the same. We don't need to have 2 pixels. One could be enough. But finally I think you're right. It's better to leave decimal degree alone. So your code is better in this case.