pytroll / pyresample

Geospatial image resampling in Python
http://pyresample.readthedocs.org
GNU Lesser General Public License v3.0
348 stars 94 forks source link

Resampling of swath data - result shape difference #72

Closed bmurch closed 6 years ago

bmurch commented 7 years ago

Hello, I'm new to this, but we are looking at using pyresample to speed up our mapping routines of h5 L2 satellite data from polar orbiters. We are currently using GPT provided in BEAM which is java based and very VERY slow.

However, when I tried to use your samples as in the docs:

#!/usr/bin/env python3

import pyresample as pr
from pyresample import kd_tree, image, bilinear, geometry
import numpy as np

area_def = geometry.AreaDefinition('areaD', 'Europe (3km, HRV, VTC)', 'areaD',{'a': '6378144.0', 'b': '6356759.0','lat_0': '50.00', 'lat_ts': '50.00','lon_0': '8.00', 'proj': 'stere'},800, 800,[-1370912.72, -909968.64,1029087.28, 1490031.36])
data = np.fromfunction(lambda y, x: y*x, (50, 10))
lons = np.fromfunction(lambda y, x: 3 + x, (50, 10))
lats = np.fromfunction(lambda y, x: 75 - y, (50, 10))
swath_def = geometry.SwathDefinition(lons=lons, lats=lats)

#using pyresample.image
swath_con = image.ImageContainerNearest(data, swath_def, radius_of_influence=5000)
area_con = swath_con.resample(area_def)
swath_con_result = area_con.image_data

#using pyresample.kd_tree
resample_nearest_result = kd_tree.resample_nearest(swath_def, data, area_def, radius_of_influence=50000,epsilon=10)
resample_gauss_result = kd_tree.resample_gauss(swath_def, data, area_def, radius_of_influence=50000,sigmas=100)

#using pyresample.bilinear (it appears that the gridding does not take place?
resample_bilinear_result = bilinear.resample_bilinear(data, swath_def, area_def,radius=30000, neighbours=15,nprocs=1, fill_value=0,reduce_data=True, segments=None,epsilon=0)

print("Starting data:")
print("data.min(): ", data.min())
print("data.max(): ", data.max())
print()
print("image.ImageContainerNearest method:")
print("swath_con_result.shape: ", swath_con_result.shape)
print("swath_con_result.size: ", swath_con_result.size)
print("swath_con_result.min(): ",swath_con_result.min())
print("swath_con_result.max(): ",swath_con_result.max())
print()
print("kd_tree.resample_nearest method:")
print("resample_nearest_result.shape: ", resample_nearest_result.shape)
print("resample_nearest_result.size: ", resample_nearest_result.size)
print("resample_nearest_result.min(): ", resample_nearest_result.min())
print("resample_nearest_result.max(): ", resample_nearest_result.max())
print()
print("kd_tree.resample_gauss method:")
print("resample_gauss_result.shape: ", resample_gauss_result.shape)
print("resample_gauss_result.size: ", resample_gauss_result.size)
print("resample_gauss_result.min(): ", resample_gauss_result.min())
print("resample_gauss_result.max(): ", resample_gauss_result.max())
print()
print("bilinear.resample_bilinear method:")
print("resample_bilinear_result.shape: ", resample_bilinear_result.shape)
print("resample_bilinear_result.size: ", resample_bilinear_result.size)
print("resample_bilinear_result.min(): ", resample_bilinear_result.min())
print("resample_bilinear_result.max(): ", resample_bilinear_result.max())

The results.shape is unexpected for the resample_bilinear_result when I run the resample_test.py file from the code above eg:

./resample_tests.py Starting data: data.min(): 0.0 data.max(): 441.0

image.ImageContainerNearest method: swath_con_result.shape: (800, 800) swath_con_result.size: 640000 swath_con_result.min(): 0.0 swath_con_result.max(): 297.0

kd_tree.resample_nearest method: resample_nearest_result.shape: (800, 800) resample_nearest_result.size: 640000 resample_nearest_result.min(): 0.0 resample_nearest_result.max(): 297.0

kd_tree.resample_gauss method: resample_gauss_result.shape: (800, 800) resample_gauss_result.size: 640000 resample_gauss_result.min(): 0.0 resample_gauss_result.max(): 297.0

bilinear.resample_bilinear method: resample_bilinear_result.shape: (640000, 1) resample_bilinear_result.size: 640000 resample_bilinear_result.min(): 0.0 resample_bilinear_result.max(): 0.0

And regardless of what I change in the bilinear.resample_bilinear method, I always get 0 min and max vals and I'm only guessing, but is that because the data isn't gridded at this point?

Thanks, Brock

bmurch commented 7 years ago

Further, I tried to do it the Resampling from bilinear coefficients way from the documentation:

import numpy as np
from pyresample import bilinear, geometry
target_def = geometry.AreaDefinition('areaD', 'Europe (3km, HRV, VTC)', 'areaD', {'a': '6378144.0', 'b': '6356759.0','lat_0': '50.00', 'lat_ts': '50.00','lon_0': '8.00', 'proj': 'stere'}, 800, 800,[-1370912.72, -909968.64, 1029087.28, 1490031.36])
data = np.fromfunction(lambda y, x: y*x, (50, 10))
lons = np.fromfunction(lambda y, x: 3 + x, (50, 10))
lats = np.fromfunction(lambda y, x: 75 - y, (50, 10))
source_def = geometry.SwathDefinition(lons=lons, lats=lats)
t_params, s_params, input_idxs, idx_ref = bilinear.get_bil_info(source_def, target_def, radius=50e3, nprocs=1)
res = bilinear.get_sample_from_bil_info(data.ravel(), t_params, s_params, input_idxs, idx_ref)
print("res.shape: ", res.shape)
print("res.min(): ", res.min())
print("res.max(): ", res.max())
# Adding in the undocumented output_shape
res = bilinear.get_sample_from_bil_info(data.ravel(), t_params, s_params,input_idxs, idx_ref, output_shape=target_def.shape)
print("res.shape: ", res.shape)
print("res.min(): ", res.min())
print("res.max(): ", res.max())

For which I get: res.shape: (640000,) res.min(): nan res.max(): nan

and while it is the proper shape no data res.shape: (800, 800) res.min(): nan res.max(): nan

When I try my own data: result = bilinear.resample_bilinear(np_chlor_a, swath_def, area_def, radius=2000, neighbours=5, fill_value=-32767, reduce_data=True, segments=None, epsilon=0) /home/b/bmurch/.local/lib/python3.5/site-packages/pyresample/kd_tree.py:383: UserWarning: Possible more than 5 neighbours within 2000 m for some data points (neighbours, radius_of_influence))

I get: result.shape: (1210000, 1) result.min(): -32767.0 result.max(): 11.733893451624784

When I use my data in second method: t_params, s_params, input_idxs, idx_ref = bilinear.get_bil_info(swath_def, area_def, radius=2000, nprocs=1) res = bilinear.get_sample_from_bil_info(np_chlor_a.ravel(), t_params, s_params,input_idxs, idx_ref, output_shape=area_def.shape)

I get: res.shape: (1100, 1100) res.min(): nan res.max(): nan

I should mention original data: np_chlor_a.max() : 58.126575 np_chlor_a.min(): -32767.0

djhoese commented 6 years ago

There is currently a pending pull request to fix some of the NaN issues (see #75). For the unexpected shape maybe @pnuu can give some hint as to why this is happening.

pnuu commented 6 years ago

The problem here is that the input data is too sparse compared to the output grid, and there are not enough neighbors inside the search radius to get valid values. My error, evidently. I'll update the documentation ASAP :-) Using these data, lons and lats, you should get a proper result:

data = np.fromfunction(lambda y, x: y*x, (500, 100)) lons = np.fromfunction(lambda y, x: 3 + x * 0.1, (500, 100)) lats = np.fromfunction(lambda y, x: 75 - y * 0.1, (500, 100))

As for your own data, I'll assume the same is happening. Also, you should allow more then 5 neighbors to be certain that the target pixels are having neighbors all around them (with both smaller and larger longitudes and latitudes). Try with the default 32 neighbors and after that increase the search radius. It should be always OK to use way-too-large radius to ensure the proper neighbors. I hope...

bmurch commented 6 years ago

@pnuu I gave your suggestions a try:

#!/usr/bin/env python3

import pyresample as pr
from pyresample import bilinear, geometry
import numpy as np

target_def = geometry.AreaDefinition('areaD', 'Europe (3km, HRV, VTC)', 'areaD', {'a': '6378144.0', 'b': '6356759.0','lat_0': '50.00', 'lat_ts': '50.00', 'lon_0': '8.00', 'proj': 'stere'}, 800, 800,[-1370912.72, -909968.64, 1029087.28, 1490031.36])

data = np.fromfunction(lambda y, x: y*x, (500, 100))

lons = np.fromfunction(lambda y, x: 3 + x * 0.1, (500, 100))

lats = np.fromfunction(lambda y, x: 75 - y * 0.1, (500, 100))

source_def = geometry.SwathDefinition(lons=lons, lats=lats)

result = bilinear.resample_bilinear(data, source_def, target_def, radius=50e3, neighbours=32, nprocs=1, fill_value=0, reduce_data=True, segments=None, epsilon=0)
print("result.shape: ", result.shape)

It produced: $ ./resample_bilinear.py /home/b/bmurch/.local/lib/python3.5/site-packages/pyresample/kd_tree.py:383: UserWarning: Possible more than 32 neighbours within 50000.0 m for some data points (neighbours, radius_of_influence)) result.shape: (640000, 1)

pnuu commented 6 years ago

Oh, yes, sorry, forgot to take a look at the resample_bilinear() bit! I'll have to take a closer look tomorrow at work, maybe I can remember why it's like that :-)

pnuu commented 6 years ago

It seems that I've just forgot to add reshape. I'll add that and create a PR.

mraspaud commented 6 years ago

@bmurch I just released 1.7.0, so you should be able to pip install it.

Best regards Martin

On Thu, 12 Oct 2017 at 18:04 bmurch notifications@github.com wrote:

@pnuu https://github.com/pnuu I'm new to python, but since I don't have control over the python install on the University's cluster, how long does it take for this roll into something that pip can get with a "pip install pyresample --update"?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/pytroll/pyresample/issues/72#issuecomment-336184509, or mute the thread https://github.com/notifications/unsubscribe-auth/AAKPeh4wsesx6KlMvdjMLIzx2njcy3KMks5srjhzgaJpZM4PmMiK .

-- Martin Raspaud, PhD Software engineer SMHI, SE-60176