Closed: avalentino closed this issue 1 year ago.
There also seems to be another issue linked to a 32-bit architecture (mipsel this time). The problem appears to be linked to the _legacy_dask_ewa module.
=================================== FAILURES ===================================
_ TestDaskEWAResampler.test_xarray_basic_ewa[100-True-float64-input_shape1-input_dims1-LegacyDaskEWAResampler-pyresample.ewa._legacy_dask_ewa] _
self = <pyresample.test.test_dask_ewa.TestDaskEWAResampler object at 0x6dd1c418>
resampler_class = <class 'pyresample.ewa._legacy_dask_ewa.LegacyDaskEWAResampler'>
resampler_mod = <module 'pyresample.ewa._legacy_dask_ewa' from '/<<PKGBUILDDIR>>/.pybuild/cpython3_3.10_pyresample/build/pyresample/ewa/_legacy_dask_ewa.py'>
input_shape = (3, 100, 50), input_dims = ('bands', 'y', 'x')
input_dtype = <class 'numpy.float64'>, maximum_weight_mode = True
rows_per_scan = 100
@pytest.mark.parametrize(
    ('resampler_class', 'resampler_mod'),
    [
        (DaskEWAResampler, dask_ewa),
        (LegacyDaskEWAResampler, legacy_dask_ewa),
    ])
@pytest.mark.parametrize(
    ('input_shape', 'input_dims'),
    [
        ((100, 50), ('y', 'x')),
        ((3, 100, 50), ('bands', 'y', 'x')),
    ]
)
@pytest.mark.parametrize('input_dtype', [np.float32, np.float64, np.int8])
@pytest.mark.parametrize('maximum_weight_mode', [False, True])
@pytest.mark.parametrize('rows_per_scan', [10, 0, 100])
def test_xarray_basic_ewa(self, resampler_class, resampler_mod,
                          input_shape, input_dims, input_dtype,
                          maximum_weight_mode, rows_per_scan):
    """Test EWA with basic xarray DataArrays."""
    is_legacy = resampler_class is LegacyDaskEWAResampler
    is_int = np.issubdtype(input_dtype, np.integer)
    if is_legacy and is_int:
        pytest.skip("Legacy dask resampler does not properly support "
                    "integer inputs.")
    if is_legacy and rows_per_scan == 0:
        pytest.skip("Legacy dask resampler does not support rows_per_scan "
                    "of 0.")
    output_shape = (200, 100)
    if len(input_shape) == 3:
        output_shape = (input_shape[0], output_shape[0], output_shape[1])
    swath_data, source_swath, target_area = get_test_data(
        input_shape=input_shape, output_shape=output_shape[-2:],
        input_dims=input_dims, input_dtype=input_dtype,
    )
    num_chunks = _get_num_chunks(source_swath, resampler_class, rows_per_scan)
    with mock.patch.object(resampler_mod, 'll2cr', wraps=resampler_mod.ll2cr) as ll2cr, \
            mock.patch.object(source_swath, 'get_lonlats', wraps=source_swath.get_lonlats) as get_lonlats:
        resampler = resampler_class(source_swath, target_area)
        new_data = resampler.resample(swath_data, rows_per_scan=rows_per_scan,
                                      weight_delta_max=40,
                                      maximum_weight_mode=maximum_weight_mode)
        _data_attrs_coords_checks(new_data, output_shape, input_dtype, target_area,
                                  'test', 'test')
        # make sure we can actually compute everything
        new_data.compute()
        lonlat_calls = get_lonlats.call_count
        ll2cr_calls = ll2cr.call_count
        # resample a different dataset and make sure cache is used
        swath_data2 = _create_second_test_data(swath_data)
        new_data = resampler.resample(swath_data2, rows_per_scan=rows_per_scan,
                                      weight_delta_max=40,
                                      maximum_weight_mode=maximum_weight_mode)
        _data_attrs_coords_checks(new_data, output_shape, input_dtype, target_area,
                                  'test2', 'test2')
        _coord_and_crs_checks(new_data, target_area,
                              has_bands='bands' in input_dims)
        result = new_data.compute()
        # ll2cr will be called once more because of the computation
>       assert ll2cr.call_count == ll2cr_calls + num_chunks
E AssertionError: assert 100 == (51 + 50)
E + where 100 = <MagicMock name='ll2cr' id='1756641120'>.call_count
pyresample/test/test_dask_ewa.py:231: AssertionError
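The failing assertion relies on `unittest.mock`'s `wraps=` spying pattern: the patched attribute keeps its real behavior while the mock records how many times it was called. A minimal sketch of that pattern (the `Worker` class here is a hypothetical stand-in, not pyresample code):

```python
from unittest import mock


class Worker:
    """Hypothetical stand-in for a resampler helper."""

    def compute(self, x):
        return x * 2


w = Worker()
# wraps= forwards every call to the real method, so behavior is
# unchanged, but the mock still records call_count and call_args.
with mock.patch.object(w, 'compute', wraps=w.compute) as spy:
    results = [w.compute(i) for i in range(3)]

print(results, spy.call_count)  # [0, 2, 4] 3
```

This is why the test can both check results and count how many `ll2cr` chunk computations actually ran.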
So the legacy dask EWA failure is one we've noticed on 64-bit Windows platforms (at least) in our CI. It is on my TODO list to figure that out.
For the other failure... very weird. A single pixel differs, and only by:
E Max absolute difference: 6.839633e-05
E Max relative difference: 0.00032184
Any idea how often it fails versus passes? My guess is it has something to do with the order in which dask computes the individual chunks. The easiest fix would be to increase the threshold for the test, but maybe I'll spend some time figuring out the legacy dask EWA failure first, since that one at least happens occasionally in our CI.
> Any idea how often it fails versus passes?

Well, I would say quite often. The rate of failure was more or less 6/10 with rtol=1e-4, so I ended up disabling test_compare_to_legacy completely. The following run (with test_compare_to_legacy disabled) still failed, on mipsel (32-bit) only, with the error reported in the previous comment.

@avalentino I just merged #482 into main, which should fix the intermittent test_xarray_basic_ewa failures. I'm wondering if a similar fix (setting the dask scheduler to "sync") would clear up the issues for the other test. My other guess is that the order of execution of the sums in the EWA algorithm makes a big enough difference to show up in the tests. In that case we don't have much of a choice and may just have to loosen the comparison threshold.
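The sum-order guess is easy to demonstrate in isolation: float32 addition is not associative, so summing the same values in a different grouping (as a chunked, multi-threaded scheduler may do) can produce slightly different totals. A minimal sketch with synthetic data, unrelated to the EWA code itself:

```python
import numpy as np

rng = np.random.default_rng(0)
vals = rng.random(10_000).astype(np.float32)

# Strict left-to-right accumulation in float32...
seq = np.float32(0.0)
for v in vals:
    seq = np.float32(seq + v)

# ...versus summing two halves first, as a chunked scheduler might.
chunked = np.float32(vals[:5000].sum(dtype=np.float32) +
                     vals[5000:].sum(dtype=np.float32))

# The two totals agree only approximately, not bit-for-bit.
print(seq, chunked, abs(float(seq) - float(chunked)))
```

Discrepancies of this kind are on the same scale as the one-pixel, ~1e-4 relative difference reported above.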
@djhoese I ran the test_xarray_basic_ewa test a few times on i386 and all seems to work properly. But I have to say that I never had problems with that test on i386.
I have also tried to set the dask scheduler to "sync" for the test_compare_to_legacy test, but it does not seem to help.
Not sure that my patch is correct anyway.
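For reference, this is the scheduler-pinning pattern the patch below applies; a minimal sketch with a toy array, not the actual test data:

```python
import dask
import dask.array as da
import numpy as np

arr = da.from_array(np.arange(8, dtype=np.float32), chunks=4)

# The "sync" (synchronous) scheduler computes every chunk in the
# calling thread, in a deterministic order, which removes
# thread-scheduling nondeterminism from the comparison.
with dask.config.set(scheduler='sync'):
    total = float(arr.sum().compute())

print(total)  # 28.0
```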
diff --git a/pyresample/test/test_dask_ewa.py b/pyresample/test/test_dask_ewa.py
index cfb96a7..0ac8286 100644
--- a/pyresample/test/test_dask_ewa.py
+++ b/pyresample/test/test_dask_ewa.py
@@ -333,16 +333,16 @@ class TestDaskEWAResampler:
             input_dims=input_dims,
         )
         swath_data.data = swath_data.data.astype(np.float32)
+        with dask.config.set(scheduler='sync'):
+            resampler = DaskEWAResampler(source_swath, target_area)
+            new_data = resampler.resample(swath_data, rows_per_scan=10,
+                                          maximum_weight_mode=maximum_weight_mode)
+            new_arr = new_data.compute()
-        resampler = DaskEWAResampler(source_swath, target_area)
-        new_data = resampler.resample(swath_data, rows_per_scan=10,
-                                      maximum_weight_mode=maximum_weight_mode)
-        new_arr = new_data.compute()
-
-        legacy_resampler = LegacyDaskEWAResampler(source_swath, target_area)
-        legacy_data = legacy_resampler.resample(swath_data, rows_per_scan=10,
-                                                maximum_weight_mode=maximum_weight_mode)
-        legacy_arr = legacy_data.compute()
+            legacy_resampler = LegacyDaskEWAResampler(source_swath, target_area)
+            legacy_data = legacy_resampler.resample(swath_data, rows_per_scan=10,
+                                                    maximum_weight_mode=maximum_weight_mode)
+            legacy_arr = legacy_data.compute()
         np.testing.assert_allclose(new_arr, legacy_arr)
Yeah, that's basically how I would have done it. A change in the result by such a small amount could be caused by anything from numpy to dask to the specific processor the code is run on. Especially since it is one pixel, I'm not sure it is worth trying to narrow it down rather than changing the threshold/tolerance on the comparison.
OK for me to change the tolerance.
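Loosening the comparison is a one-argument change to the `assert_allclose` call; a sketch with made-up values of the same magnitude as the reported one-pixel difference (rtol=1e-4 is the value mentioned above):

```python
import numpy as np

new_arr = np.array([1.0, 2.0, 3.0000684], dtype=np.float32)
legacy_arr = np.array([1.0, 2.0, 3.0], dtype=np.float32)

# The default rtol=1e-7 rejects the ~2e-5 relative difference in the
# last element; rtol=1e-4 tolerates single-pixel noise of that size.
np.testing.assert_allclose(new_arr, legacy_arr, rtol=1e-4)
```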
I think this was solved in #482; please say so if it wasn't. Closing for now.
Code Sample, a minimal, complete, and verifiable piece of code
Problem description
On 32-bit architectures I'm experiencing intermittent failures of the test
pyresample.test.test_dask_ewa.TestDaskEWAResampler.test_compare_to_legacy
Expected Output
All unit tests pass.
Actual Result, Traceback if applicable
Versions of Python, package at hand and relevant dependencies
Python 3.10, PyResample 1.26.0