pytroll / pyresample

Geospatial image resampling in Python
http://pyresample.readthedocs.org
GNU Lesser General Public License v3.0
346 stars 94 forks source link

Fix data type handling (complex) in nearest neighbor resampling #498

Closed comma-never-coma closed 1 year ago

comma-never-coma commented 1 year ago

There is something wrong while geocoding a wrapped interferograms in complex type due to ComplexWarning. The change would fix it as below. https://github.com/insarlab/MintPy/issues/955#issuecomment-1423716672

mraspaud commented 1 year ago

Thanks for the contribution! Could you just add a small test checks that the function now work as you expect? Otherwise, it looks good to me.

comma-never-coma commented 1 year ago

Thanks for the contribution! Could you just add a small test checks that the function now work as you expect? Otherwise, it looks good to me.

I'm sorry as I am a newbie in Git, and I just do not clearly know how to add a test here. A tested dataset is mentioned in https://github.com/insarlab/MintPy/issues/955#issuecomment-1423716672.

djhoese commented 1 year ago

@comma-never-coma Could you make a small minimal reproducible example that only uses pyresample and numpy to reproduce the issue? You could create some small input data using np.random or something. How could we (the maintainers or any other user) know that the results are wrong? Is it just the warning you're seeing or is the result actually wrong?

If you can give us that then we can work on adding a unit test to https://github.com/pytroll/pyresample/blob/main/pyresample/test/test_kd_tree.py

comma-never-coma commented 1 year ago

Thanks for your guidance @djhoese. Here I made a test according to a random complex array and line 823,824 in kd_tree.py as below to reproduce my issue.

full_result = np.ones(output_raw_shape) * fill_value full_result[valid_output_index] = result

First I got the required input parameters for line 823,824 in kd_tree.py: result, output_raw_shape, valid_output_index. Here is the code to simulate a data set for testing:

import numpy as np
result = np.random.rand(5) + complex('j') * np.random.rand(5)
output_raw_shape = 5
fill_value = np.nan
valid_output_index = np.array([True, True, True, True, True], dtype=bool)

And the input ndarray result is:

[0.73348926+0.59648065j 0.01834474+0.27493573j 0.12835583+0.89987413j 0.64661849+0.87928246j 0.19402405+0.44471249j]

Then run the preceding commend lines to assign full_result, the output would prompt a ComplexWarning like this:

<input>:2: ComplexWarning: Casting complex values to real discards the imaginary part

And the output ndarray full_result is actually real:

[0.73348926 0.01834474 0.12835583 0.64661849 0.19402405]

We can find that the complex ndarray result lost its imaginary part after the evaluation full_result[valid_output_index] = result. I guess this is because of not considering of the situation to resample a complex image, for example, the wrapped interferogram. To solve this problem, I changed the first commend line to create a complex array:

full_result = np.ones(output_raw_shape, dtype=np.complex64) * fill_value full_result[valid_output_index] = result

In such a condition, I can obtain a complex output full_result as the same as the input result without a ComplexWarning:

[0.7334893 +0.59648067j 0.01834474+0.27493572j 0.12835583+0.89987415j 0.6466185 +0.8792825j 0.19402406+0.4447125j ]

In the end, with the help of @yunjunz , I changed line 823 in kd_tree.py to:

full_result = np.ones(output_raw_shape, dtype=data.dtype) * fill_value

djhoese commented 1 year ago

Thanks. So we could make a test with some complex values and that should be good enough. We'll need to rework what you've shown so it calls the public interfaces like resample_nearest or whatever, but this shouldn't be too bad. I have some meetings and other tasks I need to get done today, but I'll try to have a look if you don't beat me to it.

https://github.com/pytroll/pyresample/blob/b8ecc7104cf132f10dee5438eb3049f526199013/pyresample/kd_tree.py#L61

djhoese commented 1 year ago

@comma-never-coma Ok I added a simple test for complex data types, but while I was doing it I noticed there was another way to solve this problem so I changed the line of code you updated. I noticed in my tests that when we get to the line being changed in this PR that the result array was of the proper data type, the final_result was being created incorrectly, and then later on line conserve_input_data_type it was getting re-cast to the complex data type. I realized the whole point of this if statement with the recasting of the final result when it is False is to handle the cases where we are doing weighted resampling and we may not be able to retain the exact same type as the input (weighted resampling of integers becoming floats, etc). As such, I think it is better to create the final result (filled) array as the same type as the result and let this if statement later on decide if it should be cast to the input data type.

Let me know what you think. @mraspaud what are your thoughts on this PR?

codecov[bot] commented 1 year ago

Codecov Report

Merging #498 (a13a178) into main (b8ecc71) will increase coverage by 0.00%. The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main     #498   +/-   ##
=======================================
  Coverage   94.33%   94.34%           
=======================================
  Files          74       74           
  Lines       12947    12957   +10     
=======================================
+ Hits        12214    12224   +10     
  Misses        733      733           
Flag Coverage Δ
unittests 94.34% <100.00%> (+<0.01%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pyresample/kd_tree.py 92.02% <100.00%> (ø)
pyresample/test/test_kd_tree.py 97.35% <100.00%> (+0.04%) :arrow_up:

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

coveralls commented 1 year ago

Coverage Status

Coverage: 93.867% (+0.005%) from 93.862% when pulling a13a1780476fcb695abbde37845ee295817aede8 on comma-never-coma:main into b8ecc7104cf132f10dee5438eb3049f526199013 on pytroll:main.

comma-never-coma commented 1 year ago

@djhoese It sounds a more reliable solution. I would be grateful if you could offer help to modify this PR or pull a new PR to solve this problem.

djhoese commented 1 year ago

I would be grateful if you could offer help to modify this PR or pull a new PR to solve this problem.

I'm not sure what you mean. I did push a commit to your branch with the changes/tests I suggested. This PR should be good to go with a review from @mraspaud and/or @pnuu.

comma-never-coma commented 1 year ago

I would be grateful if you could offer help to modify this PR or pull a new PR to solve this problem.

I'm not sure what you mean. I did push a commit to your branch with the changes/tests I suggested. This PR should be good to go with a review from @mraspaud and/or @pnuu.

I'm sorry that I am quite alien about the PR workflow, and I just found your modification in the commit. Thanks a million, I really appreciate it.