Closed JSAnandEOS closed 5 years ago
Do you mean that the input data (on the source grid) are all NaNs over the ocean region? In that case, the output data will also be NaNs over the ocean, by default. You don't need to apply additional masking. In many cases, "masking" just means "setting NaN to zeros" (https://github.com/JiaweiZhuang/xESMF/issues/22#issuecomment-402320570), which might not be what you actually want.
If your input data do not even cover the ocean region (i.e. a regional grid only over land), but the output grid is global, then the undefined ocean region will have zeros instead of NaNs, by default. To flip this behavior, see https://github.com/JiaweiZhuang/xESMF/issues/15#issuecomment-371646763.
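To see why uncovered destination cells come out as zero, here is a small dependency-free sketch. It assumes the regridding step is a sparse matrix-vector product, with the weights stored as (row, col, weight) triplets. The helper names `apply_weights` and `empty_rows` are hypothetical and not part of xESMF.

```python
def apply_weights(triplets, n_out, src):
    """Apply sparse regridding weights: out[row] += weight * src[col]."""
    out = [0.0] * n_out
    for row, col, w in triplets:
        out[row] += w * src[col]
    return out


def empty_rows(triplets, n_out):
    """Return destination indices that receive no weight at all."""
    covered = {row for row, _, _ in triplets}
    return [i for i in range(n_out) if i not in covered]


# Toy setup: 3 destination cells, 2 source cells.
# Destination cell 2 lies outside the source grid, so it has no weights.
weights = [(0, 0, 1.0), (1, 0, 0.5), (1, 1, 0.5)]
source = [0.0, 4.0]  # source[0] is a genuine zero (e.g. desert)

result = apply_weights(weights, 3, source)
uncovered = empty_rows(weights, 3)
print(result)     # [0.0, 2.0, 0.0]
print(uncovered)  # [2]
```

Cells 0 and 2 both end up as 0.0, but only cell 2 is undefined. The list of empty rows is what tells them apart.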
> Do you mean that the input data (on the source grid) are all NaNs over the ocean region?
In addition to the ocean, there are also certain areas where for whatever reason (say, cloud cover) the data is invalid, so these regions have to be removed from the gridding as well. I have currently set these to NaNs as well. These are different to areas where the data is zero (e.g. deserts), because these values are still valid.
> You don't need to apply additional masking. In many cases, "masking" just means "setting NaN to zeros" (#22 (comment)), which might not be what you actually want.
I had originally wanted to use conservative gridding with NaNs and zero values, but I encountered the same problem as #22, where large sections of coastal regions were missing in the final gridded dataset, despite having non-zero input data near those regions. The discussion about "conservative_normed" suggested that I needed to do both masking and setting unwanted areas to NaNs in order to deal with both coastal regions and areas with invalid data.
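The normalization idea discussed in #22 can be sketched without any library. This illustrates the general technique as I understand it (divide the weighted sum by the summed weight of the valid source cells); it is not the code in the "masking" branch, and `regrid_normed` is a hypothetical name.

```python
import math


def regrid_normed(triplets, n_out, src):
    """Conservative average over valid (non-NaN) source cells only."""
    num = [0.0] * n_out   # weighted sum of valid source values
    den = [0.0] * n_out   # summed weight of valid source cells
    for row, col, w in triplets:
        if not math.isnan(src[col]):
            num[row] += w * src[col]
            den[row] += w
    # Cells with no valid contribution become NaN rather than 0.
    return [n / d if d > 0 else math.nan for n, d in zip(num, den)]


# Row 0: coastal destination cell, half land and half ocean.
# Row 1: inland desert cell. Row 2: open-ocean cell.
weights = [(0, 0, 0.5), (0, 1, 0.5), (1, 2, 1.0), (2, 3, 1.0)]
source = [6.0, math.nan, 0.0, math.nan]  # NaN marks invalid pixels

out = regrid_normed(weights, 3, source)
print(out)  # [6.0, 0.0, nan]
```

Without the division, the coastal cell would come out as 3.0 if NaNs were zero-filled, or as NaN if they were propagated. The latter would explain coastal regions going missing.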
If I understand correctly, then you need to
Does this produce what you expected?
If I understand you correctly, the regridding should be done like so:
import numpy as np
import scipy.sparse
import xesmf as xe


def add_matrix_NaNs(regridder):
    """Mark destination cells that receive no source weights as NaN."""
    X = regridder.A
    M = scipy.sparse.csr_matrix(X)
    # Number of stored weights in each row (one row per destination cell)
    num_nonzeros = np.diff(M.indptr)
    # A NaN weight makes the output NaN for rows with no contributions
    M[num_nonzeros == 0, 0] = np.nan
    regridder.A = scipy.sparse.coo_matrix(M)
    return regridder


def regrid(ds_in, ds_out, dr_in, method='conservative_normed'):
    regridder = xe.Regridder(ds_in, ds_out, method, periodic=True, reuse_weights=False)
    regridder = add_matrix_NaNs(regridder)
    dr_out = regridder(dr_in)
    regridder.clean_weight_file()
    return dr_out
Is this correct?
Yes, this should mark undefined regions as NaNs while keeping real zeros untouched. However, it is a very niche edge case, so I am not entirely sure that it is correct. Let me know if it works.
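As a quick sanity check of the reasoning, the following dependency-free sketch mimics what `add_matrix_NaNs` does. It assumes the regridder output is a plain sparse matrix-vector product. Triplets stand in for the sparse matrix, and `add_nan_rows` is a hypothetical name.

```python
import math


def add_nan_rows(triplets, n_out):
    """Give each empty row a single NaN weight in column 0."""
    covered = {row for row, _, _ in triplets}
    extra = [(i, 0, math.nan) for i in range(n_out) if i not in covered]
    return triplets + extra


def apply_weights(triplets, n_out, src):
    """Sparse matrix-vector product over (row, col, weight) triplets."""
    out = [0.0] * n_out
    for row, col, w in triplets:
        out[row] += w * src[col]  # a NaN weight makes the whole row NaN
    return out


weights = [(0, 0, 1.0), (1, 1, 1.0)]  # row 2 has no weights
source = [0.0, 5.0]                   # source[0] is a valid zero

patched = add_nan_rows(weights, 3)
out = apply_weights(patched, 3, source)
print(out)  # [0.0, 5.0, nan]
```

The valid zero in row 0 survives, while the row with no contributions becomes NaN even though the NaN weight multiplies a source value of 0.0.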
I apologise for the late reply, but I am pleased to report that this solution works. Thanks!
Great! Just note that 0.2.0 deprecates regridder.A in favor of regridder.weights (https://github.com/JiaweiZhuang/xESMF/commit/792e2288f883713ec206c2c837fd3bd6ed345894).
I'd like to have a simpler option in the main branch for setting different mask-handling behavior, so that users can avoid this ad-hoc fix. But given the subtlety of masking, it probably requires more study. I don't have a clear timeline right now.
So I'm using the "conservative_normed" algorithm provided in the "masking" branch of xESMF to grid some MODIS GPP data to a lower spatial resolution (fine to coarse). Being a land-only product, the ocean pixels are invalid and so need to be masked. After masking and running xESMF on the data these regions now appear as zeroes, as expected.
My problem is that valid zero values also exist in the input data over regions with no vegetation (e.g. deserts). Therefore, in the resulting array I can't readily tell which pixels are invalid and which contain real data. How do I get around this? Is there any way to output a mask of which pixels contain real data, as opposed to those where no data was binned at all? Thanks.