Open raspstephan opened 4 years ago
Thanks for bringing this up. PRs are welcome!
Is there any reason not to use keep_attrs=True?
The only reason is that xr.apply_ufunc defaults to keep_attrs=False and I just keep the defaults. To be consistent with xr.apply_ufunc, I would suggest an optional keep_attrs kwarg that defaults to False, and you can set it to True if needed.
regridder(indata) # doesn't keep attributes
regridder(indata, keep_attrs=True) # keeps attributes
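The pass-through idea can be sketched roughly as follows. This is only an illustration: the regrid function and the doubling lambda are hypothetical placeholders, not xESMF's actual code, and the only real API used is the keep_attrs argument of xr.apply_ufunc.

```python
import numpy as np
import xarray as xr


def regrid(indata, keep_attrs=False):
    """Hypothetical wrapper; the doubling stands in for the real regridding."""
    return xr.apply_ufunc(
        lambda values: values * 2.0,  # placeholder computation
        indata,
        keep_attrs=keep_attrs,  # passed straight through to xarray
    )


indata = xr.DataArray(
    np.array([1.0, 2.0, 3.0]), dims="x", attrs={"units": "K"}
)

plain = regrid(indata)                    # attributes are dropped
tagged = regrid(indata, keep_attrs=True)  # attributes are copied over
```

Defaulting to False keeps the wrapper in line with the behaviour described above, so existing callers see no change unless they opt in.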
Regarding the data type, that's because ESMF stores regridding weights in float64. In numpy, float32 * float64 gives float64. Changing output_dtypes won't actually help in this case. Consider this example:
import numpy as np
import xarray as xr
a = np.array([1, 2, 3], dtype=np.float64)
x = np.array([1, 2, 3], dtype=np.float32)
out = a * x
out.dtype # float64
out2 = xr.apply_ufunc(lambda x: a * x, x, output_dtypes=[np.float32])
out2.dtype # still float64
You can cast regridder.weights to np.float32 using scipy.sparse.coo_matrix.astype(). This is actually also useful for nearest neighbor methods, where the weights are just 1.0 and can be cast to integers for regridding categorical variables.
Would it be useful to have a method to set the weights dtype in the Regridder class? It would be just one line:
def set_dtype(self, dtype):
    self.weights = self.weights.astype(dtype)
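To see what such a cast would change, here is a minimal sketch. MiniRegridder, its apply method, and the toy weights are assumptions made up for this example; it is not xESMF's Regridder, it only mimics the idea of sparse weights applied as a matrix product.

```python
import numpy as np
import scipy.sparse as sps


class MiniRegridder:
    """Hypothetical stand-in for a regridder; not xESMF's actual class."""

    def __init__(self, weights):
        # Sparse weights of shape (n_out, n_in), stored as float64.
        self.weights = sps.coo_matrix(weights, dtype=np.float64)

    def set_dtype(self, dtype):
        # Cast the stored weights so later products use this dtype.
        self.weights = self.weights.astype(dtype)

    def apply(self, data):
        # Flattened regridding step: out = W @ data.
        return self.weights.dot(data)


# Toy nearest-neighbor weights: every output cell copies one input cell.
nearest = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
regridder = MiniRegridder(nearest)

field = np.array([280.0, 285.0, 290.0], dtype=np.float32)
out64 = regridder.apply(field)  # float64 weights promote the result

regridder.set_dtype(np.float32)
out32 = regridder.apply(field)  # float32 weights keep float32 data

regridder.set_dtype(np.int32)
classes = np.array([3, 7, 9], dtype=np.int32)
out_int = regridder.apply(classes)  # integer weights keep labels integral
```

The integer cast only makes sense for nearest neighbor weights like these, where every stored value is exactly 1.0; fractional weights would be truncated.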
I created a pull request to implement keep_attrs. The datatype is not such a big issue for me, since it's just as easy to convert the data afterwards.
Currently, the regridding seems to delete the attributes of the original dataset. I assume this happens during xr.apply_ufunc. Is there any reason not to use keep_attrs=True?
Similarly, all data is converted to 64 bit floats, even if the input data is 32 bit. Would it be reasonable to use output_dtypes=[dr_in.dtype] instead of output_dtypes=[float]?
I am happy to create a pull request for this if nothing speaks against these changes.