pangeo-data / xESMF

Universal Regridder for Geospatial Data
http://xesmf.readthedocs.io/
MIT License

error in regridding chla data with xesmf #392

Closed onion5376 closed 4 weeks ago

onion5376 commented 4 weeks ago

I installed xESMF in a new, clean environment on CentOS 9, following the installation documentation (https://xesmf.readthedocs.io/en/latest/installation.html). The data used in the script is attached (chla201601.zip). The script:

import numpy as np
import xarray as xr
import xesmf as xe
import matplotlib.pyplot as plt
import cartopy.crs as ccrs

ds = xr.open_dataset("chla201601.nc")
ds_out = xr.Dataset({
    "lat":(["lat"],np.arange(2+0.08333333/2,25.5,0.08333333),{"units":"degrees_north"}),
    "lon":(["lon"],np.arange(105.5+0.08333333/2,120.5,0.08333333),{"units":"degrees_east"}),
})
Regrd = xe.Regridder(ds, ds_out, "conservative")
dr_out = Regrd(ds)

The error is as follows:

> ---------------------------------------------------------------------------
> RuntimeError                              Traceback (most recent call last)
> Cell In[15], line 1
> ----> 1 dr_out = Regrd(ds)
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xesmf/frontend.py:548, in BaseRegridder.__call__(self, indata, keep_attrs, skipna, na_thres, output_chunks)
>     540     return self.regrid_dataarray(
>     541         indata,
>     542         keep_attrs=keep_attrs,
>    (...)
>     545         output_chunks=output_chunks,
>     546     )
>     547 elif isinstance(indata, xr.Dataset):
> --> 548     return self.regrid_dataset(
>     549         indata,
>     550         keep_attrs=keep_attrs,
>     551         skipna=skipna,
>     552         na_thres=na_thres,
>     553         output_chunks=output_chunks,
>     554     )
>     555 else:
>     556     raise TypeError('input must be numpy array, dask array, xarray DataArray or Dataset!')
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xesmf/frontend.py:687, in BaseRegridder.regrid_dataset(self, ds_in, keep_attrs, skipna, na_thres, output_chunks)
>     680 non_regriddable = [
>     681     name
>     682     for name, data in ds_in.data_vars.items()
>     683     if not set(input_horiz_dims).issubset(data.dims)
>     684 ]
>     685 ds_in = ds_in.drop_vars(non_regriddable)
> --> 687 ds_out = xr.apply_ufunc(
>     688     self.regrid_array,
>     689     ds_in,
>     690     self.weights,
>     691     kwargs=kwargs,
>     692     input_core_dims=[input_horiz_dims, ('out_dim', 'in_dim')],
>     693     output_core_dims=[temp_horiz_dims],
>     694     dask='allowed',
>     695     keep_attrs=keep_attrs,
>     696 )
>     698 return self._format_xroutput(ds_out, temp_horiz_dims)
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/computation.py:1265, in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, meta, dask_gufunc_kwargs, on_missing_core_dim, *args)
>    1263 # feed datasets apply_variable_ufunc through apply_dataset_vfunc
>    1264 elif any(is_dict_like(a) for a in args):
> -> 1265     return apply_dataset_vfunc(
>    1266         variables_vfunc,
>    1267         *args,
>    1268         signature=signature,
>    1269         join=join,
>    1270         exclude_dims=exclude_dims,
>    1271         dataset_join=dataset_join,
>    1272         fill_value=dataset_fill_value,
>    1273         keep_attrs=keep_attrs,
>    1274         on_missing_core_dim=on_missing_core_dim,
>    1275     )
>    1276 # feed DataArray apply_variable_ufunc through apply_dataarray_vfunc
>    1277 elif any(isinstance(a, DataArray) for a in args):
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/computation.py:536, in apply_dataset_vfunc(func, signature, join, dataset_join, fill_value, exclude_dims, keep_attrs, on_missing_core_dim, *args)
>     531 list_of_coords, list_of_indexes = build_output_coords_and_indexes(
>     532     args, signature, exclude_dims, combine_attrs=keep_attrs
>     533 )
>     534 args = tuple(getattr(arg, "data_vars", arg) for arg in args)
> --> 536 result_vars = apply_dict_of_variables_vfunc(
>     537     func,
>     538     *args,
>     539     signature=signature,
>     540     join=dataset_join,
>     541     fill_value=fill_value,
>     542     on_missing_core_dim=on_missing_core_dim,
>     543 )
>     545 out: Dataset | tuple[Dataset, ...]
>     546 if signature.num_outputs > 1:
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/computation.py:460, in apply_dict_of_variables_vfunc(func, signature, join, fill_value, on_missing_core_dim, *args)
>     458 core_dim_present = _check_core_dims(signature, variable_args, name)
>     459 if core_dim_present is True:
> --> 460     result_vars[name] = func(*variable_args)
>     461 else:
>     462     if on_missing_core_dim == "raise":
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/computation.py:742, in apply_variable_ufunc(func, signature, exclude_dims, dask, output_dtypes, vectorize, keep_attrs, dask_gufunc_kwargs, *args)
>     735 broadcast_dims = tuple(
>     736     dim for dim in dim_sizes if dim not in signature.all_core_dims
>     737 )
>     738 output_dims = [broadcast_dims + out for out in signature.output_core_dims]
>     740 input_data = [
>     741     (
> --> 742         broadcast_compat_data(arg, broadcast_dims, core_dims)
>     743         if isinstance(arg, Variable)
>     744         else arg
>     745     )
>     746     for arg, core_dims in zip(args, signature.input_core_dims, strict=True)
>     747 ]
>     749 if any(is_chunked_array(array) for array in input_data):
>     750     if dask == "forbidden":
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/computation.py:663, in broadcast_compat_data(variable, broadcast_dims, core_dims)
>     658 def broadcast_compat_data(
>     659     variable: Variable,
>     660     broadcast_dims: tuple[Hashable, ...],
>     661     core_dims: tuple[Hashable, ...],
>     662 ) -> Any:
> --> 663     data = variable.data
>     665     old_dims = variable.dims
>     666     new_dims = broadcast_dims + core_dims
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/variable.py:451, in Variable.data(self)
>     449     return self._data
>     450 elif isinstance(self._data, indexing.ExplicitlyIndexed):
> --> 451     return self._data.get_duck_array()
>     452 else:
>     453     return self.values
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/indexing.py:837, in MemoryCachedArray.get_duck_array(self)
>     836 def get_duck_array(self):
> --> 837     self._ensure_cached()
>     838     return self.array.get_duck_array()
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/indexing.py:831, in MemoryCachedArray._ensure_cached(self)
>     830 def _ensure_cached(self):
> --> 831     self.array = as_indexable(self.array.get_duck_array())
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/indexing.py:788, in CopyOnWriteArray.get_duck_array(self)
>     787 def get_duck_array(self):
> --> 788     return self.array.get_duck_array()
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/indexing.py:651, in LazilyIndexedArray.get_duck_array(self)
>     647     array = apply_indexer(self.array, self.key)
>     648 else:
>     649     # If the array is not an ExplicitlyIndexedNDArrayMixin,
>     650     # it may wrap a BackendArray so use its __getitem__
> --> 651     array = self.array[self.key]
>     653 # self.array[self.key] is now a numpy array when
>     654 # self.array is a BackendArray subclass
>     655 # and self.key is BasicIndexer((slice(None, None, None),))
>     656 # so we need the explicit check for ExplicitlyIndexed
>     657 if isinstance(array, ExplicitlyIndexed):
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/backends/netCDF4_.py:100, in NetCDF4ArrayWrapper.__getitem__(self, key)
>      99 def __getitem__(self, key):
> --> 100     return indexing.explicit_indexing_adapter(
>     101         key, self.shape, indexing.IndexingSupport.OUTER, self._getitem
>     102     )
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/indexing.py:1015, in explicit_indexing_adapter(key, shape, indexing_support, raw_indexing_method)
>     993 """Support explicit indexing by delegating to a raw indexing method.
>     994 
>     995 Outer and/or vectorized indexers are supported by indexing a second time
>    (...)
>    1012 Indexing result, in the form of a duck numpy-array.
>    1013 """
>    1014 raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support)
> -> 1015 result = raw_indexing_method(raw_key.tuple)
>    1016 if numpy_indices.tuple:
>    1017     # index the loaded np.ndarray
>    1018     indexable = NumpyIndexingAdapter(result)
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/backends/netCDF4_.py:113, in NetCDF4ArrayWrapper._getitem(self, key)
>     111     with self.datastore.lock:
>     112         original_array = self.get_array(needs_lock=False)
> --> 113         array = getitem(original_array, key)
>     114 except IndexError:
>     115     # Catch IndexError in netCDF4 and return a more informative
>     116     # error message.  This is most often called when an unsorted
>     117     # indexer is used before the data is loaded from disk.
>     118     msg = (
>     119         "The indexing operation you are attempting to perform "
>     120         "is not valid on netCDF4.Variable object. Try loading "
>     121         "your data into memory first by calling .load()."
>     122     )
> 
> File src/netCDF4/_netCDF4.pyx:4981, in netCDF4._netCDF4.Variable.__getitem__()
> 
> File src/netCDF4/_netCDF4.pyx:5953, in netCDF4._netCDF4.Variable._get()
> 
> File src/netCDF4/_netCDF4.pyx:2113, in netCDF4._netCDF4._ensure_nc_success()
> 
> RuntimeError: NetCDF: HDF error

chla201601.zip

aulemahal commented 4 weeks ago

Hi @onion5376 !

Sadly, I was not able to reproduce the bug. Your code runs ok on my machine. I have:

xesmf    0.8.7
xarray   2023.8.0
numpy    1.24.4
netCDF4  1.6.4
h5netcdf 1.2.0

This is a wild guess, but if you can't update your environment, you could try opening the netCDF file with another backend to see if the problem persists:

ds = xr.open_dataset("chla201601.nc", engine='h5netcdf')

onion5376 commented 4 weeks ago

Hi @aulemahal. (1) The error above occurs when executing the last line of the script (dr_out = Regrd(ds)). Following your suggestion, I opened the file with ds = xr.open_dataset("chla201601.nc", engine='h5netcdf'), and the same line now raises a different error:

> ---------------------------------------------------------------------------
> OSError                                   Traceback (most recent call last)
> Cell In[7], line 1
> ----> 1 dr_out = Regrd(ds)
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xesmf/frontend.py:548, in BaseRegridder.__call__(self, indata, keep_attrs, skipna, na_thres, output_chunks)
>     540     return self.regrid_dataarray(
>     541         indata,
>     542         keep_attrs=keep_attrs,
>    (...)
>     545         output_chunks=output_chunks,
>     546     )
>     547 elif isinstance(indata, xr.Dataset):
> --> 548     return self.regrid_dataset(
>     549         indata,
>     550         keep_attrs=keep_attrs,
>     551         skipna=skipna,
>     552         na_thres=na_thres,
>     553         output_chunks=output_chunks,
>     554     )
>     555 else:
>     556     raise TypeError('input must be numpy array, dask array, xarray DataArray or Dataset!')
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xesmf/frontend.py:687, in BaseRegridder.regrid_dataset(self, ds_in, keep_attrs, skipna, na_thres, output_chunks)
>     680 non_regriddable = [
>     681     name
>     682     for name, data in ds_in.data_vars.items()
>     683     if not set(input_horiz_dims).issubset(data.dims)
>     684 ]
>     685 ds_in = ds_in.drop_vars(non_regriddable)
> --> 687 ds_out = xr.apply_ufunc(
>     688     self.regrid_array,
>     689     ds_in,
>     690     self.weights,
>     691     kwargs=kwargs,
>     692     input_core_dims=[input_horiz_dims, ('out_dim', 'in_dim')],
>     693     output_core_dims=[temp_horiz_dims],
>     694     dask='allowed',
>     695     keep_attrs=keep_attrs,
>     696 )
>     698 return self._format_xroutput(ds_out, temp_horiz_dims)
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/computation.py:1265, in apply_ufunc(func, input_core_dims, output_core_dims, exclude_dims, vectorize, join, dataset_join, dataset_fill_value, keep_attrs, kwargs, dask, output_dtypes, output_sizes, meta, dask_gufunc_kwargs, on_missing_core_dim, *args)
>    1263 # feed datasets apply_variable_ufunc through apply_dataset_vfunc
>    1264 elif any(is_dict_like(a) for a in args):
> -> 1265     return apply_dataset_vfunc(
>    1266         variables_vfunc,
>    1267         *args,
>    1268         signature=signature,
>    1269         join=join,
>    1270         exclude_dims=exclude_dims,
>    1271         dataset_join=dataset_join,
>    1272         fill_value=dataset_fill_value,
>    1273         keep_attrs=keep_attrs,
>    1274         on_missing_core_dim=on_missing_core_dim,
>    1275     )
>    1276 # feed DataArray apply_variable_ufunc through apply_dataarray_vfunc
>    1277 elif any(isinstance(a, DataArray) for a in args):
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/computation.py:536, in apply_dataset_vfunc(func, signature, join, dataset_join, fill_value, exclude_dims, keep_attrs, on_missing_core_dim, *args)
>     531 list_of_coords, list_of_indexes = build_output_coords_and_indexes(
>     532     args, signature, exclude_dims, combine_attrs=keep_attrs
>     533 )
>     534 args = tuple(getattr(arg, "data_vars", arg) for arg in args)
> --> 536 result_vars = apply_dict_of_variables_vfunc(
>     537     func,
>     538     *args,
>     539     signature=signature,
>     540     join=dataset_join,
>     541     fill_value=fill_value,
>     542     on_missing_core_dim=on_missing_core_dim,
>     543 )
>     545 out: Dataset | tuple[Dataset, ...]
>     546 if signature.num_outputs > 1:
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/computation.py:460, in apply_dict_of_variables_vfunc(func, signature, join, fill_value, on_missing_core_dim, *args)
>     458 core_dim_present = _check_core_dims(signature, variable_args, name)
>     459 if core_dim_present is True:
> --> 460     result_vars[name] = func(*variable_args)
>     461 else:
>     462     if on_missing_core_dim == "raise":
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/computation.py:742, in apply_variable_ufunc(func, signature, exclude_dims, dask, output_dtypes, vectorize, keep_attrs, dask_gufunc_kwargs, *args)
>     735 broadcast_dims = tuple(
>     736     dim for dim in dim_sizes if dim not in signature.all_core_dims
>     737 )
>     738 output_dims = [broadcast_dims + out for out in signature.output_core_dims]
>     740 input_data = [
>     741     (
> --> 742         broadcast_compat_data(arg, broadcast_dims, core_dims)
>     743         if isinstance(arg, Variable)
>     744         else arg
>     745     )
>     746     for arg, core_dims in zip(args, signature.input_core_dims, strict=True)
>     747 ]
>     749 if any(is_chunked_array(array) for array in input_data):
>     750     if dask == "forbidden":
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/computation.py:663, in broadcast_compat_data(variable, broadcast_dims, core_dims)
>     658 def broadcast_compat_data(
>     659     variable: Variable,
>     660     broadcast_dims: tuple[Hashable, ...],
>     661     core_dims: tuple[Hashable, ...],
>     662 ) -> Any:
> --> 663     data = variable.data
>     665     old_dims = variable.dims
>     666     new_dims = broadcast_dims + core_dims
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/variable.py:451, in Variable.data(self)
>     449     return self._data
>     450 elif isinstance(self._data, indexing.ExplicitlyIndexed):
> --> 451     return self._data.get_duck_array()
>     452 else:
>     453     return self.values
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/indexing.py:837, in MemoryCachedArray.get_duck_array(self)
>     836 def get_duck_array(self):
> --> 837     self._ensure_cached()
>     838     return self.array.get_duck_array()
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/indexing.py:831, in MemoryCachedArray._ensure_cached(self)
>     830 def _ensure_cached(self):
> --> 831     self.array = as_indexable(self.array.get_duck_array())
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/indexing.py:788, in CopyOnWriteArray.get_duck_array(self)
>     787 def get_duck_array(self):
> --> 788     return self.array.get_duck_array()
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/indexing.py:651, in LazilyIndexedArray.get_duck_array(self)
>     647     array = apply_indexer(self.array, self.key)
>     648 else:
>     649     # If the array is not an ExplicitlyIndexedNDArrayMixin,
>     650     # it may wrap a BackendArray so use its __getitem__
> --> 651     array = self.array[self.key]
>     653 # self.array[self.key] is now a numpy array when
>     654 # self.array is a BackendArray subclass
>     655 # and self.key is BasicIndexer((slice(None, None, None),))
>     656 # so we need the explicit check for ExplicitlyIndexed
>     657 if isinstance(array, ExplicitlyIndexed):
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/backends/h5netcdf_.py:51, in H5NetCDFArrayWrapper.__getitem__(self, key)
>      50 def __getitem__(self, key):
> ---> 51     return indexing.explicit_indexing_adapter(
>      52         key, self.shape, indexing.IndexingSupport.OUTER_1VECTOR, self._getitem
>      53     )
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/indexing.py:1015, in explicit_indexing_adapter(key, shape, indexing_support, raw_indexing_method)
>     993 """Support explicit indexing by delegating to a raw indexing method.
>     994 
>     995 Outer and/or vectorized indexers are supported by indexing a second time
>    (...)
>    1012 Indexing result, in the form of a duck numpy-array.
>    1013 """
>    1014 raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support)
> -> 1015 result = raw_indexing_method(raw_key.tuple)
>    1016 if numpy_indices.tuple:
>    1017     # index the loaded np.ndarray
>    1018     indexable = NumpyIndexingAdapter(result)
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/backends/h5netcdf_.py:58, in H5NetCDFArrayWrapper._getitem(self, key)
>      56 with self.datastore.lock:
>      57     array = self.get_array(needs_lock=False)
> ---> 58     return array[key]
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/h5netcdf/core.py:555, in BaseVariable.__getitem__(self, key)
>     553     return h5ds[key].view(view)
>     554 else:
> --> 555     return h5ds[key]
> 
> File h5py/_objects.pyx:54, in h5py._objects.with_phil.wrapper()
> 
> File h5py/_objects.pyx:55, in h5py._objects.with_phil.wrapper()
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/h5py/_hl/dataset.py:758, in Dataset.__getitem__(self, args, new_dtype)
>     756 if self._fast_read_ok and (new_dtype is None):
>     757     try:
> --> 758         return self._fast_reader.read(args)
>     759     except TypeError:
>     760         pass  # Fall back to Python read pathway below
> 
> File h5py/_selector.pyx:376, in h5py._selector.Reader.read()
> 
> OSError: [Errno 14] Can't synchronously read data (file read failed: time = Wed Oct  9 23:26:41 2024
> , filename = '/media/sf_F_DRIVE/try1/201601.nc', file descriptor = 59, errno = 14, error message = 'Bad address', buf = 0x55cb69267410, total read size = 24641280, bytes this sub-read = 24641280, bytes actually read = 18446744073709551615, offset = 0)

(2) I don't get this error when I run my script on a smaller dataset (just 2 days, small_data.zip). The data above covers one month. So it seems to me that this error has nothing to do with the engine parameter of open_dataset, but rather with the size of the data.
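One hedged way to probe the size hypothesis is to read CHL one time step at a time instead of in a single large read. The helper below is a minimal sketch, not part of xESMF or xarray: the name `read_in_time_slices` is hypothetical, and it assumes the variable is called `CHL` and has a `time` dimension, as in the plotting code elsewhere in this thread. If every slice loads while a full load fails, that would suggest the size of a single read matters more than one damaged region of the file.

```python
def read_in_time_slices(ds, name="CHL", dim="time"):
    """Read `name` one step along `dim` at a time.

    `ds` is expected to be a lazily opened xarray.Dataset.
    Returns (number_of_slices_read, first_error_or_None).
    """
    n_ok = 0
    for i in range(ds.sizes[dim]):
        try:
            # Selecting first keeps each read limited to a single slice.
            ds[name].isel({dim: i}).load()
        except (RuntimeError, OSError) as err:
            return n_ok, f"step {i}: {type(err).__name__}: {err}"
        n_ok += 1
    return n_ok, None


# Usage (needs xarray and the file from this issue):
# import xarray as xr
# ds = xr.open_dataset("chla201601.nc")
# print(read_in_time_slices(ds))
```

The function only reports where the first failure occurs; it does not try to repair or skip data.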

Main package information:

xesmf    0.8.7
xarray   2024.9.0
numpy    2.0.2
netCDF4  1.7.1
h5netcdf 1.4.0
python   3.12.7

I've been stuck on this problem for days. Could you give me some advice?

aulemahal commented 4 weeks ago

I'm really sorry, I just created an environment with the versions you gave in your previous comment, and the code runs fine. I still can't reproduce the issue with the code and file given in the top comment.

The errors you are getting suggest that the file is corrupted. If you simply call ds.load() before any xesmf calls, does the error also happen?

As for the size, I wouldn't expect a memory problem to give this error. Running only your snippet on the test data used at most 600 MB of RAM on my machine. I would be surprised if that were too much for yours.
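The ds.load() check above can be narrowed down by loading each data variable separately, so that a failure is tied to a specific variable. This is a minimal sketch under stated assumptions: `find_unreadable_variables` is a hypothetical helper name, and it only catches the two exception types that appear in this thread (RuntimeError from netCDF4, OSError from h5py).

```python
def find_unreadable_variables(ds):
    """Try to load each data variable on its own.

    `ds` is expected to be a lazily opened xarray.Dataset.
    Returns a dict mapping variable name to the error message it raised.
    """
    failures = {}
    for name, var in ds.data_vars.items():
        try:
            var.load()  # forces the read from disk for this variable only
        except (RuntimeError, OSError) as err:
            failures[name] = f"{type(err).__name__}: {err}"
    return failures


# Usage (needs xarray and the file from this issue):
# import xarray as xr
# ds = xr.open_dataset("chla201601.nc")
# print(find_unreadable_variables(ds))
```

An empty result would mean every variable loaded on its own, which in turn would point away from a single corrupted variable.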

onion5376 commented 4 weeks ago

Thanks, @aulemahal. (1) The test script:

import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import numpy as np
import xarray as xr
import xesmf as xe
ds = xr.open_dataset("chla201601.nc")
ds.load()


> RuntimeError                              Traceback (most recent call last)
> Cell In[5], line 1
> ----> 1 ds.load()
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/dataset.py:880, in Dataset.load(self, **kwargs)
>     878 for k, v in self.variables.items():
>     879     if k not in lazy_data:
> --> 880         v.load()
>     882 return self
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/variable.py:981, in Variable.load(self, **kwargs)
>     964 def load(self, **kwargs):
>     965     """Manually trigger loading of this variable's data from disk or a
>     966     remote source into memory and return this variable.
>     967 
>    (...)
>     979     dask.array.compute
>     980     """
> --> 981     self._data = to_duck_array(self._data, **kwargs)
>     982     return self
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/namedarray/pycompat.py:134, in to_duck_array(data, **kwargs)
>     131     return loaded_data
>     133 if isinstance(data, ExplicitlyIndexed):
> --> 134     return data.get_duck_array()  # type: ignore[no-untyped-call, no-any-return]
>     135 elif is_duck_array(data):
>     136     return data
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/indexing.py:837, in MemoryCachedArray.get_duck_array(self)
>     836 def get_duck_array(self):
> --> 837     self._ensure_cached()
>     838     return self.array.get_duck_array()
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/indexing.py:831, in MemoryCachedArray._ensure_cached(self)
>     830 def _ensure_cached(self):
> --> 831     self.array = as_indexable(self.array.get_duck_array())
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/indexing.py:788, in CopyOnWriteArray.get_duck_array(self)
>     787 def get_duck_array(self):
> --> 788     return self.array.get_duck_array()
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/indexing.py:651, in LazilyIndexedArray.get_duck_array(self)
>     647     array = apply_indexer(self.array, self.key)
>     648 else:
>     649     # If the array is not an ExplicitlyIndexedNDArrayMixin,
>     650     # it may wrap a BackendArray so use its __getitem__
> --> 651     array = self.array[self.key]
>     653 # self.array[self.key] is now a numpy array when
>     654 # self.array is a BackendArray subclass
>     655 # and self.key is BasicIndexer((slice(None, None, None),))
>     656 # so we need the explicit check for ExplicitlyIndexed
>     657 if isinstance(array, ExplicitlyIndexed):
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/backends/netCDF4_.py:100, in NetCDF4ArrayWrapper.__getitem__(self, key)
>      99 def __getitem__(self, key):
> --> 100     return indexing.explicit_indexing_adapter(
>     101         key, self.shape, indexing.IndexingSupport.OUTER, self._getitem
>     102     )
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/core/indexing.py:1015, in explicit_indexing_adapter(key, shape, indexing_support, raw_indexing_method)
>     993 """Support explicit indexing by delegating to a raw indexing method.
>     994 
>     995 Outer and/or vectorized indexers are supported by indexing a second time
>    (...)
>    1012 Indexing result, in the form of a duck numpy-array.
>    1013 """
>    1014 raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support)
> -> 1015 result = raw_indexing_method(raw_key.tuple)
>    1016 if numpy_indices.tuple:
>    1017     # index the loaded np.ndarray
>    1018     indexable = NumpyIndexingAdapter(result)
> 
> File /usr/miniforge3/envs/xesmf_env/lib/python3.12/site-packages/xarray/backends/netCDF4_.py:113, in NetCDF4ArrayWrapper._getitem(self, key)
>     111     with self.datastore.lock:
>     112         original_array = self.get_array(needs_lock=False)
> --> 113         array = getitem(original_array, key)
>     114 except IndexError:
>     115     # Catch IndexError in netCDF4 and return a more informative
>     116     # error message.  This is most often called when an unsorted
>     117     # indexer is used before the data is loaded from disk.
>     118     msg = (
>     119         "The indexing operation you are attempting to perform "
>     120         "is not valid on netCDF4.Variable object. Try loading "
>     121         "your data into memory first by calling .load()."
>     122     )
> 
> File src/netCDF4/_netCDF4.pyx:4981, in netCDF4._netCDF4.Variable.__getitem__()
> 
> File src/netCDF4/_netCDF4.pyx:5953, in netCDF4._netCDF4.Variable._get()
> 
> File src/netCDF4/_netCDF4.pyx:2113, in netCDF4._netCDF4._ensure_nc_success()
> 
> RuntimeError: NetCDF: HDF error

(2) The virtual machine should have enough memory to process this dataset. The RAM information is shown here:

(xesmf_env) [root@localhost onion5376]# free -h
               total        used        free      shared  buff/cache   available
Mem:           8.3Gi       5.0Gi       2.0Gi        44Mi       1.7Gi       3.3Gi
Swap:          3.9Gi          0B       3.9Gi

(3) To check whether the file is corrupted, I made a simple plot of the data. Everything looks fine for all variables (CHL, CHL_uncertainty and flags). The code:

import xarray as xr
import matplotlib.pyplot as plt
ds = xr.open_dataset("chla201601.nc")
g = xr.plot.FacetGrid(ds, col='time', col_wrap=3)
g.map(plt.pcolormesh, 'longitude', 'latitude', 'CHL',vmin=0.000001,vmax=8)
plt.tight_layout()
plt.show()

[Image: example_plot]

aulemahal commented 4 weeks ago

If ds.load() crashes but the plot of CHL works, I'm pretty sure that means the corrupted or unreadable part is in another variable. Do you need to keep CHL_uncertainty and flags?

Try changing the last line of the regridding script to the following, which regrids only CHL and ignores the other two variables:

dr_out = Regrd(ds[['CHL']])  # Regrid only CHL

Also, I think your last comment clearly shows that the error does not come from xESMF. I'll close this issue for now. I suggest you contact the people who provided you with this dataset and show them the error you get from ds.load(), maybe including your package list (pip list in a terminal).
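As a complement to pip list, a short script can print only the packages that came up in this thread. This is a sketch using the standard library's importlib.metadata; the helper name `collect_versions` and the default package list are assumptions chosen for illustration. xarray also provides xr.show_versions(), which prints a fuller report.

```python
from importlib.metadata import PackageNotFoundError, version


def collect_versions(
    packages=("xesmf", "xarray", "numpy", "netCDF4", "h5netcdf", "h5py"),
):
    """Return {package name: installed version or 'not installed'}."""
    report = {}
    for pkg in packages:
        try:
            report[pkg] = version(pkg)
        except PackageNotFoundError:
            report[pkg] = "not installed"
    return report


# Print one package per line, ready to paste into a bug report.
for pkg, ver in collect_versions().items():
    print(f"{pkg:10s} {ver}")
```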

Finally, for your information: you can use GitHub text formatting to show code as monospaced text. For Python code, for example, put three backticks followed by python3 on the line before your code, and three backticks on the line after.

onion5376 commented 4 weeks ago

Thanks, aulemahal. I tested the regridding script with only CHL, and it gives the same error. I'll look for other solutions.

aulemahal commented 4 weeks ago

That's weird! I don't get how you can plot the entire variable, but not load it... Good luck!