coecms / xmhw

Xarray version of Marine Heatwaves code by Eric Oliver
https://xmhw.readthedocs.io/en/latest/
Apache License 2.0

land_check - dropna #57

Closed by florianboergel 1 year ago

florianboergel commented 1 year ago

I am running into a problem with the function land_check, but only when it is used together with dask.

land_check fails at its .dropna() call, which ends in an error when dask tries to reshape the array:

      1 with dask.config.set(**{'array.slicing.split_large_chunks': True}):
----> 2     ts = land_check(temp, tdim=tdim, anynans=anynans)

File /silos/conda_packages/boergel/miniconda3_4.12.0/OS_15.4/conda_env/dask/lib/python3.11/site-packages/xmhw/identify.py:523, in land_check(temp, tdim, anynans)
    521 if anynans:
    522     how = "any"
--> 523 ts = ts.dropna(dim="cell", how=how)
    524 # if ts.cell.shape is 0 then all points are land, quit
    525 if ts.cell.shape == (0,):

File /silos/conda_packages/boergel/miniconda3_4.12.0/OS_15.4/conda_env/dask/lib/python3.11/site-packages/xarray/core/dataarray.py:3232, in DataArray.dropna(self, dim, how, thresh)
   3159 def dropna(
   3160     self: T_DataArray,
   3161     dim: Hashable,
   3162     how: Literal["any", "all"] = "any",
   3163     thresh: int | None = None,
   3164 ) -> T_DataArray:
   3165     """Returns a new array with dropped labels for missing values along
   3166     the provided dimension.
   3167 
   (...)
   3230     Dimensions without coordinates: Y, X
...
File /silos/conda_packages/boergel/miniconda3_4.12.0/OS_15.4/conda_env/dask/lib/python3.11/site-packages/dask/utils.py:1104, in __call__()
   1103 def __call__(self, __obj, *args, **kwargs):
-> 1104     return getattr(__obj, self.method)(*args, **kwargs)

ValueError: cannot reshape array of size 11422080 into shape (17847,220) 
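
A quick arithmetic check of the numbers in the error message, as a sketch only: the reported size does not match the target shape, but it does divide evenly by the first target dimension. Reading 17847 as the number of time steps and 220 as the number of cells kept after dropna is an assumption, not something the traceback states.

```python
# Numbers copied from the ValueError above.
size = 11_422_080          # "array of size ..."
target = (17_847, 220)     # "... into shape (17847,220)"

rows, cols = target

# The target shape would need rows * cols elements.
needed = rows * cols
print(needed)              # 3926340
print(size == needed)      # False, hence the reshape error

# The actual size is an exact multiple of the first dimension.
actual_cols, remainder = divmod(size, rows)
print(actual_cols, remainder)  # 640 0
```

So the block being reshaped appears to hold 640 columns where 220 were expected, which suggests a mismatch between the chunk that was computed and the chunk layout dask had planned.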

My data is of the following type:

sst = temperatureData.drop("depth").TEMP
sst = sst.chunk({'time': -1, 'lat': 'auto', 'lon': 'auto'})
sst.data

         Array                Chunk
Bytes    15.72 GiB            124.48 MiB
Shape    (18210, 362, 320)    (18210, 32, 28)

144 chunks in 3 graph layers
float64 numpy.ndarray

Any idea?

florianboergel commented 1 year ago

OK, never mind. Avoiding splitting large chunks solves the problem.
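
For anyone landing here with the same error, a minimal sketch of one way to avoid the splitting. It assumes "avoiding splitting" means setting the same dask option that appears in the traceback to False for the duration of the call; simply removing the `with` block from the original call would be another reading. The helper names `run_without_chunk_splitting` and `report_setting` are hypothetical, and `report_setting` is only a stand-in so the example runs without xmhw or any data.

```python
import dask

# Option name taken from the call shown in the traceback.
KEY = "array.slicing.split_large_chunks"


def run_without_chunk_splitting(func, *args, **kwargs):
    """Call func with dask's large-chunk splitting switched off.

    The setting is only changed inside the context manager and is
    restored afterwards.
    """
    with dask.config.set({KEY: False}):
        return func(*args, **kwargs)


def report_setting():
    """Stand-in for land_check: return the option value it would see."""
    return dask.config.get(KEY, default=None)


print(run_without_chunk_splitting(report_setting))  # False
```

With xmhw installed, the call from the traceback would then look like `run_without_chunk_splitting(land_check, temp, tdim=tdim, anynans=anynans)`. Note that with splitting disabled, individual chunks can become large, so memory use is worth watching.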