hainegroup / oceanspy

A Python package to facilitate ocean model data analysis and visualization.
https://oceanspy.readthedocs.io
MIT License
101 stars 32 forks source link

compute.integral works with LLC4320 but not ECCO #359

Closed ThomasHaine closed 1 year ago

ThomasHaine commented 1 year ago

The following compute.integral works for LLC4320 but errors for ECCO (on SciServer):

import oceanspy as ospy

# OGCM_dataset = 'ECCO'
OGCM_dataset = 'LLC4320'
od = ospy.open_oceandataset.from_catalog(OGCM_dataset)

if OGCM_dataset == 'LLC4320':
# LLC4320 cutout requires this:
    od._ds = od._ds.drop_vars({'k', 'k_u', 'k_p1', 'k_l'})
    co_list = [var for var in od._ds.variables if "time" not in od._ds[var].dims]
    od._ds = od._ds.set_coords(co_list)
else:
    ds = od.dataset
    ds['Temp'] = ds['THETA']
    od = ospy.OceanDataset(ds)

varList = ['Temp']
timeFreq= None
timeRange = ['2012-04-25T00','2012-04-30T02']
p1 = [{"type":"LineString","coordinates":[[-27.550652911045805,68.60723990897543],[-26.783621229652017,67.4686818705328],[-24.284903450280915,67.12069825754523],[-22.93272404606945,66.47294135507451]]}]
from oceanspy.utils import viewer_to_range
lon, lat = viewer_to_range(p1)
od_moor = od.subsample.mooring_array(
    Xmoor=lon,
    Ymoor=lat,
    timeRange=timeRange,
    timeFreq=timeFreq,
    varList=varList,
)
ds_massflux = ospy.compute.integral(od_moor)

The ECCO error is:


TypeError                                 Traceback (most recent call last)
Cell In[1], line 30
     22 lon, lat = viewer_to_range(p1)
     23 od_moor = od.subsample.mooring_array(
     24     Xmoor=lon,
     25     Ymoor=lat,
   (...)
     28     varList=varList,
     29 )
---> 30 ds_massflux = ospy.compute.integral(od_moor)

File ~/mambaforge/envs/Oceanography/lib/python3.9/site-packages/oceanspy/compute.py:850, in integral(od, varNameList, axesList, aliased)
    820 def integral(od, varNameList=None, axesList=None, aliased=True):
    821     """
    822     Compute integrals along specified axes (simple discretization).
    823 
   (...)
    847     weighted_mean
    848     """
--> 850     return _integral_and_mean(
    851         od,
    852         varNameList=varNameList,
    853         axesList=axesList,
    854         aliased=aliased,
    855         operation="integral",
    856     )

File ~/mambaforge/envs/Oceanography/lib/python3.9/site-packages/oceanspy/compute.py:1074, in _integral_and_mean(od, operation, varNameList, axesList, aliased, storeWeights)
   1071 dims2ave = dims2sum
   1072 if operation == "integral":
   1073     # Compute integral
-> 1074     Int = (Int * delta).sum(dims2sum)
   1075     to_return["I(" + varNameOUT + ")" + "".join(suf)] = Int
   1076     if "units" in attrs:

File ~/mambaforge/envs/Oceanography/lib/python3.9/site-packages/xarray/core/_typed_ops.py:212, in DataArrayOpsMixin.__mul__(self, other)
    211 def __mul__(self, other):
--> 212     return self._binary_op(other, operator.mul)

File ~/mambaforge/envs/Oceanography/lib/python3.9/site-packages/xarray/core/dataarray.py:4355, in DataArray._binary_op(self, other, f, reflexive)
   4351 other_variable = getattr(other, "variable", other)
   4352 other_coords = getattr(other, "coords", None)
   4354 variable = (
-> 4355     f(self.variable, other_variable)
   4356     if not reflexive
   4357     else f(other_variable, self.variable)
   4358 )
   4359 coords, indexes = self.coords._merge_raw(other_coords, reflexive)
   4360 name = self._result_name(other)

File ~/mambaforge/envs/Oceanography/lib/python3.9/site-packages/xarray/core/_typed_ops.py:402, in VariableOpsMixin.__mul__(self, other)
    401 def __mul__(self, other):
--> 402     return self._binary_op(other, operator.mul)

File ~/mambaforge/envs/Oceanography/lib/python3.9/site-packages/xarray/core/variable.py:2654, in Variable._binary_op(self, other, f, reflexive)
   2651 attrs = self._attrs if keep_attrs else None
   2652 with np.errstate(all="ignore"):
   2653     new_data = (
-> 2654         f(self_data, other_data) if not reflexive else f(other_data, self_data)
   2655     )
   2656 result = Variable(dims, new_data, attrs=attrs)
   2657 return result

TypeError: operand type(s) all returned NotImplemented from __array_ufunc__(<ufunc 'multiply'>, '__call__', dask.array<where, shape=(2, 2), dtype=datetime64[ns], chunksize=(1, 2), chunktype=numpy.ndarray>, array([[2635200.],
       [2635200.]])): 'Array', 'ndarray'```
Mikejmnez commented 1 year ago

Thanks @ThomasHaine I will look into this. Can you share what the p values are, so that I can reproduce the error?

ThomasHaine commented 1 year ago

Sorry, I don't follow. What p values?

Mikejmnez commented 1 year ago

Nevermind! I just saw that these are written in your message (p-values are lats and lons used to compute arrays).

p1 = [{"type":"LineString","coordinates":[[-27.550652911045805,68.60723990897543],[-26.783621229652017,67.4686818705328],[-24.284903450280915,67.12069825754523],[-22.93272404606945,66.47294135507451]]}]

Here is a plot of the resulting mooring array superimposed over the model's Depth from the resulting cutout. Definitely a bug. The trajectory of the mooring array revisits certain coordinates (even though there are no repeated coords since I am removing any repeated ones).

bug_array

I am not 100% sure the error is associated with the weird trajectory of the array. But it certainly doesn't help. I will take a look into this starting from here.

Mikejmnez commented 1 year ago

Sorry for the late reply (I had to go to DC kind of unexpectedly). I figured out what is causing the error:

In the ECCO dataset, the variable time_bnds with dimensions (time:312, nv:2) causes the error above when computing mass flux i.e. the integration along all the dataset's dimensions (time, Z, mooring, X, Y, Xp1, Yp1).

Solution:

Removing the variable before the integration solves this issue. For example:


# OGCM_dataset = 'ECCO'
OGCM_dataset = 'ECCO'
od = ospy.open_oceandataset.from_catalog(OGCM_dataset)

if OGCM_dataset == 'LLC4320':
# LLC4320 cutout requires this:
    od._ds = od._ds.drop_vars({'k', 'k_u', 'k_p1', 'k_l'})
    co_list = [var for var in od._ds.variables if "time" not in od._ds[var].dims]
    od._ds = od._ds.set_coords(co_list)
else:
    od._ds = od._ds.drop_vars({'k', 'k_u', 'k_p1', 'k_l', 'time_bnds'})
    ds = od.dataset
    ds['Temp'] = ds['THETA']
    od = ospy.OceanDataset(ds)

The variable is not present in the LLC4320 dataset, and so there is no error there.

Nonetheless, I worked out a way to solve the weirdness of the array above. I will run some extra checks tomorrow to make sure it doesn't happen again.

ThomasHaine commented 1 year ago

Thanks @Mikejmnez. It now works without an error. Please can you share the code to make the figure of the mooring path above. I'll make some more checks.

Mikejmnez commented 1 year ago

closed by #361

Mikejmnez commented 1 year ago

There are two ways to make the horizontal maps of depth with mooring array. Which one you use depends on the particulars of the cutout and or mooring array... This is a consequence of the complex topology and the removal of face as dimension during cutout.

I ran the following snippets in Sciserver. Took about 1-2 minutes to complete.

  1. Just like in the oceanspy documentation (see Kogur).
    
    # import packages
    import oceanspy as ospy
    import xarray as xr
    import numpy as np
    import matplotlib.pyplot as plt
    plt.rcParams['figure.figsize'] = 12, 8

from dask.distributed import Client client = Client() client

load monthly ECCO dataset

od = ospy.open_oceandataset.from_catalog('ECCO') od._ds = od._ds.drop_vars({'k', 'k_u', 'k_p1', 'k_l', 'time_bnds', 'ETAN_snap'}) co_list = [var for var in od._ds.variables if "time" not in od._ds[var].dims] od._ds = od._ds.set_coords(co_list)

create/load mooring array from Poseidon viewer

p1 = [{"type":"LineString","coordinates":[[-27.550652911045805,68.60723990897543],[-26.783621229652017,67.4686818705328],[-24.284903450280915,67.12069825754523],[-22.93272404606945,66.47294135507451]]}] lon, lat = ospy.utils.viewer_to_range(p1)

cutout first, just for plotting purposes

args = { "XRange": lon, "YRange": lat, } cut_od = od.subsample.cutout(**args)

now compute (from original od) mooring array

od_moor = od.subsample.mooring_array(Xmoor=lon, Ymoor=lat) Xmoor = od_moor.dataset['XC'].squeeze().values Ymoor = od_moor.dataset['YC'].squeeze().values

FINALLY plotting!!!

fig = plt.figure(figsize=(12, 5)) ax = cut_od.plot.horizontal_section(varName="Depth", cmap='Greys_r') ax.plot(lon, lat, 'b') # original data line = ax.plot(Xmoor, Ymoor, "r.", ls='--'); # mooring


The above works well if:
a) there are no `NANs` in the coordinates (`XC` and `YC`). Nans can happen in the LLC dataset after cutout transformation due to complex topology. 
b) Mooring array does not involve jump singularities in the longitude, like it can happen in the Pacific Ocean.

2. In the case of any of a) or b) above, you can still visualize the data by plotting in INDEX space, since in the transformed dataset the axes `X, Y` are roughly aligned with `lon` and `lat` respectively. So for the same dataset and mooring array, you can plot instead as:

```python
fig = plt.figure(figsize=(10, 5))
plt.contourf(cut_od._ds['Depth'], levels = np.linspace(0, 5000, 250), cmap='Greys_r')
plt.plot(od_moor.dataset['Xind'].squeeze().values, od_moor.dataset['Yind'].squeeze().values, 'r-.', marker='o', markersize=5)
plt.show()