lkilcher / dolfyn

A library for oceanographic doppler instruments such as Acoustic Doppler Profilers (ADPs, ADCPs) and Acoustic Doppler Velocimeters (ADVs).
BSD 3-Clause "New" or "Revised" License

`subset` doesn't work for arbitrary data additions #59

Closed lkilcher closed 2 years ago

lkilcher commented 5 years ago

If you add a data item that doesn't have a time dimension matching the time dimension of the other data items, then `subset` won't work.

For now, a workaround is to pop the data item from the data object, then perform subset, then add it back:

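z_ = dat.pop('z_')  # temporarily remove the item that has no matching time dimension
dat2 = dat.subset[1000:5000]
dat2.z_ = z_  # add it back to the subset
dat.z_ = z_  # and to the original data object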

This is linked to the problem that we do not yet track the array dimensions. It looks like xarray has the functionality we need; perhaps we should switch to using that?
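
To illustrate what tracking array dimensions would buy us, here is a minimal sketch using only NumPy. It is not dolfyn's or xarray's implementation: the `subset_time` helper and the `(dims, array)` layout are hypothetical, invented for this example. The idea is that once each array carries its axis names, a subset can index only the arrays that have a `time` axis and pass everything else through unchanged.

```python
import numpy as np


def subset_time(data, index):
    """Index every array along its 'time' axis; leave other arrays alone.

    `data` maps a name to a (dims, array) pair, where `dims` is a tuple of
    axis names. This layout is an assumption made for illustration only.
    """
    out = {}
    for name, (dims, arr) in data.items():
        if "time" in dims:
            # Build an indexer that touches only the time axis.
            key = [slice(None)] * arr.ndim
            key[dims.index("time")] = index
            out[name] = (dims, arr[tuple(key)])
        else:
            # No time axis (e.g. one value per range bin): pass through.
            out[name] = (dims, arr)
    return out


n_time, n_range = 10, 4
data = {
    "vel": (("range", "time"), np.zeros((n_range, n_time))),
    "eta": (("time",), np.arange(n_time, dtype=float)),
    "z_": (("range",), np.linspace(1.0, 4.0, n_range)),
}

sub = subset_time(data, slice(2, 7))
```

In this sketch `vel` and `eta` are cropped along time while `z_` keeps its full length, so no pop-and-restore step is needed. The same function also accepts a boolean mask over time in place of the slice.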

lkilcher commented 5 years ago

I just updated pyDictH5 so that the error message generated when this fails is more informative. Update pyDictH5 to see these messages:

pip install git+https://github.com/lkilcher/pyDictH5

mcfogarty commented 5 years ago

The workaround works!
Including an example from my dataset as an FYI.

# Pop the added data items from datETU

eta = datETU.pop('eta')  # Time series. Need to pop.
# mllw = datETU.pop('mllw') # Single value. No need to pop.
z_ = datETU.pop('z_') # One value per ADCP range bin. Need to pop.

# Subset the remaining datETU to crop time series from 100 days to 7 days.
trange = [start_time, end_time]
tinds = (trange[0] <= datETU.mpltime) & (datETU.mpltime <= trange[-1])
datETUsmb = datETU.subset[tinds]

# Put the data items back into both datETU and datETUsmb
datETU.eta = eta
datETU.z_ = z_

datETUsmb.eta = eta[tinds]
datETUsmb.z_ = z_

lkilcher commented 2 years ago

This issue is no longer relevant with the move to xarray as the dataset backend.