lkilcher / dolfyn

A library for oceanographic doppler instruments such as Acoustic Doppler Profilers (ADPs, ADCPs) and Acoustic Doppler Velocimeters (ADVs).
BSD 3-Clause "New" or "Revised" License

Trouble reading data subset #122

Closed hevgyrt closed 7 months ago

hevgyrt commented 8 months ago

Hi,

First of all, dolfyn is a great initiative. I have a 32GB Signature 500 .ad2cp file and would like to subset the data in the dolfyn.read function by passing a tuple or list to the nens argument. The data were acquired with concurrent Burst and waves, plus the echosounder option enabled.

My issues are that

  1. If I provide start values larger than 0, I get
    
    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    Cell In[70], line 1
    ----> 1 dat = dolfyn.read('myfile.ad2cp',nens=[3,150])

    File ~/miniconda3/envs/ekok/li/python3.9/site-packages/dolfyn/io/api.py:103, in read(fname, userdata, nens, **kwargs)
         99 func_map = dict(RDI=read_rdi,
        100                 nortek=read_nortek,
        101                 signature=read_signature)
        102 func = func_map[file_type]
    --> 103 return func(fname, userdata=userdata, nens=nens, **kwargs)

    File ~/miniconda3/envs/ekok/lib/python3.9/site-packages/dolfyn/io/nortek2.py:69, in read_signature(filename, userdata, nens, rebuild_index, debug, **kwargs)
         67 d = rdr.readfile(nens[0], nens[1])
         68 rdr.sci_data(d)
    ---> 69 out = _reorg(d)
         70 _reduce(out)
         72 # Convert time to dt64 and fill gaps

    File ~/miniconda3/envs/ekok/lib/python3.9/site-packages/dolfyn/io/nortek2.py:413, in _reorg(dat)
        410 outdat['long_name'].update(dnow['long_name'])
        411 outdat['standard_name'].update(dnow['standard_name'])
        412 cfg['burst_config' + tag] = lib._headconfig_int2dict(
    --> 413     lib._collapse(dnow['config'], exclude=collapse_exclude,
        414                   name='config'))
        415 outdat['coords']['time' + tag] = lib._calc_time(
        416     dnow['year'] + 1900,
        417     dnow['month'],
       (...)
        421     dnow['second'],
        422     dnow['usec100'].astype('uint32') * 100)
        423 tmp = lib._beams_cy_int2dict(
        424     lib._collapse(dnow['beam_config'], exclude=collapse_exclude,
        425                   name='beam_config'), 21)

    File ~/miniconda3/envs/ekok/lib/python3.9/site-packages/dolfyn/io/nortek2_lib.py:429, in _collapse(vec, name, exclude)
        425 def _collapse(vec, name=None, exclude=[]):
        426     """Check that the input vector is uniform, then collapse it to a
        427     single value, otherwise raise a warning.
        428     """
    --> 429     if _isuniform(vec):
        430         return vec[0]
        431     elif _isuniform(vec, exclude=exclude):

    File ~/miniconda3/envs/ekok/lib/python3.9/site-packages/dolfyn/io/nortek2_lib.py:422, in _isuniform(vec, exclude)
        420 if len(exclude):
        421     return len(set(np.unique(vec)) - set(exclude)) <= 1
    --> 422 return np.all(vec == vec[0])

    IndexError: index 0 is out of bounds for axis 0 with size 0


  2. There seems to be an upper bound on the `stop` value that is much lower than the number of points in my dataset:

    IndexError                                Traceback (most recent call last)
    Cell In[72], line 1
    ----> 1 dat = dolfyn.read('myfile.ad2cp',nens=[0,15000])

    File ~/miniconda3/envs/ekok/lib/python3.9/site-packages/dolfyn/io/api.py:103, in read(fname, userdata, nens, **kwargs)
         99 func_map = dict(RDI=read_rdi,
        100                 nortek=read_nortek,
        101                 signature=read_signature)
        102 func = func_map[file_type]
    --> 103 return func(fname, userdata=userdata, nens=nens, **kwargs)

    File ~/miniconda3/envs/ekok/lib/python3.9/site-packages/dolfyn/io/nortek2.py:67, in read_signature(filename, userdata, nens, rebuild_index, debug, **kwargs)
         64 userdata = _find_userdata(filename, userdata)
         66 rdr = _Ad2cpReader(filename, rebuild_index=rebuild_index, debug=debug)
    ---> 67 d = rdr.readfile(nens[0], nens[1])
         68 rdr.sci_data(d)
         69 out = _reorg(d)

    File ~/miniconda3/envs/ekok/lib/python3.9/site-packages/dolfyn/io/nortek2.py:311, in _Ad2cpReader.readfile(self, ens_start, ens_stop)
        307 if sz != rdr._N[tmp_idx]:
        308     raise Exception(
        309         "The number of samples in this 'Altimeter Raw' "
        310         "burst is different from prior bursts.")
    --> 311 self._read_burst(id, outdat[id], c26)
        312 outdat[id]['ensemble'][c26] = c
        313 c26 += 1

    File ~/miniconda3/envs/ekok/lib/python3.9/site-packages/dolfyn/io/nortek2.py:251, in _Ad2cpReader._read_burst(self, id, dat, c, echo)
        249 def _read_burst(self, id, dat, c, echo=False):
        250     rdr = self._burst_readers[id]
    --> 251     rdr.read_into(self.f, dat, c)

    File ~/miniconda3/envs/ekok/lib/python3.9/site-packages/dolfyn/io/nortek2_defs.py:79, in _DataDef.read_into(self, fobj, data, ens, cs)
         77 for nm, shp, d in zip(self._names, self._shape, dat_tuple):
         78     try:
    ---> 79         data[nm][..., ens] = d
         80     except ValueError:
         81         data[nm][..., ens] = np.asarray(d).reshape(shp)

    IndexError: index 11 is out of bounds for axis 0 with size 11
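
The last frame of the second traceback writes each record into a preallocated array at position `ens`, and the message reports an array of length 11 being indexed at position 11. The sketch below reproduces that kind of mismatch in miniature with plain Python lists. The names `fill_buffer` and `fill_buffer_checked` are hypothetical and are not part of dolfyn; this illustrates the error pattern only and does not diagnose the actual cause in the reader.

```python
def fill_buffer(n_allocated, n_records):
    """Write n_records values into a buffer preallocated with n_allocated slots.

    Hypothetical stand-in for a reader that sizes its output array up front
    and then writes one slot per record it encounters.
    """
    buf = [None] * n_allocated
    for ens in range(n_records):
        # Raises IndexError as soon as ens reaches n_allocated.
        buf[ens] = ens
    return buf


def fill_buffer_checked(n_allocated, n_records):
    """Same loop, but fail early with a message that names both counts."""
    if n_records > n_allocated:
        raise ValueError(
            f"{n_records} records found but only {n_allocated} slots allocated"
        )
    return fill_buffer(n_allocated, n_records)
```

With 11 slots and 12 records, `fill_buffer(11, 12)` fails on the twelfth write (index 11), which mirrors the shape of the reported error; the checked variant raises before any write happens.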

jmcvey3 commented 8 months ago

Hi @hevgyrt, thanks for bringing this up. If you can copy this issue over to https://github.com/MHKiT-Software/MHKiT-Python/issues, that would be great. DOLfYN has been more or less subsumed from this repo into MHKiT for future support, and I need to update this README to reflect that.

On this issue, it looks like the tests we run to check the "nens" argument all used single-profile data. I found one AST dataset that reproduces the first error, but I haven't yet been able to replicate the second; my guess is that they're related. We haven't had much (if any) data from ADCPs running dual profiles, and the way Nortek encodes these in the .ad2cp format is a little more difficult to parse.
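
For reference, the first error message ("index 0 is out of bounds for axis 0 with size 0") means the vector reaching the uniformity check was empty, so reading its first element failed. Below is a minimal sketch of a guard against that case, using plain lists instead of NumPy arrays. The names `is_uniform` and `collapse` are hypothetical simplifications of the functions in the traceback, and treating an empty vector as uniform is an assumption of this sketch; it is not the fix applied in dolfyn.

```python
def is_uniform(vec):
    """Return True if every value in vec is equal.

    Assumption of this sketch: an empty vec counts as uniform, so the
    function never indexes element 0 of an empty sequence.
    """
    if len(vec) == 0:
        return True
    return len(set(vec)) <= 1


def collapse(vec):
    """Collapse a uniform vector to its single value.

    Returns None for an empty vector and raises ValueError when the
    values differ.
    """
    if len(vec) == 0:
        return None
    if not is_uniform(vec):
        raise ValueError("values are not uniform")
    return vec[0]
```

Returning `None` for the empty case pushes the decision to the caller, which can then skip a data type that has no records in the requested range.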

hevgyrt commented 8 months ago

Thanks for your reply, @jmcvey3. I have now reposted the issue on the git repo that you refer to.

jmcvey3 commented 7 months ago

Hi @hevgyrt, see if you can pull PR #125 and let me know how that works. It should solve problem 1 listed above, but I wasn't able to replicate problem 2.

If you still get that 2nd error, can you send me the .ad2cp.index file that dolfyn generates for this file?

hevgyrt commented 7 months ago

Great, @jmcvey3. This seems to have resolved the problem. (Sorry for the late reply; I've been sick in the meantime.)

jmcvey3 commented 7 months ago

@hevgyrt Brilliant, I'll update the tests and submit the PR. No worries, hope you get better soon.