MHKiT-Software / MHKiT-Python

MHKiT-Python provides the marine renewable energy (MRE) community tools for data processing, visualization, quality control, resource assessment, and device performance.
https://mhkit-software.github.io/MHKiT/
BSD 3-Clause "New" or "Revised" License

(Dolfyn) Non-identical beam error on multiple Nortek Signature 500 .ad2cp files? #287

Closed willcoxe closed 4 months ago

willcoxe commented 5 months ago

Hello, Thanks for creating the package!

I just received several Nortek Signature 500 ADCP files to process and each of them gives the same error (details below). I have updated to the most recent version and tried to call the command in several ways. I have tested the example signature files in the examples directory and they can be read without error. I am hoping you might have some advice on troubleshooting this issue.

Commands I have tried (I have replaced file and directory names):

>>> ds = api.read("FILE.ad2cp")
>>> ds = dolfyn.read("FILE.ad2cp")
>>> ds = dolfyn.io.nortek2.read_signature("FILE.ad2cp")
>>> dolfyn.io.nortek2.read_signature("FILE.ad2cp", rebuild_index=True)

While running it skips a lot of pings e.g.

warnings.warn("Skipped ping (ID: {}) in file {} at ensemble {}."
venvdir/site-packages/mhkit/dolfyn/io/nortek2_lib.py:176: UserWarning: Skipped ping (ID: 31) in file FILE.ad2cp at ensemble 64257.

The message it errors out with is always the same:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "venvdir/site-packages/mhkit/dolfyn/io/api.py", line 105, in read
    return func(fname, userdata=userdata, nens=nens, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "venvdir/site-packages/mhkit/dolfyn/io/nortek2.py", line 68, in read_signature
    rdr = _Ad2cpReader(filename, rebuild_index=rebuild_index, debug=debug)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "venvdir/site-packages/mhkit/dolfyn/io/nortek2.py", line 136, in __init__
    self._config = lib._calc_config(self._index)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "venvdir/site-packages/mhkit/dolfyn/io/nortek2_lib.py", line 493, in _calc_config
    raise Exception("beams_cy are not identical for id: 0x{:X}."
Exception: beams_cy are not identical for id: 0x17.

ssolson commented 5 months ago

Hey @willcoxe thanks for your interest in MHKiT and letting us know about this issue you are having.

Looking through your errors, I don't see anything obvious to try.

I need @jmcvey3 to take a look at this, however he is doing field work this week so his availability is limited.

It may not be needed, but if you could create and share a small subset of your data that reproduces the error, that could be helpful.

willcoxe commented 5 months ago

Thanks for the quick response! I appreciate it. I am not entirely sure how to create a subset of a raw ad2cp file without being able to read the file.

jmcvey3 commented 4 months ago

Hi @willcoxe,

Do you have the configuration file for these and can you give some background?

ID 31 is listed as "avg altimeter raw record" in the Nortek documentation; this isn't an ID we've seen before, so dolfyn doesn't know how to read it. It shouldn't cause dolfyn to error out, though.

The "beams_cy" error literally means that dolfyn isn't seeing consistent entries for ID 0x17 = 23, which is bottom track. The fact this runs through 64,257 ensembles makes me think that the instrument was cut prematurely while taking a bottom track measurement. Can you try running ds = dolfyn.read("FILE.ad2cp", nens=[0, 64200], rebuild_index=True) and see if that runs cleanly?
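As a side note on the two IDs mentioned above: the "Skipped ping" warning prints the record ID in decimal (31), while the exception prints it in hexadecimal (0x17). A minimal sketch of the conversion in plain Python, using only the values quoted in this thread (the variable names are illustrative):

```python
# Record IDs as they appear in the messages quoted in this thread.
warning_id = 31            # printed in decimal by the "Skipped ping" warning
error_id = int("17", 16)   # printed in hexadecimal (0x17) by the exception

# Decimal form of the ID in the exception message.
print(error_id)

# Hex form of the ID in the warning, using the same format spec as the exception.
print("0x{:X}".format(warning_id))
```

So the skipped pings (31 = 0x1F) and the failing record (0x17 = 23) are two different record types.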

willcoxe commented 4 months ago

Ah, perhaps it's because it's not bottom track but ice track towards the surface (afaik)?

Sadly I get the exact same error when running ds = dolfyn.read("FILE.ad2cp", nens=[0, 64200], rebuild_index=True)

I pasted the config file (w. changed filename) here

EDIT: Regarding background, I am not sure what to say. The data are from a ~1 year, relatively near-surface mooring deployed in a seasonally ice-covered region. I think the bottom track is used to determine ice cover/thickness, but I am new to these data, so I am still reading up on the details.

jmcvey3 commented 4 months ago

Gotcha, yes, Nortek discovered that if they orient bottom track towards the surface, they could track the movement of ice above. It's effectively the same thing to the instrument.

Hmm, I'm not sure about this one. Can you share a smaller data file or a clip of one? (The Linux "truncate" command works well for that; 1 MB is all that's necessary.)
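For anyone without the "truncate" command available, a rough Python equivalent is sketched below. It copies the first bytes of a file to a new file rather than shrinking the original in place, which is a deliberate difference from "truncate". The function name `clip_file` and its parameters are hypothetical and not part of dolfyn or MHKiT.

```python
from pathlib import Path


def clip_file(src, dst, n_bytes=1024 * 1024):
    """Copy the first n_bytes of src to dst and return the size of dst.

    The source file is left untouched. The default of 1024 * 1024 bytes
    matches what GNU truncate means by "1M".
    """
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        fout.write(fin.read(n_bytes))
    return Path(dst).stat().st_size
```

Usage would look like `clip_file("FILE.ad2cp", "FILE_clip.ad2cp")`. Because the cut lands at an arbitrary byte offset, the last record in the clipped file will usually be incomplete.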

willcoxe commented 4 months ago

I have uploaded the 1 MB truncated file (truncate -s 1M) here

I have set it to be removed after download since the data is not supposed to be publicly available yet.

(Please let me know if it doesn't work; I have not used Lufi before for temporary file uploads.)

jmcvey3 commented 4 months ago

@willcoxe That worked, thank you. Will post updates as I make progress.

jmcvey3 commented 4 months ago

Hi @willcoxe, see if you can load PR #289 and run your data on it using ds, ds2 = mhkit.dolfyn.read(<file.ad2cp>, dual_profile=True). This will work for the 1 Mb file you sent if you open it in notepad++ and delete the very last line in the file (line 7768; it was cut off by the truncation). Also, FYI, as stated earlier, ice track isn't recorded differently than bottom track, so the ice tracking data is saved with the "_bt" tag and metadata.

Because this is a dual profile setup, I ended up separating the two profiling configurations into their own datasets for multiple reasons, one of which is that you'll at least be able to use dolfyn's rotate tools (setting magnetic declination, rotating between coordinate frames) on both datasets. I also had to bypass the logic that detects skipped pings (these ADCPs sometimes skip a ping or two over a long deployment period), but I will see if I can update that for dual profiling.

jmcvey3 commented 4 months ago

You should also be able to read ID 31 now, which looks like it is the raw altimeter burst for the averaging profile.

willcoxe commented 4 months ago

Hi! Thanks for all the work you are putting into this, I appreciate it.

It does indeed work for the 1 M file. It works precisely up to the file size truncated to 2989 K (with or without the final line), so something must change at that part of the file (same error except for '.format(id)'). If I truncate the file to 1K smaller (2988 K) it works perfectly.

I am not familiar enough with how to set up the structs in python to provide more detail of any change in the file. I added the failing file here.

>>> ds, ds2 = mhkit.dolfyn.read(fn, dual_profile=True, reindex=True)

Indexing FILE.ad2cp... Done.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "TLD/MHKiT-Python/mhkit/dolfyn/io/api.py", line 114, in read
    return func(fname, userdata=userdata, nens=nens, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "TLD/MHKiT-Python/mhkit/dolfyn/io/nortek2.py", line 79, in read_signature
    rdr = _Ad2cpReader(
          ^^^^^^^^^^^^^
  File "TLD/MHKiT-Python/mhkit/dolfyn/io/nortek2.py", line 166, in __init__
    self._config = lib._calc_config(self._index)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "TLD/MHKiT-Python/mhkit/dolfyn/io/nortek2_lib.py", line 571, in _calc_config
    raise Exception("beams_cy are not identical for id: 0x{:X}.".format(id))
Exception: beams_cy are not identical for id: 0x17.

(The original file sizes I received for the mooring(s) vary between 276 M and 1.1 G.)
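The size search described above (reads at 2988 K, fails at 2989 K) can be automated with a bisection over the clip size. This is only a sketch: `first_failing_size` and the stand-in predicate are hypothetical names, and it assumes that every size past the first failure also fails, which matches what was observed here. In practice, the predicate would clip the file to the given size and attempt the read inside a try/except.

```python
def first_failing_size(reads_ok, lo, hi):
    """Return the smallest size in (lo, hi] for which reads_ok(size) is False.

    Assumes reads_ok(lo) is True, reads_ok(hi) is False, and that all
    sizes beyond the first failing one fail as well.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if reads_ok(mid):
            lo = mid   # still readable: the failure starts above mid
        else:
            hi = mid   # already failing: the failure starts at or below mid
    return hi


# Stand-in predicate mimicking the behaviour reported in this thread
# (sizes in K); a real one would clip the file and try to read it.
def fake_reads_ok(size_k):
    return size_k < 2989


print(first_failing_size(fake_reads_ok, 1024, 4096))  # prints 2989
```

With a search range of a few thousand K, this needs about a dozen read attempts instead of one per candidate size.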

jmcvey3 commented 4 months ago

Got it, try the newest commit I just pushed. The "beams_cy" variable records the number of cells, beams, and the coordinate system the ping was measured with/in, and the number of cells changes for some reason in the bottom track ping record. The number of cells doesn't matter for bottom track, so I added a check for that specifically.
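To illustrate the kind of check described above, here is a minimal sketch. It is not dolfyn's actual code: the function name, the tuple layout (n_cells, n_beams, coord_sys), and the use of ValueError are assumptions made for the example. Only the ID 0x17 and the message text come from this thread.

```python
BOTTOM_TRACK_ID = 0x17  # record ID named in the exception earlier in this thread


def check_consistent(record_id, configs):
    """Check that all pings of one record type share the same configuration.

    configs is a list of (n_cells, n_beams, coord_sys) tuples, one per ping
    (a hypothetical layout for illustration). For bottom-track records the
    number of cells is ignored, since it does not matter there.
    """
    if record_id == BOTTOM_TRACK_ID:
        # Compare only the number of beams and the coordinate system.
        keys = {(n_beams, coord_sys) for _, n_beams, coord_sys in configs}
    else:
        keys = set(configs)
    if len(keys) > 1:
        raise ValueError(
            "beams_cy are not identical for id: 0x{:X}.".format(record_id)
        )
    return True
```

Under these assumptions, `check_consistent(0x17, [(10, 4, "beam"), (12, 4, "beam")])` passes even though the cell count changes, while the same list under any other record ID raises.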

willcoxe commented 4 months ago

This is absolutely fantastic. It reads even the 1.1 G files without a problem now. This is great. Thanks so much!

jmcvey3 commented 4 months ago

Happy to help, glad it works!

willcoxe commented 3 months ago

Small note, not necessarily super important, so not big enough to open an entirely new issue (imo).

The current code creates the fields 'vel' for the first (in my case burst) of the dual profiles and 'vel_avg' for the second (in my case averaged) profile. This means that for the averaged=second profile the fields u, v, w, U, ..., etc are currently left undefined.

It's an easy enough thing, I'll prob just (re-)use your code & just modify the field name to field_avg e.g. in velocity.py

def u(self):
    return self.ds['vel_avg'][0].drop('dir')

Like I said, this is not an 'issue' or 'bug' per se, just an FYI.

(not 100% on the github-etiquette for putting a small note on a closed comment thread, I won't re-open the issue)