OSOceanAcoustics / echopype

Enabling interoperability and scalability in ocean sonar data analysis
https://echopype.readthedocs.io/
Apache License 2.0

Problematic EK80 Environment Datagrams for Replayed Files and Large Files #1386

Open ctuguinay opened 2 months ago

ctuguinay commented 2 months ago

From a conversation with @leewujung:

Currently, the Echopype parsing code in parse_base.py expects the EK80 Environment datagram to be unique, i.e., it expects only one XML datagram with subtype environment. However, when parsing replayed files and large files (~2 GiB) with EK80 application_version='23.6.2.0' and file_format_version='1.32' (the latest versions), we encounter multiple XML datagrams with subtype environment. Only the first datagram contains all the environment variables we need for calibration (temperature, salinity, pressure, etc.); the later datagrams contain only drop_keel_offset and drop_keel_offset_is_manual. Because the later datagrams override the first one, we end up with an EchoData object that does not contain the environment parameters we need for calibration.

As a temporary fix, we will drop XML datagrams with subtype environment in which the only data available are drop_keel_offset and drop_keel_offset_is_manual.
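A rough sketch of what this filtering could look like, assuming each parsed environment datagram is available as a plain dict of its fields. The helper names (`is_drop_keel_only`, `select_environment`) and the example keys `temperature` and `salinity` are hypothetical and are not taken from parse_base.py; only `drop_keel_offset` and `drop_keel_offset_is_manual` come from the files described above.

```python
from typing import Dict, List

# Fields that appear alone in the problematic later datagrams.
DROP_KEEL_ONLY_KEYS = {"drop_keel_offset", "drop_keel_offset_is_manual"}


def is_drop_keel_only(environment: Dict) -> bool:
    """Return True if the datagram carries nothing beyond the drop-keel fields.

    Hypothetical helper: assumes the datagram has been parsed into a dict.
    """
    return bool(environment) and set(environment) <= DROP_KEEL_ONLY_KEYS


def select_environment(datagrams: List[Dict]) -> Dict:
    """Drop drop-keel-only datagrams, then keep the last remaining one.

    Keeping the last remaining datagram mirrors the existing
    "later datagram overrides earlier" behavior for everything not dropped.
    """
    kept = [d for d in datagrams if not is_drop_keel_only(d)]
    return kept[-1] if kept else {}


# Illustrative values only, not taken from a real file.
full = {"temperature": 10.0, "salinity": 35.0, "drop_keel_offset": 0.0}
partial = {"drop_keel_offset": 1.5, "drop_keel_offset_is_manual": 0}
selected = select_environment([full, partial, partial])
```

With this approach the drop-keel values from the later datagrams are discarded, which is why it is only a temporary fix.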

@leewujung update: A more thorough fix is to take in all environment datagrams, set any missing variables to NaN, and then ensure the downstream compute_Sv function can handle these variations.
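One possible shape for the more thorough fix, again assuming each environment datagram is a plain dict. Both `align_environment_datagrams` and `first_valid` are hypothetical names for illustration; how compute_Sv would actually consume the result is not decided here.

```python
import math
from typing import Dict, List


def align_environment_datagrams(datagrams: List[Dict]) -> Dict[str, list]:
    """Keep every datagram; fill variables a datagram lacks with NaN.

    Returns one list per variable, with one entry per datagram.
    """
    # Union of all variable names, in first-seen order.
    keys = []
    for d in datagrams:
        for k in d:
            if k not in keys:
                keys.append(k)
    return {k: [d.get(k, math.nan) for d in datagrams] for k in keys}


def first_valid(values: list) -> float:
    """Return the first non-NaN entry, or NaN if there is none.

    Hypothetical example of how a downstream step could tolerate the gaps.
    """
    for v in values:
        if not (isinstance(v, float) and math.isnan(v)):
            return v
    return math.nan


# Illustrative values only, not taken from a real file.
full = {"temperature": 10.0, "salinity": 35.0, "drop_keel_offset": 0.0}
partial = {"drop_keel_offset": 1.5}
aligned = align_environment_datagrams([full, partial])
temperature = first_valid(aligned["temperature"])
```

Unlike the temporary fix, this keeps the later drop_keel_offset values while leaving the calibration variables from the first datagram recoverable.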