OSOceanAcoustics / echopype

Enabling interoperability and scalability in ocean sonar data analysis
https://echopype.readthedocs.io/
Apache License 2.0
95 stars 73 forks source link

Remove `zarr_combine.py` and duplicate/old ping_time machinery #1122

Closed emiliom closed 1 year ago

emiliom commented 1 year ago

PR #1042 made the use of ZarrCombine obsolete. However, zarr_combine.py and associated code were not removed. As they will no longer be used, they should be removed.

One feature found in zarr_combine.py that nominally would have generic value is the identification of "duplicate" ping_time values and the creation of the associated old_ping_time variable as a Provenance variable:

https://github.com/OSOceanAcoustics/echopype/blob/6b734eb86958c036919216719a8b3c6e1b41c0a1/echopype/convert/set_groups_ek60.py#L84-L129

However, we have found that what we have been identifying as duplicates, found mainly (solely?) in OOI EK60 data, is in fact an artifact of the use of "ms" (millisecond) np.datetime64 unit resolution. As we switch to "ns" (nanosecond) (PR #1117), ping_time values are no longer duplicates.

See https://github.com/OSOceanAcoustics/echopype/pull/1117#issuecomment-1676491081 for a longer discussion.

One could envision a generic use for functions to identify and handle actual duplicate ping_time values. However, we're not aware of such a case once we switch to "ns" resolution. It seems best to remove the current functionality altogether at the same time as we remove the obsolete use of ZarrCombine. Note that as currently implemented, it's limited to EK60, so it's not completely generic.

leewujung commented 1 year ago

This is addressed in #1117 .