OSOceanAcoustics / echopype

Enabling interoperability and scalability in ocean sonar data analysis
https://echopype.readthedocs.io/
Apache License 2.0
89 stars 70 forks source link

Fix time encoding to use int64 and "nanoseconds since 1970-01-01 00:00:00Z" #1299

Closed ctuguinay closed 2 months ago

ctuguinay commented 2 months ago

This PR addresses the not faithful time serialization warning from issue #1290.

codecov-commenter commented 2 months ago

Codecov Report

Attention: Patch coverage is 90.00000% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 74.57%. Comparing base (9f56124) to head (900ae7d). Report is 44 commits behind head on main.

Files Patch % Lines
echopype/utils/coding.py 80.00% 1 Missing :warning:
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #1299 +/- ## ========================================== - Coverage 83.52% 74.57% -8.95% ========================================== Files 64 23 -41 Lines 5686 3182 -2504 ========================================== - Hits 4749 2373 -2376 + Misses 937 809 -128 ``` | [Flag](https://app.codecov.io/gh/OSOceanAcoustics/echopype/pull/1299/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=OSOceanAcoustics) | Coverage Δ | | |---|---|---| | [unittests](https://app.codecov.io/gh/OSOceanAcoustics/echopype/pull/1299/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=OSOceanAcoustics) | `74.57% <90.00%> (-8.95%)` | :arrow_down: | Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=OSOceanAcoustics#carryforward-flags-in-the-pull-request-comment) to find out more.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

ctuguinay commented 2 months ago

Explanation of the current warning (from slack conversation)

@leewujung

Here's the current order of events in the code that is causing this warning: (https://github.com/pydata/xarray/blob/1d43672574332615f225089d69f95a9f8d81d912/xarray/coding/times.py#L723C1-L803C32):

  1. The computation converts nanosecond resolution time to int64.

  2. Computes needed time time_delta (represents the time unit granularity based on the specified units), and needed_time_delta (represents the minimum unit granularity required for lossless integer encoding of the dates).

  3. (Optional) If time_delta > needed_time_delta, and we specify None dtype encoding we get the not faithful serialization message, since with integer bit 64 encoding, we will lose precision.

  4. The operation to finalize the encoding is then done in int64.

  5. (Optional) Then, if we actually specify dtype, we cast to dtype (if numpy deems safe).

Now, I am not quite sure at this point why float64 should silence this warning: The casting to float64 is a bit of a 'moot point', since the casting operation is done post-encoding computation, which we know will result in lossful integer encoding. This is probably why None and float64 encoding result in the same values. The float64 dtype does practically nothing to change the encoded array, since it is done at the very last step. Additionally, the time_delta > needed_time_delta also is why we lose precision even when there is no warning (in the case where we set dtype set to float64).

Perhaps this is an issue I bring up in xarray? This still doesn't make sense to me, but I do think it'd be good to silence this error as soon as possible since it can be very disheartening seeing so many time serialization warnings on your first use of Echopype.

On our own end, do you think adding a warning in the logger for this would be a good idea? That way those users that have logger verbose turned on can still get the message.

ctuguinay commented 2 months ago

@leewujung Made that change. This should be ready to merge now