OSOceanAcoustics / echopype

Enabling interoperability and scalability in ocean sonar data analysis
https://echopype.readthedocs.io/
Apache License 2.0
94 stars 73 forks source link

fix(commongrid): fix bugs and improve `compute_NASC` using flox #1167

Closed leewujung closed 9 months ago

leewujung commented 11 months ago

Overview

@simedroniraluca found out about bugs in the previous commongrid.compute_NASC function (see #1150 for discussions), so we temporarily removed it before releasing 0.8.0 (#1136). This PR:

  1. fix the bugs in the previous implementation
  2. improve performance using flox

Tasks

Notes

@lsetiawan: The tests will fail, because the compute_NASC function is not complete (I have not add in channel in the flox statement), but I think the changes give you enough to work with to finish the whole thing. Feel free to direct push to my branch.

I pulled in your changes in #1124 so that I can reuse some functions, so the commit messages and files changed tab are a mess right now, but obviously we'll pull in dev once #1124 is merged to clean this up. All my changes in this PR are only in commongrid/api.py.

For the last item on the task list: I will see if it's possible to get NOAA colleagues' help to generate from test data from Echoview, since I don't have a dongle/license to do it myself.

codecov-commenter commented 11 months ago

Codecov Report

Merging #1167 (f7677af) into dev (ce1e614) will increase coverage by 14.93%. Report is 2 commits behind head on dev. The diff coverage is 96.35%.

@@             Coverage Diff             @@
##              dev    #1167       +/-   ##
===========================================
+ Coverage   81.89%   96.82%   +14.93%     
===========================================
  Files          67        3       -64     
  Lines        5887      189     -5698     
===========================================
- Hits         4821      183     -4638     
+ Misses       1066        6     -1060     
Flag Coverage Δ
unittests 96.82% <96.35%> (+14.93%) :arrow_up:

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
echopype/commongrid/__init__.py 100.00% <100.00%> (ø)
echopype/commongrid/api.py 97.70% <97.22%> (+1.11%) :arrow_up:
echopype/commongrid/utils.py 96.00% <96.00%> (ø)

... and 63 files with indirect coverage changes

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

leewujung commented 11 months ago

@lsetiawan : when testing a prototype of this I ran into the following FutureWarning. Would be good to take care of it here:

FutureWarning: Series.fillna with 'method' is deprecated and will raise in a future version. Use obj.ffill() or obj.bfill() instead.
  df_pos["dist"] = df_pos["dist"].fillna(method="ffill").fillna(method="bfill")

This is in the function nasc.py::_get_distance_from_latlon

lsetiawan commented 11 months ago

https://github.com/OSOceanAcoustics/echopype/pull/1167#issuecomment-1732651809 Fixed in https://github.com/OSOceanAcoustics/echopype/pull/1167/commits/7661fee8ec543eeece3b926d368687787631c692

lsetiawan commented 10 months ago

This should now be ready for 2nd review:

I like the creation of get_x_along_channels! I would suggest changing the triaging to be explicit based on x: x="MVBS" or x="NASC", which then selects x_var under the hood to be ping_time or distance_nmi. The current triaging works but can be pretty confusing in the future.

I've broken this up to _groupby_x_along_channels, compute_raw_MVBS, and compute_raw_NASC to be more explicit for each step. Additionally, now for position variables propagation, I added a _get_reduced_positions function that will add it or not, so this is more visible, rather than hidden.

except for get_distance_from_latlon, everything in nasc.py are no longer used. How creating a utils.py or helper.py and move all functions other than compute_MVBS, compute_MVBS_index_binning, compute_NASC, and regrid there?

Agreed. Now all the helper functions have been moved.

remove _set_MVBS_attrs and use _set_var_attrs directly here

Since the _set_MVBS_attrs is used twice in compute MVBS and the MVBS binning index thing, I think I'm gonna leave this alone to not repeat the 2 steps in there.

remove test_nasc.py since it's now empty :P

Done.

leewujung commented 9 months ago

@lsetiawan : Thanks for the changes, they look great! I love how clean the code is now. I just added the processing level decorator for L4 to compute_NASC. Once the tests all run through I'll merge this PR. Super excited!

Since the _set_MVBS_attrs is used twice in compute MVBS and the MVBS binning index thing, I think I'm gonna leave this alone to not repeat the 2 steps in there.

I agree with this choice.