Closed ctuguinay closed 4 months ago
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 100.00%. Comparing base (
9f56124
) to head (fac7dfc
). Report is 51 commits behind head on main.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
@leewujung I might think of more stuff to change over the weekend, but I think this PR is at a good place now and is ready for review. I also added np.digitize
for upsampling the impulse noise array.
Hey @ctuguinay : Thanks for the PR! Looking at the paper again, I feel that the none of the operations is really so locked in with the requirement of finding the accurate depth (or ping time) bin edges, which is the source of the performance bottleneck, especially the TN one as you pointed out.
All of the criteria are built on the concept of comparing with neighbors, and the binning from accurate bin edges (from the coordinate labels) vs inaccurate bin edges (from just the indices) would likely result in pretty small differences. If the noise needs to be masked, it would get masked either way; if the noise is borderline, the masking is likely be at least equally sensitive to other independent parameter changes (such as changing the averaging bin size along depth or across pings).
What do you think about keeping what you have, but implement a "fast" version that is based on indices? That can be another PR so we can get this merged first?
I just went through the tests and they look in good shape, I just have some small comments. One thing I didn't find was that for TN, there isn't an equivalent for testing the output mask like for IN and AS. It would be good to have that, to confirm that the function works as intended. if computation speed is an issue, you can "crop" the test dataset to only a subset of pings that go across a section containing transient noise, so that it'll run much faster.
@leewujung Thanks for the review! This should be ready to look at again. The only suggestion I deviated a bit from was the suggestion for upsampling using reindex
, since range_sample
assignment is not uniform across channel and ping time (since depth and depth bins aren't either), I had to reindex per channel, per ping time.
I also created issue #1321 for the simple index version of these functions.
Addresses #1314