OSOceanAcoustics / echopype

Enabling interoperability and scalability in ocean sonar data analysis
https://echopype.readthedocs.io/
Apache License 2.0

Add function to regrid Sv data #726

Open leewujung opened 2 years ago

leewujung commented 2 years ago

Regridding Sv data is a pretty common need, since different channels (frequencies) may be configured with different sample_interval values even when the total range collected is the same.
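
To make the mismatch concrete, here is a minimal sketch, assuming the simplified two-way travel time relation r = c * t / 2 and ignoring any instrument-specific corrections. The helper name range_grid and all numbers are hypothetical, chosen only for illustration.

```python
def range_grid(sample_interval_s, n_samples, sound_speed=1500.0):
    """Approximate range (m) of each sample, using r = c * t / 2.

    Simplified on purpose: real range computation may include
    instrument-specific offsets that are not modeled here.
    """
    return [i * sample_interval_s * sound_speed / 2 for i in range(n_samples)]


# Two hypothetical channels covering the same total range (about 1.5 m here)
# but sampled at different intervals.
range_ch1 = range_grid(sample_interval_s=2e-4, n_samples=11)  # 0.15 m spacing
range_ch2 = range_grid(sample_interval_s=4e-4, n_samples=6)   # 0.30 m spacing

print(len(range_ch1), len(range_ch2))  # different sizes along range_sample
print(range_ch1[-1], range_ch2[-1])    # but roughly the same maximum range
```

The two channels end at about the same range but have a different number of samples, so their Sv arrays cannot be compared sample by sample without regridding.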

Proposal: add this to the commongrid subpackage.

leewujung commented 1 year ago

@leewujung will add a link here to a ref implementation from IMR.

lsetiawan commented 1 year ago

This should be optional... potentially use: https://xesmf.readthedocs.io/en/latest/index.html

leewujung commented 1 year ago

I found my notes from before. Here are two resources, one from IMR and the other from pyEcholab.

IMR:

pyEcholab:

lsetiawan commented 12 months ago

@leewujung Could you point us to test data for this?

leewujung commented 12 months ago

I haven't generated any datasets for this. I think the way to go would be to generate outputs from the IMR functions and test against them. The input would be Sv with potentially irregular spacing in ping time and depth across channels, and the output would be Sv gridded to the same ping time and depth grid across all channels.

anantmittal commented 11 months ago

Explore the mock_Sv_dataset_irregular and mock_Sv_dataset_regular fixtures in echopype/tests/commongrid/conftest.py

anantmittal commented 11 months ago

@leewujung How should we decide the spacing in ping time for regridding?

Assuming the regrid function takes ds_Sv, range_bin, and ping_time_bin as arguments.

leewujung commented 11 months ago

@anantmittal : Not sure if I understand the question, but I think in terms of uses, it may be the most useful if users specify ds_Sv, range_wanted, ping_time_wanted as input arguments, and get ds_Sv_out with range_wanted and/or ping_time_wanted as the coordinates across all channels.

One common use case is when different channels have different sizes along the range_sample dimension, and a user wants to align them all according to one of the channels. In that case, the call would look something like:

ds_Sv_out = ep.consolidate.regrid(
    ds_Sv=ds_Sv,
    # use the range of the first ping in the first channel
    range_wanted=ds_Sv["range"].isel(channel=0, ping_time=0),
)
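
A minimal sketch of what regridding one ping along range could look like, written in plain Python so it runs without dependencies. Everything here is an assumption made for illustration: the names interp_linear and regrid_range are hypothetical, and plain linear interpolation is a stand-in technique, not the method used by the IMR or pyEcholab references. The one point it tries to show is that Sv is in dB, so it is converted to the linear domain before interpolating and back to dB afterwards.

```python
import math
from bisect import bisect_right


def interp_linear(x_new, x, y):
    """Piecewise-linear interpolation of y(x) at x_new.

    x must be sorted ascending with at least two points.
    Points outside [x[0], x[-1]] are returned as NaN.
    """
    out = []
    for xn in x_new:
        if xn < x[0] or xn > x[-1]:
            out.append(math.nan)
            continue
        j = min(bisect_right(x, xn), len(x) - 1)  # index of the right neighbor
        w = (xn - x[j - 1]) / (x[j] - x[j - 1])   # weight of the right neighbor
        out.append((1 - w) * y[j - 1] + w * y[j])
    return out


def regrid_range(sv_db, range_m, range_wanted):
    """Regrid one ping of Sv (dB) onto range_wanted (hypothetical helper)."""
    sv_linear = [10 ** (v / 10) for v in sv_db]                   # dB -> linear
    out_linear = interp_linear(range_wanted, range_m, sv_linear)
    return [math.nan if math.isnan(v) else 10 * math.log10(v)     # linear -> dB
            for v in out_linear]


# Made-up ping: three samples at 0, 1 and 2 m.
range_m = [0.0, 1.0, 2.0]
sv_db = [-60.0, -60.0, -40.0]
range_wanted = [0.0, 0.5, 1.5, 3.0]  # 3.0 m lies outside the data
sv_out = regrid_range(sv_db, range_m, range_wanted)
print(sv_out)
```

The value at 1.5 m comes out well above -50 dB, the midpoint one would get by interpolating the dB values directly, because the stronger sample dominates in the linear domain.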

A couple notes:

anantmittal commented 11 months ago

In the above case ping_time_wanted is not specified. It is pretty common for all channels to ping on the same time base, but more recent instruments are much more flexible, so this may not be true. In that case, users may want to specify that as well, to get ds_Sv_out aligned on both range_wanted and ping_time_wanted.

@leewujung: Could you explain this comment a bit more? It's unclear how regridding/interpolation should work when both range_wanted and ping_time_wanted are supplied to the regrid function.

#1241 is my stab at an implementation for the case when ping_time_wanted is None.
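
One possible reading of the case where both range_wanted and ping_time_wanted are supplied is a separable, two-pass regrid: first bring every ping onto range_wanted, then bring every range bin onto ping_time_wanted. The sketch below is only that, a sketch under stated assumptions: the names interp_1d and regrid_2d are hypothetical, ping times are plain float seconds rather than datetimes, it handles one channel at a time, and linear interpolation in the linear domain is a stand-in for whatever method the reference implementations actually use.

```python
import math
from bisect import bisect_right


def interp_1d(x_new, x, y):
    """Linear interpolation of y(x) at x_new; NaN outside the span of x."""
    out = []
    for xn in x_new:
        if not x[0] <= xn <= x[-1]:
            out.append(math.nan)
            continue
        j = min(bisect_right(x, xn), len(x) - 1)
        w = (xn - x[j - 1]) / (x[j] - x[j - 1])
        out.append((1 - w) * y[j - 1] + w * y[j])
    return out


def regrid_2d(sv_db, ping_time, range_m, ping_time_wanted, range_wanted):
    """Two-pass regrid of one channel; sv_db[i][k] is Sv (dB) at ping i, sample k."""
    linear = [[10 ** (v / 10) for v in ping] for ping in sv_db]
    # Pass 1: every ping onto range_wanted.
    by_range = [interp_1d(range_wanted, range_m, ping) for ping in linear]
    # Pass 2: every range bin onto ping_time_wanted.
    columns = [
        interp_1d(ping_time_wanted, ping_time, [ping[k] for ping in by_range])
        for k in range(len(range_wanted))
    ]
    # Transpose back to [ping][range] and convert to dB (NaN stays NaN).
    return [
        [10 * math.log10(columns[k][i]) if columns[k][i] > 0 else math.nan
         for k in range(len(range_wanted))]
        for i in range(len(ping_time_wanted))
    ]


# Made-up channel: two pings at 0 s and 2 s, two samples at 0 m and 1 m.
sv_a = [[-60.0, -60.0], [-40.0, -40.0]]
out = regrid_2d(
    sv_a,
    ping_time=[0.0, 2.0],
    range_m=[0.0, 1.0],
    ping_time_wanted=[0.0, 1.0, 2.0],
    range_wanted=[0.0, 0.5, 1.0],
)
print(len(out), len(out[0]))  # 3 pings by 3 range bins
```

Calling the same function on every channel with the same ping_time_wanted and range_wanted would put all channels on one grid. Whether interpolation or some form of weighted averaging is the better choice for Sv is a separate question that this sketch does not settle.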

leewujung commented 10 months ago

@anantmittal @lsetiawan : The following paragraph may be useful for understanding the process. It is from Sec. 2.2.2 in this paper, and it also points to the specific approach they use and the references in iris.

[Screenshot, 2023-12-16: the paragraph from the paper]

lsetiawan commented 9 months ago

xESMF Regrid Methods explanations: http://earthsystemmodeling.org/regrid/#regridding-methods

Actual algorithms in the xESMF Python library: https://xesmf.readthedocs.io/en/latest/notebooks/Compare_algorithms.html

Tutorial on xarray regridding with xESMF: https://xesmf.readthedocs.io/en/latest/notebooks/Dataset.html