OceanStreamIO / oceanstream

Oceanstream is a Python library which can be used as a CLI tool to process raw acoustic data from echosounders. It uses echopype as a backend. Developed at @pineviewlabs
https://oceanstream.io
MIT License

AttributeError when using the fielding method on dataset Sv #82

Closed simedroniraluca closed 10 months ago

simedroniraluca commented 10 months ago

Bug 1

Description:

The fielding method fails on the Sv dataset computed from ed_ek_60_for_Sv (defined in oceanstream's conftest.py). With that Sv dataset assigned to source_Sv, the following code:

FIELDING_DEFAULT_PARAMS = {
    "r0": 200,
    "r1": 1000,
    "n": 5,
    "thr": [2, 0],
    "roff": 250,
    "jumps": 5,
    "maxts": -35,
    "start": 0,
}
mask_fielding = noise_masks.create_transient_mask(
    source_Sv, parameters=FIELDING_DEFAULT_PARAMS, method="fielding"
)

Produces the following error:

AttributeError: 'tuple' object has no attribute 'coords'

The traceback points to the function create_transient_mask in noise_masks.py and further to the function get_transient_noise_mask_multichannel in the echopype package.

Steps to reproduce:

  1. Load the dataset ed_ek_60_for_Sv from oceanstream's conftest.py.
  2. Compute Sv from ed_ek_60_for_Sv.
  3. Use the resulting dataset as source_Sv.
  4. Run the code snippet above.

Expected behavior: The mask should be created without any errors.

Actual behavior: An AttributeError is raised indicating that a tuple object does not have the attribute 'coords'.

simedroniraluca commented 10 months ago

Bug 2

Possibly similar bug

Description:

When attempting to use the create_noise_masks_rapidkrill method on the dataset Sv that was generated based on the dataset ed_ek_80_for_Sv (referenced in oceanstream conftest.py) and subsequently extended using sv_dataset_extension.enrich_sv_dataset, an AttributeError is raised.

Code to Reproduce:

sv_80_ds = sv_computation.compute_sv(ed_ek_80_for_Sv, waveform_mode="CW", encode_mode="complex")
extended_sv_80_ds = sv_dataset_extension.enrich_sv_dataset(sv=sv_80_ds,
                                                           echodata=ed_ek_60_for_Sv, 
                                                           waveform_mode="CW", 
                                                           encode_mode="complex")

rapidkrill_ek80_extended_ds = noise_masks.create_noise_masks_rapidkrill(extended_sv_80_ds)

Expected Behavior:

The create_noise_masks_rapidkrill method should process the extended dataset without any errors.

Actual Behavior:

The following error is raised:

AttributeError: 'numpy.ndarray' object has no attribute 'coords'

Error Traceback:

/Users/simedroniraluca/Documents/pineview/oceanstream/.venv/lib/python3.9/site-packages/echopype/utils/mask_transformation.py:285: RuntimeWarning: invalid value encountered in divide
  datar[i, :] = np.nansum(d * w_, axis=0) / np.nansum(w_, axis=0)
...
File ~/Documents/pineview/oceanstream/oceanstream/L2_calibrated_data/noise_masks.py:126, in create_seabed_mask(Sv, parameters, method)
...
File ~/Documents/pineview/oceanstream/.venv/lib/python3.9/site-packages/echopype/mask/api.py:548, in create_multichannel_mask(masks, channels)
    546 for i in range(0, len(masks)):
    547     mask = masks[i]
--> 548     if "channel" in mask.coords:
    549         masks[i] = mask.isel(channel=0)
    550 result = xr.concat(
    551     masks, Index(channels, name="channel"), data_vars="all", coords="all", join="exact"
    552 )

AttributeError: 'numpy.ndarray' object has no attribute 'coords'

Additional Notes:

Both this bug and Bug 1 above involve an AttributeError caused by a missing coords attribute, which suggests an underlying problem with the data structures used or returned by the echopype package.
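
One way to surface this class of failure earlier is to check each mask before it reaches code that reads mask.coords. The sketch below is only an illustration: find_masks_without_coords is a hypothetical helper, not part of oceanstream or echopype, and it uses plain Python stand-ins (a SimpleNamespace in place of an xarray.DataArray, a bare tuple like the value seen in Bug 1) so the pattern is visible without loading any data.

```python
from types import SimpleNamespace


def find_masks_without_coords(masks):
    """Return (index, type name) for every mask lacking a ``coords`` attribute.

    Hypothetical diagnostic helper, not part of oceanstream or echopype.
    """
    return [
        (i, type(mask).__name__)
        for i, mask in enumerate(masks)
        if not hasattr(mask, "coords")
    ]


# Stand-ins (assumptions for illustration only):
# - an object exposing ``coords``, as an xarray.DataArray would
# - a bare tuple of arrays, like the value returned in Bug 1
labelled_mask = SimpleNamespace(coords={"ping_time": [], "range_sample": []})
tuple_mask = ([False, False], [False, False])

problems = find_masks_without_coords([labelled_mask, tuple_mask])
print(problems)
```

Reporting the index and type of the offending entry points to the per-channel mask that came back in the wrong form, which the bare AttributeError does not.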

simedroniraluca commented 10 months ago

Bug 1

Problem Description

In the initial version of the _fielding function, the return types were inconsistent. When the search range was outside the echosounder range, the function returned a tuple of numpy arrays (mask, mask_). In all other cases, it returned an xarray.DataArray.

This inconsistency in return types could lead to errors when the function's output was used in subsequent operations that expected a consistent data type.

Solution

To address this inconsistency, the following modifications were made:

  1. Unified Return Type: Regardless of the condition, the function now always returns an xarray.DataArray. This ensures that any subsequent operations on the function's output can consistently expect an xarray.DataArray.

  2. User Warning: When the search range is outside the echosounder range, a warning is issued to inform the user that a default mask with all False values is being returned. This default mask does not mask any data points in the dataset, so the user needs to be aware of this behavior.

# Case where the searching range is outside the echosounder range:
if (r0 > r[-1]) or (r1 < r[0]):
    mask = np.zeros_like(Sv, dtype=bool)
    mask_ = np.zeros_like(Sv, dtype=bool)

    # Issue a warning to inform the user
    warnings.warn(
        "The searching range is outside the echosounder range. "
        "A default mask with all False values is returned, "
        "which won't mask any data points in the dataset."
    )

    # Combine both all-False masks into a single labelled DataArray
    combined_mask = np.logical_or(mask, mask_)
    return xr.DataArray(
        combined_mask,
        dims=("ping_time", "range_sample"),
        coords={"ping_time": source_Sv.ping_time, "range_sample": source_Sv.range_sample},
    )
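
A regression check for this fix could assert two things: that the out-of-range branch issues a warning, and that its return type matches the in-range branch. The sketch below shows that pattern with the standard warnings module. fielding_like and LabelledMask are hypothetical stand-ins for _fielding and xarray.DataArray, used so the example runs without echopype; the in-range branch is a placeholder, not the Fielding algorithm.

```python
import warnings
from dataclasses import dataclass


@dataclass
class LabelledMask:
    """Hypothetical stand-in for xarray.DataArray: values plus coords."""
    values: list
    coords: dict


def fielding_like(r, n_pings, r0, r1):
    """Hypothetical stand-in for _fielding; always returns a LabelledMask."""
    # Default mask: all False, one row per ping, one column per range sample.
    values = [[False] * len(r) for _ in range(n_pings)]
    if r0 > r[-1] or r1 < r[0]:
        warnings.warn(
            "Search range is outside the echosounder range; "
            "returning an all-False mask."
        )
    # (A real in-range branch would fill in `values` here.)
    coords = {
        "ping_time": list(range(n_pings)),
        "range_sample": list(range(len(r))),
    }
    return LabelledMask(values, coords)


r = [0.0, 50.0, 100.0]  # sample ranges, deliberately shorter than r0=200

# Record warnings instead of printing them, so they can be asserted on.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    out_of_range = fielding_like(r, n_pings=2, r0=200, r1=1000)

in_range = fielding_like(r, n_pings=2, r0=10, r1=80)
```

Because both branches return the same type, downstream code that reads coords behaves the same whether or not the search range overlaps the data.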