lifewatch / pypam

Python Passive Acoustic Analysis tool for Passive Acoustic Monitoring (PAM)
GNU General Public License v3.0

acoustic survey not completing #25

Closed ryjombari closed 1 year ago

ryjombari commented 1 year ago

Working with Carlos to incrementally test pypam processing of hybrid millidecade spectra, I set up a directory containing twenty 1-minute WAV files. This lets us check that both the input segmentation and the hybrid millidecade output are correct. I can run the acoustic survey and watch it progress through all 20 files, but processing then fails with a ValueError instead of completing. The main pypam session output is below.

In [9]: wav_dir = '/Volumes/PAM_Analysis/pypam-space/InputMinutes'   # (20 1-minute files)
   ...: asa = acoustic_survey.ASA(hydrophone=hphone, folder_path=wav_dir, nfft=nfft, binsize=binsize, bin_overlap=bin_overlap, fft_overlap=fft_overlap)
   ...: hm = asa.hybrid_millidecade_bands(db=True, method=method, band=band)
   ...: 
  0%|                                                                                                                                            | 0/20 [00:00<?, ?it/s]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_000000.wav
  5%|██████▌                                                                                                                             | 1/20 [00:01<00:23,  1.23s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_000100.wav
 10%|█████████████▏                                                                                                                      | 2/20 [00:02<00:21,  1.19s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_000200.wav
 15%|███████████████████▊                                                                                                                | 3/20 [00:03<00:20,  1.18s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_000300.wav
 20%|██████████████████████████▍                                                                                                         | 4/20 [00:04<00:18,  1.17s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_000400.wav
 25%|█████████████████████████████████                                                                                                   | 5/20 [00:05<00:17,  1.17s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_000500.wav
 30%|███████████████████████████████████████▌                                                                                            | 6/20 [00:07<00:16,  1.17s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_000600.wav
 35%|██████████████████████████████████████████████▏                                                                                     | 7/20 [00:08<00:15,  1.17s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_000700.wav
 40%|████████████████████████████████████████████████████▊                                                                               | 8/20 [00:09<00:14,  1.17s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_000800.wav
 45%|███████████████████████████████████████████████████████████▍                                                                        | 9/20 [00:10<00:12,  1.17s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_000900.wav
 50%|█████████████████████████████████████████████████████████████████▌                                                                 | 10/20 [00:11<00:11,  1.17s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_001000.wav
 55%|████████████████████████████████████████████████████████████████████████                                                           | 11/20 [00:12<00:10,  1.17s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_001100.wav
 60%|██████████████████████████████████████████████████████████████████████████████▌                                                    | 12/20 [00:14<00:09,  1.17s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_001200.wav
 65%|█████████████████████████████████████████████████████████████████████████████████████▏                                             | 13/20 [00:15<00:08,  1.17s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_001300.wav
 70%|███████████████████████████████████████████████████████████████████████████████████████████▋                                       | 14/20 [00:16<00:07,  1.17s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_001400.wav
 75%|██████████████████████████████████████████████████████████████████████████████████████████████████▎                                | 15/20 [00:17<00:05,  1.17s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_001500.wav
 80%|████████████████████████████████████████████████████████████████████████████████████████████████████████▊                          | 16/20 [00:18<00:04,  1.17s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_001600.wav
 85%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████▎                   | 17/20 [00:19<00:03,  1.17s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_001700.wav
 90%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▉             | 18/20 [00:21<00:02,  1.18s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_001800.wav
 95%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍      | 19/20 [00:22<00:01,  1.18s/it]/Volumes/PAM_Analysis/pypam-space/InputMinutes/MARS_20220902_001900.wav
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:23<00:00,  1.18s/it]
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[9], line 3
      1 wav_dir = '/Volumes/PAM_Analysis/pypam-space/InputMinutes'   # (20 1-minute files)
      2 asa = acoustic_survey.ASA(hydrophone=hphone, folder_path=wav_dir, nfft=nfft, binsize=binsize, bin_overlap=bin_overlap, fft_overlap=fft_overlap)
----> 3 hm = asa.hybrid_millidecade_bands(db=True, method=method, band=band)

File ~/PAM/tutorial-env/lib/python3.10/site-packages/pypam/acoustic_survey.py:346, in ASA.hybrid_millidecade_bands(self, db, method, band, percentiles)
    344 bands_limits, bands_c = utils.get_hybrid_millidecade_limits(band=band, nfft=self.nfft)
    345 fft_bin_width = spectra_ds.attrs['fs'] / self.nfft
--> 346 milli_spectra = utils.spectra_ds_to_bands(spectra_ds['band_%s' % method],
    347                                       bands_limits, bands_c, fft_bin_width=fft_bin_width, db=db)
    349 # Add the millidecade
    350 spectra_ds['millidecade_bands'] = milli_spectra

File ~/PAM/tutorial-env/lib/python3.10/site-packages/pypam/utils.py:425, in spectra_ds_to_bands(psd, bands_limits, bands_c, fft_bin_width, db)
    423 # Bin the bands and add the borders
    424 psd_without_borders = psd.drop_isel(frequency=fft_freq_indices)
--> 425 psd_bands = psd_without_borders.groupby_bins('frequency', bins=bands_limits, labels=bands_c, right=False).sum()
    426 psd_bands = psd_bands.fillna(0)
    427 psd_bands = psd_bands + psd_limits_lower.values + psd_limits_upper.values

File ~/PAM/tutorial-env/lib/python3.10/site-packages/xarray/core/dataarray.py:6354, in DataArray.groupby_bins(self, group, bins, right, labels, precision, include_lowest, squeeze, restore_coord_dims)
   6297 """Returns a DataArrayGroupBy object for performing grouped operations.
   6298 
   6299 Rather than using all unique values of `group`, the values are discretized
   (...)
   6350 .. [1] http://pandas.pydata.org/pandas-docs/stable/generated/pandas.cut.html
   6351 """
   6352 from xarray.core.groupby import DataArrayGroupBy
-> 6354 return DataArrayGroupBy(
   6355     self,
   6356     group,
   6357     squeeze=squeeze,
   6358     bins=bins,
   6359     restore_coord_dims=restore_coord_dims,
   6360     cut_kwargs={
   6361         "right": right,
   6362         "labels": labels,
   6363         "precision": precision,
   6364         "include_lowest": include_lowest,
   6365     },
   6366 )

File ~/PAM/tutorial-env/lib/python3.10/site-packages/xarray/core/groupby.py:404, in GroupBy.__init__(self, obj, group, squeeze, grouper, bins, restore_coord_dims, cut_kwargs)
    402 if duck_array_ops.isnull(bins).all():
    403     raise ValueError("All bin edges are NaN.")
--> 404 binned, bins = pd.cut(group.values, bins, **cut_kwargs, retbins=True)
    405 new_dim_name = str(group.name) + "_bins"
    406 group = DataArray(binned, getattr(group, "coords", None), name=new_dim_name)

File ~/PAM/tutorial-env/lib/python3.10/site-packages/pandas/core/reshape/tile.py:293, in cut(x, bins, right, labels, retbins, precision, include_lowest, duplicates, ordered)
    290     if (np.diff(bins.astype("float64")) < 0).any():
    291         raise ValueError("bins must increase monotonically.")
--> 293 fac, bins = _bins_to_cuts(
    294     x,
    295     bins,
    296     right=right,
    297     labels=labels,
    298     precision=precision,
    299     include_lowest=include_lowest,
    300     dtype=dtype,
    301     duplicates=duplicates,
    302     ordered=ordered,
    303 )
    305 return _postprocess_for_cut(fac, bins, retbins, dtype, original)

File ~/PAM/tutorial-env/lib/python3.10/site-packages/pandas/core/reshape/tile.py:420, in _bins_to_cuts(x, bins, right, labels, precision, include_lowest, dtype, duplicates, ordered)
    418 if len(unique_bins) < len(bins) and len(bins) != 2:
    419     if duplicates == "raise":
--> 420         raise ValueError(
    421             f"Bin edges must be unique: {repr(bins)}.\n"
    422             f"You can drop duplicate edges by setting the 'duplicates' kwarg"
    423         )
    424     else:
    425         bins = unique_bins

ValueError: Bin edges must be unique: array([-3.90625000e-01,  3.90625000e-01,  1.17187500e+00, ...,
        9.97700064e+04,  1.00000000e+05,  1.00000000e+05]).
You can drop duplicate edges by setting the 'duplicates' kwarg
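
The final ValueError can be reproduced with pandas alone, which may help narrow things down. The sketch below is not pypam code: the edge values are a shortened, made-up stand-in for the array printed in the traceback (which ends with two identical edges at 1.0e+05), and `cut_with_unique_edges` is a hypothetical helper. Note that inside pypam the bins are paired with labels (`bands_c`), so dropping a repeated edge there would also require dropping the matching label; that is not handled here.

```python
import numpy as np
import pandas as pd


def cut_with_unique_edges(values, edges):
    """Bin values after removing repeated bin edges (hypothetical helper)."""
    unique_edges = np.unique(edges)  # sorted, with duplicates removed
    return pd.cut(values, bins=unique_edges, right=False)


# Made-up frequencies (Hz) and bin edges; the last edge is repeated,
# mimicking the tail of the array shown in the traceback.
freqs = np.array([5.0, 50.0, 99_900.0])
edges = np.array([0.0, 10.0, 99_770.0064, 100_000.0, 100_000.0])

try:
    pd.cut(freqs, bins=edges, right=False)
    raised = False
except ValueError as err:
    # pandas rejects repeated edges unless told to drop them
    raised = "Bin edges must be unique" in str(err)

binned = cut_with_unique_edges(freqs, edges)
```

`pd.cut` also accepts `duplicates="drop"`, which is what the error message hints at, but the xarray `groupby_bins` call shown in the traceback does not pass that argument through.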

ryjombari commented 1 year ago

I found what caused the error: specifying band = [10, 100000] instead of [0, 128000]. I had assumed that specifying a band within the full range of the PSD calculation would simply subset the output. Since that subsetting can be done after the computation completes, this is not a problem for me, unless you think the error should not occur when the band is a subset of the full bandwidth of the calculation (0 to Nyquist).
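
For anyone hitting the same thing, the workaround described above (compute over the full band, then subset) might look roughly like this. It is a minimal sketch on synthetic NumPy arrays, not on real pypam output; `subset_band` and the sample values are made up for illustration.

```python
import numpy as np


def subset_band(freqs, psd, band):
    """Keep only the bins whose center frequency lies within band = (low, high)."""
    low, high = band
    mask = (freqs >= low) & (freqs <= high)
    return freqs[mask], psd[mask]


# Synthetic stand-in for band center frequencies (Hz) and levels (dB),
# as if computed over the full 0 to Nyquist range.
freqs = np.array([1.0, 10.0, 1_000.0, 100_000.0, 127_000.0])
psd = np.array([80.0, 75.0, 60.0, 50.0, 45.0])

# Subset to 10 Hz .. 100 kHz after the computation has finished.
sub_freqs, sub_psd = subset_band(freqs, psd, band=(10, 100_000))
```

On an xarray object the equivalent would normally be a `.sel(...)` with a `slice` over the frequency coordinate; the exact coordinate name in the pypam output is not confirmed here.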