Open · NickleDave opened this issue 3 months ago
Probably the better way to do this from first principles is to clip the audio (https://github.com/vocalpy/vocalpy/issues/149) in such a way that we get the target duration in seconds--while keeping all classes present in the dataset--and then let the spectrogram code do whatever it wants.
A question here is whether we want to copy the audio to the prepared dataset. Especially if we clip it, I would want to save the clipped audio along with metadata about the source audio that produced the clip. The trade-off is that this increases the size of the dataset. So we should probably make this an option specific to learning curves, and not do it by default.
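A minimal sketch of what that clipping step could look like, assuming we already have a list of audio files with their durations and the set of classes annotated in each. The helper names, the dict format, and the greedy strategy here are all hypothetical, not vak's or vocalpy's actual API:

```python
import dataclasses


@dataclasses.dataclass
class Clip:
    """A clip taken from a source audio file, with bookkeeping metadata."""
    source_path: str
    start_s: float
    stop_s: float


def clip_audio_keep_classes(files, target_dur_s):
    """Greedily pick (and if needed clip) audio files until we reach
    ``target_dur_s`` seconds, while keeping every class in the dataset.

    ``files`` is a list of dicts with keys "path", "dur_s", and "classes"
    (the set of label classes annotated in that file) -- an assumed format,
    not vak's actual metadata schema.
    """
    all_classes = set().union(*(f["classes"] for f in files))
    clips, covered, total_dur = [], set(), 0.0

    # first pass: make sure every class appears at least once
    for f in sorted(files, key=lambda f: f["dur_s"]):
        if not f["classes"] - covered:
            continue  # adds no new classes
        clips.append(Clip(f["path"], 0.0, f["dur_s"]))
        covered |= f["classes"]
        total_dur += f["dur_s"]
        if covered == all_classes:
            break

    # second pass: top up to the target duration, clipping the last file taken
    for f in files:
        if total_dur >= target_dur_s:
            break
        if any(c.source_path == f["path"] for c in clips):
            continue
        take = min(f["dur_s"], target_dur_s - total_dur)
        clips.append(Clip(f["path"], 0.0, take))
        total_dur += take

    return clips, total_dur
```

In a scheme like this, each `Clip` records which source file it came from, which is exactly the metadata we'd want to save alongside any clipped audio that gets copied into the prepared dataset.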
When we make splits for learning curves now, we do not clip the vectors we use for bookkeeping to a consistent length, i.e., we do not train on fixed durations.
This logic lived on the WindowDataset class in version 0.x, in the crop_spect_vectors_keep_classes method: https://github.com/vocalpy/vak/blob/0.8/src/vak/datasets/window_dataset.py#L246

I just rewrote some of this logic for the BioSoundSegBench dataset, here: https://github.com/vocalpy/BioSoundSegBench/commit/f8a6b28cee1612e0a0f4e3ba5e82e9ebea67bd68
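The core idea of that cropping logic, roughly, is to crop the labeled-timebins vector to a target length only if the crop still contains every class. A simplified sketch, not the actual WindowDataset code:

```python
import numpy as np


def crop_vectors_keep_classes(lbl_tb, target_n_bins):
    """Crop a labeled-timebins vector to ``target_n_bins``, but only if the
    crop still contains every class present in the full vector.

    A simplified sketch of the idea behind
    WindowDataset.crop_spect_vectors_keep_classes, not the real implementation.
    """
    classes = np.unique(lbl_tb)
    # try cropping off the end first, then off the start
    for cropped in (lbl_tb[:target_n_bins], lbl_tb[-target_n_bins:]):
        if np.array_equal(np.unique(cropped), classes):
            return cropped
    raise ValueError(
        f"could not crop to {target_n_bins} bins without losing a class"
    )
```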
In rewriting that logic I realized that duration measured in seconds of audio can differ from duration measured in number of spectrogram time bins, and that this difference varies depending on the method used to compute the spectrogram.
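For example, with a plain STFT the number of time bins depends on the window size, hop length, and whether frames are centered and padded, so two spectrogram functions given the same one-second clip can return different numbers of bins. The parameter values below are just illustrative:

```python
sr = 32000          # sample rate, Hz
dur_s = 1.0
n_samples = int(dur_s * sr)

n_fft, hop = 512, 64

# "valid"-style framing: only full windows, no padding
n_bins_valid = 1 + (n_samples - n_fft) // hop

# centered framing (pad so every frame is centered on a sample):
# gives 1 + n_samples // hop bins instead
n_bins_centered = 1 + n_samples // hop

print(n_bins_valid, n_bins_centered)  # 493 vs 501 for these settings
```

So the same one-second clip maps onto a different number of time bins depending on the framing convention, which is the mismatch described above.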
I ended up using some hacks so that we get indexing vectors of (mostly) consistent lengths. But it's annoyingly fragile.