dsgt-birdclef / birdclef-2022

Code for the BirdCLEF 2022 competition by the DS@GT team

Resolve IndexErrors with get_motif_pairs in DataLoader #28

Closed bran22 closed 2 years ago

bran22 commented 2 years ago

The get_motif_pairs method in datasets.py occasionally raises an IndexError. This happens when the "neighbor" in a motif pair falls in the last segment of its audio file, so it cannot be found. The slice_seconds routine currently truncates any audio file that does not divide evenly into 5-second chunks, which means some files lose their last chunk.

Ideally, we would pad the last segment so that it is retained and has the same length as the other segments. Perhaps we can have an option to either pad only at the end, or "center" the last audio snippet with equal padding at both the beginning and the end.
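
A rough sketch of what the two padding options could look like, assuming the audio is a 1-D NumPy array and the padding is zeros (silence). The function name `slice_with_padding` and its parameters are hypothetical and are not the existing slice_seconds signature:

```python
import numpy as np


def slice_with_padding(audio, sample_rate, seconds=5, pad_mode="end"):
    """Split 1-D audio into fixed-length chunks, zero-padding the last one.

    Hypothetical helper for illustration; not the repo's slice_seconds.
    pad_mode="end" puts all padding after the last snippet.
    pad_mode="center" splits the padding before and after the last snippet
    (when the padding is odd, the extra sample goes at the end).
    """
    chunk = int(sample_rate * seconds)
    remainder = len(audio) % chunk
    if remainder:
        missing = chunk - remainder
        if pad_mode == "end":
            left, right = 0, missing
        elif pad_mode == "center":
            left = missing // 2
            right = missing - left
        else:
            raise ValueError(f"unknown pad_mode: {pad_mode!r}")
        split = len(audio) - remainder
        # Pad only the trailing partial chunk; the full chunks are untouched.
        tail = np.pad(audio[split:], (left, right))
        audio = np.concatenate([audio[:split], tail])
    return audio.reshape(len(audio) // chunk, chunk)


# 12 samples at 1 Hz: two full 5-second chunks plus 2 leftover samples.
audio = np.arange(1, 13, dtype=float)
end_padded = slice_with_padding(audio, sample_rate=1, pad_mode="end")
center_padded = slice_with_padding(audio, sample_rate=1, pad_mode="center")
```

With either option every file keeps its last segment, so a neighbor lookup that lands on the final chunk would have something to index into. Whether zero padding is appropriate for the motif features is a separate question.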