Audio object is shorter than requested duration error could point to the particular sample causing the error

kitzeslab / opensoundscape

Open source, scalable software for the analysis of bioacoustic recordings

MIT License

Hello, thanks for the great work you're doing.

I'm encountering an issue when training models using opensoundscape: occasionally there will be a file in one of my datasets which causes the following error:

UserWarning: Audio object is shorter than requested duration: 2.8619727891156463 sec instead of 3.0 sec.

When these warnings appear, the model doesn't learn at all during training, and performance stalls at chance level. When I remove the files that produce the short audio objects, the model improves during training.

My current workaround has been to edit the source code in the from_file() function of audio.py like so:

error_msg = (
    f"Audio object is shorter than requested duration: "
    f"{len(samples)/sr} sec instead of {duration} sec. "
    f"Path: {path}, start_time: {offset} sec, end_time {offset + duration} sec."
)

This lets me find the audio clip which is causing the error and remove it from my dataset.
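
As an alternative that does not require patching the library, the clips can be checked before training. The following is a minimal sketch, assuming uncompressed WAV files and a list of (path, start_time, duration) tuples; `find_short_clips` and its arguments are hypothetical names, not part of opensoundscape, and it relies only on Python's standard `wave` module.

```python
import wave


def find_short_clips(clips, tolerance=1e-9):
    """Return the clips whose file ends before start_time + duration.

    clips: iterable of (path, start_time, duration), times in seconds.
    Hypothetical helper; assumes every path is a readable WAV file.
    """
    short = []
    for path, start_time, duration in clips:
        with wave.open(path, "rb") as wf:
            file_length = wf.getnframes() / wf.getframerate()
        # Seconds of audio actually available from start_time onward
        available = max(0.0, file_length - start_time)
        if available + tolerance < duration:
            short.append((path, start_time, duration, available))
    return short
```

Each returned tuple gives the path, the requested start time and duration, and the number of seconds that are actually available, so the offending rows can be dropped or fixed before training.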

If this is an acceptable workaround, I can make a pull request with this change, but I'd like to check whether there's an existing way to deal with this?
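
Another option that leaves the source untouched is to record warnings per file with Python's standard `warnings` module and note which paths trigger the message. This is only a sketch: `paths_that_warn` is a hypothetical helper, and `loader` stands in for whatever function loads a clip (it is not an opensoundscape call).

```python
import warnings


def paths_that_warn(paths, duration, loader,
                    needle="shorter than requested duration"):
    """Call loader(path, duration) for each path and collect the paths
    whose load emits a warning containing `needle`.

    `loader` is a placeholder for the real loading function.
    """
    flagged = []
    for path in paths:
        with warnings.catch_warnings(record=True) as caught:
            # Make sure repeated warnings are not suppressed
            warnings.simplefilter("always")
            loader(path, duration)
        if any(needle in str(w.message) for w in caught):
            flagged.append(path)
    return flagged
```

Because each file is loaded inside its own `catch_warnings` block, every recorded warning can be attributed to exactly one path.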

Thanks,

Michael.

m.preprocessor.insert_action(
    action_index="extend",
    action=Action(
        Audio.extend_to, is_augmentation=False, duration=self.sample_duration
    ),
    after_key='load_audio'
)
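
To illustrate what an "extend" step like the one above is expected to do, here is a minimal sketch of padding a short clip to a fixed duration. It assumes the extension appends silence (zeros) to the end and leaves longer clips unchanged; `pad_to_duration` is a hypothetical function written for this example and is not the library's implementation.

```python
def pad_to_duration(samples, sample_rate, duration):
    """Pad a sequence of samples with trailing zeros so that it lasts
    `duration` seconds. Clips that are already long enough are returned
    unchanged (as a new list).
    """
    target_length = round(duration * sample_rate)
    missing = target_length - len(samples)
    if missing <= 0:
        return list(samples)
    # Append silence to fill the remaining time
    return list(samples) + [0.0] * missing
```

With padding in place, a clip that is slightly shorter than the requested duration reaches the model at the expected length instead of triggering the warning.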

Audio object is shorter than requested duration error could point to the particular sample causing the error #979