preprocess.py doesn't chunk mp3 as expected

OSU-Bee-Lab / buzzdetect

a machine learning tool to detect and classify bee buzzes in audio

GNU General Public License v3.0


preprocess.py doesn't chunk mp3 as expected #5

Closed LukeHearon closed 12 months ago

LukeHearon commented 1 year ago

The length that AudioSegment reports for audio files of the same duration differs by file type. For my wavs, the length is roughly equal to the duration of the file in milliseconds (a 1h file comes out as 3,600,005 where you would expect 3,600,000). For mp3s, I don't yet know what the length corresponds to.

chunk_audio chunks by the length of the AudioSegment object, so wavs chunk as expected, but mp3s do not. mp3 lengths are much smaller than expected (675,029 for 1h audio), making the chunkLength_hr argument produce much larger chunks than intended.

LukeHearon commented 1 year ago

I believe this has something to do with file size.

My wav file: 115,200,152 bytes; bytes/object length → 115,200,152/3,600,005 ≈ 32

My raw file: 1,073,567,316 bytes; bytes/object length → 1,073,567,316/33,548,979 ≈ 32

So, in a sense, is it just a happy accident of our sample rate and bit depth that one unit of length corresponds to 1 ms of audio?
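
For what it's worth, a ratio of about 32 bytes per unit of length is what uncompressed PCM gives under one particular set of parameters. A minimal sketch of the arithmetic, assuming 16 kHz, 16-bit, mono audio (these values are not stated anywhere in this thread; they are simply one combination consistent with roughly 115,200,000 bytes for a 1h wav):

```python
def pcm_bytes_per_ms(sample_rate_hz: int, bytes_per_sample: int, channels: int) -> float:
    """Bytes of uncompressed PCM audio per millisecond of playback."""
    return sample_rate_hz * bytes_per_sample * channels / 1000


# Assumed parameters (NOT confirmed in the thread): 16 kHz, 16-bit (2 bytes), mono.
assumed_rate = pcm_bytes_per_ms(16_000, 2, 1)

# Ratios computed from the figures reported in the comment above.
wav_ratio = 115_200_152 / 3_600_005
raw_ratio = 1_073_567_316 / 33_548_979

print(assumed_rate, round(wav_ratio, 3), round(raw_ratio, 3))
```

If the length really is derived from the byte count, then it lines up with milliseconds only because PCM has a fixed data rate set by sample rate and bit depth. For mp3, bytes per millisecond depends on the encoder bitrate instead, so the same factor would not apply.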

This feels like an extremely elementary question for someone with a legitimate CS education...

LukeHearon commented 1 year ago

Possible solution: get the total audio duration of the raw mp3, work out how much object length corresponds to the desired chunk duration, and use that as the chunk size? File sizes for 1h of mp3 audio will vary with compression, but the chunks should be close enough.
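
This idea can be sketched as a simple proportion. It is only an illustration: `length_per_chunk` is a hypothetical helper, not a function in preprocess.py, and it assumes the true duration in hours has already been obtained some other way (for example from the file's metadata).

```python
def length_per_chunk(total_length: int, total_duration_hr: float, chunk_length_hr: float) -> int:
    """Scale the object's reported length down to the desired chunk duration."""
    if total_duration_hr <= 0 or chunk_length_hr <= 0:
        raise ValueError("durations must be positive")
    return round(total_length * chunk_length_hr / total_duration_hr)


# Using the mp3 figure reported earlier: a length of 675,029 for 1h of audio.
print(length_per_chunk(675_029, 1.0, 0.5))
```

The result is only as accurate as the assumption that length grows evenly with time, which may not hold for variable-bitrate files.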

Possible solution: perhaps it's even better to chunk by file size instead of duration. Just chunk everything to 1GB?

LukeHearon commented 1 year ago

Gah, the issue is that I need the chunks to have their timestamps in the filename (or recoverable somehow), so chunking by bytes isn't acceptable unless I can translate bytes→time.

LukeHearon commented 12 months ago

Resolved by chunking via ffmpeg
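
The thread doesn't show how the ffmpeg chunking was actually done, so the following is only a sketch of one possible approach, not the repository's code. It builds one ffmpeg command per chunk using the standard `-ss` (seek), `-t` (duration), and `-c copy` (no re-encoding) options, and puts each chunk's start time in seconds into the output filename so that timestamps stay recoverable. The function name, filename pattern, and paths are all hypothetical.

```python
import math
from pathlib import Path


def build_chunk_commands(path_in: str, total_duration_s: float, chunk_length_hr: float, dir_out: str):
    """Return one ffmpeg argument list per chunk; nothing is executed here."""
    chunk_s = chunk_length_hr * 3600
    if chunk_s <= 0:
        raise ValueError("chunk_length_hr must be positive")
    n_chunks = math.ceil(total_duration_s / chunk_s)
    src = Path(path_in)
    commands = []
    for i in range(n_chunks):
        start_s = i * chunk_s
        # The final chunk may be shorter than the others.
        duration_s = min(chunk_s, total_duration_s - start_s)
        # Hypothetical naming scheme: <stem>_s<start in seconds><suffix>
        path_out = Path(dir_out) / f"{src.stem}_s{int(start_s)}{src.suffix}"
        commands.append([
            "ffmpeg",
            "-ss", str(start_s),    # seek to the chunk start
            "-i", str(src),
            "-t", str(duration_s),  # limit the output duration
            "-c", "copy",           # copy the stream without re-encoding
            str(path_out),
        ])
    return commands


# Example: a 2.5h recording split into 1h chunks.
cmds = build_chunk_commands("recording.mp3", 2.5 * 3600, 1.0, "chunks")
for cmd in cmds:
    print(" ".join(cmd))
```

Each argument list could then be passed to `subprocess.run(cmd, check=True)`. With `-c copy`, cuts fall on frame boundaries, so chunk edges may be off by a small amount.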