Very large audio files should be loaded in sections rather than loading entirely into memory

joeweiss / birdnetlib

A python api for BirdNET-Lite and BirdNET-Analyzer

https://joeweiss.github.io/birdnetlib/

Apache License 2.0


Very large audio files should be loaded in sections rather than loading entirely into memory #97

Closed. joeweiss closed this issue 11 months ago.

joeweiss commented 11 months ago

Currently (as of 0.12.3), splitting large audio files (longer than 1 hour) into more manageable segments is left to the user. If a user attempts to process a large audio file, the entire file is loaded into memory before analysis begins. This can trigger the OOM killer or crash the process.

The library needs to have a method for processing very large audio files.
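
To give a rough sense of scale, the sketch below estimates how much memory fully decoded audio occupies. The defaults (48 kHz, mono, 4-byte float samples) are assumptions chosen for illustration, not values stated in this issue, and `decoded_size_bytes` is a hypothetical helper, not part of birdnetlib.

```python
def decoded_size_bytes(
    duration_s: float,
    sample_rate: int = 48_000,   # assumed sample rate
    channels: int = 1,           # assumed mono
    bytes_per_sample: int = 4,   # assumed 32-bit float samples
) -> int:
    """Approximate in-memory size of fully decoded PCM audio."""
    return int(duration_s * sample_rate * channels * bytes_per_sample)


one_hour = decoded_size_bytes(3600)
print(f"{one_hour / 1e6:.0f} MB")  # 691 MB
```

Under these assumptions the size grows linearly with duration, so a multi-hour recording quickly reaches several gigabytes before any analysis buffers are allocated.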

joeweiss commented 11 months ago

I've added an initial implementation that subclasses the default objects as LargeRecording and LargeRecordingAnalyzer. If these prove to be as performant as the default objects, I'll consider making them the default in 1.0.
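
The general idea of loading a file in sections can be sketched with the standard library alone. This is not the LargeRecording implementation; it is a minimal illustration that assumes an uncompressed WAV input, and `iter_wav_sections` and its `section_s` parameter are hypothetical names introduced here.

```python
import wave
from typing import Iterator, Tuple


def iter_wav_sections(path: str, section_s: float = 600.0) -> Iterator[Tuple[float, bytes]]:
    """Yield (start_time_in_seconds, raw_pcm_bytes) for consecutive sections.

    Only one section is held in memory at a time. Hypothetical helper,
    not part of birdnetlib.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        frames_per_section = int(section_s * rate)
        if frames_per_section <= 0:
            raise ValueError("section_s is too small for this sample rate")

        start_frame = 0
        while True:
            # readframes returns at most the requested number of frames,
            # and an empty bytes object once the file is exhausted.
            data = wav.readframes(frames_per_section)
            if not data:
                break
            yield start_frame / rate, data
            start_frame += frames_per_section
```

A caller would loop over the generator and analyze each section in turn, adding the yielded start time to any detection timestamps so that they remain relative to the whole file. The final section may be shorter than `section_s`.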