tphakala / birdnet-go

Realtime BirdNET soundscape analyzer

CPU usage when running realtime #202

Open xconverge opened 2 weeks ago

xconverge commented 2 weeks ago

Is there anything I can tweak that would impact CPU when running realtime?

birdnet-go realtime (with 1 RTSP source) is using more CPU than 1 camera detecting motion with https://github.com/blakeblackshear/frigate, and for just audio that seems a bit high (this is just a gut check thing, not a real metric/concern).

I looked at all of the tickers and there don't seem to be any unnecessary busy loops.

This isn't a big deal and I don't have a great reason to dabble here other than "CPU usage seems highish".

Also any pointers into where the processing mostly is happening would be interesting as I go to maybe poke around a bit more.

Audio recording -> chunking -> analyzing -> processing the results

My guess is that running the tflite model is where most of the CPU goes, with audio recording/IO close behind it?

tphakala commented 2 weeks ago

BirdNET analyzer works by analysing audio in 3 second chunks. In BirdNET-Go, audio captured from any source (RTSP or sound card) is fed to a ring buffer, which is polled by a goroutine (myaudio/buffer.go BufferMonitor()). Data is fed to the BirdNET analyzer in 3 second chunks, but the buffer reader (readFromBuffer()) uses a sliding window to read data from the ring buffer every 1.5 seconds. This is done to overlap the chunks, so that bird calls which happen right at the edges of chunks are not missed. The downside is that this doubles the processing done by the BirdNET tflite model, and yes, the tflite model is where most of the CPU usage is spent.
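To illustrate the overlap described above, here is a minimal sketch of which sample ranges a sliding-window reader would hand to the model. It is not the code in myaudio/buffer.go: the `window` type, the `slidingWindows` function and the 48 kHz sample rate are assumptions made for the example.

```go
package main

import "fmt"

// window is a half-open [Start, End) range of sample indices.
// Hypothetical type, used only for this illustration.
type window struct{ Start, End int }

// slidingWindows lists the chunks a reader would produce when it takes
// chunkLen samples at a time and advances by chunkLen-overlap samples.
func slidingWindows(total, chunkLen, overlap int) []window {
	step := chunkLen - overlap
	if chunkLen <= 0 || step <= 0 {
		return nil
	}
	var out []window
	for start := 0; start+chunkLen <= total; start += step {
		out = append(out, window{start, start + chunkLen})
	}
	return out
}

func main() {
	const rate = 48000 // samples per second, assumed for the example
	total := 6 * rate  // six seconds of captured audio
	chunk := 3 * rate  // 3 second chunks
	overlap := rate * 3 / 2

	for _, w := range slidingWindows(total, chunk, overlap) {
		fmt.Printf("%.1fs to %.1fs\n", float64(w.Start)/rate, float64(w.End)/rate)
	}
}
```

With a 1.5 second overlap, six seconds of audio yield three chunks instead of two, so a call that straddles the 3 second mark still lands fully inside the middle chunk.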

Adjusting overlapSize in myaudio/buffer.go changes the step of the sliding window; setting it to 0 effectively disables the sliding window and halves the CPU usage of the model. This parameter has been fixed so far, since every 64-bit Raspberry Pi platform (RPi 3 and up) has been capable of doing realtime analysis with this value. However, now that multiple RTSP sources are supported, it would be beneficial to make this realtime analysis overlap setting user configurable.

There is an accelerator module for TensorFlow Lite, XNNPACK, which should accelerate CPU based workloads. However, when I tried to use it with the BirdNET model, tflite always crashed. I haven't tested with tflite version 2.15 though.

xconverge commented 2 weeks ago

What would be the downsides of going from 3->30 seconds per chunk? I know you would lose the "realtimish-ness" but it would still be plenty realtime for my needs...

I do agree that making these overridable/settable via config would be excellent, and hopefully not too overwhelming if most people don't need to tweak them.

tphakala commented 2 weeks ago

It is not supported by the model we are using: https://github.com/kahst/BirdNET-Analyzer/issues/288#issuecomment-2062127265

tphakala commented 2 weeks ago

The BirdNET overlap setting now applies to realtime detection as well; setting the overlap value lower than 1.5 will reduce the CPU usage consumed by the tflite model. This was added in v05.5. https://github.com/tphakala/birdnet-go/discussions/205

xconverge commented 1 week ago

You can close this if you want, you answered my questions and gave me a knob to adjust if I feel like it!

I was previously running birdcage, and the dynamic segment lengths for that model were probably the main difference (30 second clips I think it was analyzing).

tphakala commented 1 week ago

I reviewed the Birdcage source and found that it operates similarly to BirdNET-Pi. Both systems record audio to disk in user-configured lengths (30-second clips in your case) and then analyze these files. The analysis is done by the BirdNET analyzer, which breaks these clips down into 3-second segments for the BirdNET model. Thus, the segment length for analysis is the same for both Birdcage and BirdNET-Go.

However, BirdNET-Go differs in how it handles the captured audio data. Instead of writing the audio to disk and waiting for file analysis, BirdNET-Go stores the audio in a RAM buffer, which is then fed to the BirdNET model in 3-second chunks. This approach significantly reduces unnecessary write activity on SD cards, preserving their longevity.
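As a rough picture of the RAM buffer approach, here is a minimal in-memory ring buffer with an overlapping chunk read. It is a sketch under assumptions, not BirdNET-Go's actual buffer: the `ring` type and its `write` and `readChunk` methods are hypothetical names, and a real implementation would also need locking between the capture and analysis goroutines.

```go
package main

import "fmt"

// ring is a fixed-size in-memory byte buffer (hypothetical type).
type ring struct {
	buf   []byte
	start int // index of the oldest unread byte
	size  int // number of unread bytes
}

func newRing(capacity int) *ring { return &ring{buf: make([]byte, capacity)} }

// write appends p, overwriting the oldest data once the buffer is full.
func (r *ring) write(p []byte) {
	if len(r.buf) == 0 {
		return
	}
	for _, b := range p {
		end := (r.start + r.size) % len(r.buf)
		r.buf[end] = b
		if r.size < len(r.buf) {
			r.size++
		} else {
			r.start = (r.start + 1) % len(r.buf)
		}
	}
}

// readChunk copies out n bytes but consumes only step bytes, so the last
// n-step bytes are seen again by the next call. It returns nil when fewer
// than n bytes are buffered.
func (r *ring) readChunk(n, step int) []byte {
	if n <= 0 || step <= 0 || step > n || r.size < n {
		return nil
	}
	out := make([]byte, n)
	for i := range out {
		out[i] = r.buf[(r.start+i)%len(r.buf)]
	}
	r.start = (r.start + step) % len(r.buf)
	r.size -= step
	return out
}

func main() {
	r := newRing(8)
	r.write([]byte{1, 2, 3, 4, 5, 6})
	fmt.Println(r.readChunk(4, 2)) // first chunk
	fmt.Println(r.readChunk(4, 2)) // overlaps the first chunk by two bytes
	fmt.Println(r.readChunk(4, 2)) // not enough data buffered yet
}
```

Nothing here touches the disk: audio only lives in the fixed-size slice until it has been read, which is the property that spares the SD card.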