Closed. YBonline closed this issue 1 year ago.
I am unable to reproduce this. I see a couple of things that can be cleaned up and could potentially be related to this, and have created #7291
Alright, I applied the patch. Restarting just 1 camera didn't trigger it before the patch or after, so I'll have to restart some switches and cause network chaos tonight, when most people are sleeping, to see if I can reproduce it with the patch. I've now had it happen 5 times, all after a network or camera was offline for several minutes. Will report back, probably in the morning.
I was able to restart one switch with just cameras on it, and memory is leaking with the patch.... Since the logs were much more reasonable now, I've attached a full log, but nothing is striking me as useful.
While the network switch is out, CPU usage stays at 115-125% and memory usage just continues to rise. Without network switch outages, I usually see extremely low CPU usage.
Weird, I noticed that the audio logs are never printed. I wonder if this is part of the issue, perhaps the audio logs are not being released for whatever reason and that is what is using memory.
I don't believe I've ever seen a log from the audio detector, outside of what you see in that log (process starting, cleanup on shutdown, etc). It does work perfectly if there are no network outages
There should definitely also be logs for it "terminating existing ffmpeg". Something tells me it is not correctly stopping the ffmpeg process.
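For anyone following along, here is a minimal sketch of what a "terminate the existing process" step usually looks like in Python. This is not Frigate's code: `stop_process` and `DEMO_CMD` are hypothetical names, and a sleeping Python child stands in for ffmpeg. It asks the child to exit, waits a bounded time, and only then kills it, so the caller can always log the outcome.

```python
import subprocess
import sys


def stop_process(proc: subprocess.Popen, timeout: float = 5.0) -> int:
    """Stop a child process and return its exit code (hypothetical helper)."""
    # poll() returns the exit code if the child has already exited.
    if proc.poll() is not None:
        return proc.returncode
    # Ask politely first (SIGTERM on POSIX).
    proc.terminate()
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The child ignored the request, so force it and reap it.
        proc.kill()
        proc.wait()
    return proc.returncode


# Stand-in for a real ffmpeg command line (assumption, for demonstration only).
DEMO_CMD = [sys.executable, "-c", "import time; time.sleep(60)"]
```

Because the wait is bounded, a hung child cannot stall the caller, and the returned exit code gives the log line something concrete to report.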
That seems likely. /api/stats doesn't seem to list the crashed cameras' audio ffmpeg processes ever again once they go offline, even after the cameras come back. I've attached api/stats during the memleak
although maybe not, the crashed cameras don't seem to be here:
root# ps ax | grep ffmpeg | grep audio
1598 ? Ssl 0:21 ffmpeg -hide_banner -loglevel warning -threads 2 -rtsp_transport tcp -timeout 5000000 -vn -i rtsp://127.0.0.1:8554/warehouse_exterior_doorbell -f s16le -ar 16000 -ac 1 -y /tmp/cache/warehouse_exterior_doorbell-audio
1602 ? Ssl 0:14 ffmpeg -hide_banner -loglevel warning -threads 2 -rtsp_transport tcp -timeout 5000000 -vn -i rtsp://127.0.0.1:8554/warehouse_exterior_front_sub -f s16le -ar 16000 -ac 1 -y /tmp/cache/warehouse_exterior_front-audio
1605 ? Ssl 0:09 ffmpeg -hide_banner -loglevel warning -threads 2 -rtsp_transport tcp -timeout 5000000 -vn -i rtsp://127.0.0.1:8554/warehouse_workbench_sub -f s16le -ar 16000 -ac 1 -y /tmp/cache/warehouse_workbench-audio
1612 ? Ssl 0:05 ffmpeg -hide_banner -loglevel warning -threads 2 -rtsp_transport tcp -timeout 5000000 -vn -i rtsp://127.0.0.1:8554/warehouse_interior_front_sub -f s16le -ar 16000 -ac 1 -y /tmp/cache/warehouse_interior_front-audio
1659 ? Ssl 0:09 ffmpeg -hide_banner -loglevel warning -threads 2 -rtsp_transport tcp -timeout 5000000 -vn -i rtsp://127.0.0.1:8554/gp636_doorbell -f s16le -ar 16000 -ac 1 -y /tmp/cache/gp636_doorbell-audio
1680 ? Ssl 0:43 ffmpeg -hide_banner -loglevel warning -threads 2 -rtsp_transport tcp -timeout 5000000 -vn -i rtsp://127.0.0.1:8554/gp611_rear -f s16le -ar 16000 -ac 1 -y /tmp/cache/gp611_rear-audio
I see the problem, it is basically sitting at the pipe.open because data is never written to it
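To illustrate that failure mode: on POSIX systems, opening a named pipe (FIFO) for reading blocks until some process opens it for writing, so a reader can sit in `open` forever if the writer never shows up. The sketch below assumes the audio path is a FIFO and shows one common way around the block: open with `O_NONBLOCK` and wait for data with a timeout. It is not the change made in the patch, and `read_fifo_with_timeout` is a hypothetical helper.

```python
import os
import select
from typing import Optional


def read_fifo_with_timeout(path: str, size: int, timeout: float) -> Optional[bytes]:
    """Return up to `size` bytes from the FIFO, or None if nothing arrives in time."""
    # O_NONBLOCK makes open() return at once even when no writer is attached;
    # a plain blocking open() would wait here until a writer appears.
    fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
    try:
        # Wait at most `timeout` seconds for the FIFO to become readable.
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            return None
        data = os.read(fd, size)
        # An empty read means end-of-file (no writer); report it as "no data".
        return data or None
    finally:
        os.close(fd)
```

A caller that gets `None` back can log the stall and decide whether to restart the writer, rather than hanging silently.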
@YBonline Okay, I just pushed an update; the issue should now be printed to the logs instead of getting stuck in an infinite loop
So I just applied your patch and rebooted the same switch as last time: no infinite loop, no memory leak. However, for some reason the audio ffmpeg has still not restarted 5 minutes after the video ffmpeg was restored. In fact, I think it might never have started at all? frigate_audio_memleak.txt
It has to have started for the logs to be printed. It's also working for me with this code under normal circumstances. The code specifically checks if the process is currently running.
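As a rough sketch of what "checks if the process is currently running" typically means in Python: `subprocess.Popen.poll()` returns `None` while the child is alive and its exit code once it has exited. The helper names and the stand-in command below are hypothetical and are not taken from Frigate.

```python
import subprocess
import sys
from typing import List, Optional


def is_running(proc: Optional[subprocess.Popen]) -> bool:
    """True only if a child exists and has not exited yet."""
    # poll() returns None while the child is still alive.
    return proc is not None and proc.poll() is None


def ensure_running(proc: Optional[subprocess.Popen], cmd: List[str]) -> subprocess.Popen:
    """Return the live child, or start a fresh one if it is missing or has exited."""
    if is_running(proc):
        return proc
    return subprocess.Popen(cmd)


# Stand-in for a real ffmpeg command line (assumption, for demonstration only).
DEMO_CMD = [sys.executable, "-c", "import time; time.sleep(60)"]
```

Note that a check like this only sees whether the process exists. A child that starts and then exits right away, for example because the input stream has no audio track, would look like a process that keeps being restarted.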
Aren't the logs I attached indicating it sees the ffmpeg process is not running, and keeps trying to restart it? The AudioManager process is starting, just the audio ffmpeg seemingly isn't for some reason....
It might be a config issue on these cameras, all the ones that are restarting seem to have never gotten an audio event, maybe a codec issue although recordings have audio? I'm investigating now
Yeah it is working fine for me, as soon as the cameras are back online audio events work
Ok, I think I know my mistake here... wanted to detail it just in case others run into it or you want to add something to the documentation about this.
When I ran into the DB locked issue, I noticed a ton of ffmpeg/AAC codec errors in my logs, and I had just enabled audio events when those errors started. I suspected the audio events were causing it, so I disabled them in the Frigate config, but the codec errors did not disappear. I then ended up disabling audio entirely on the cameras while we were working through the DB locked issue, which did get rid of the AAC codec errors in my logs.
Once you resolved that, I re-enabled the audio, but my AAC codec errors returned. I worked through it and found that ffmpeg seems to want the sampling rate at 16kHz to not produce the error, and I originally had it at 48kHz. That successfully eliminated the AAC codec errors. For reference, these were with my Hikvision panoramic cameras
However, when I re-enabled audio on the cameras, I apparently only did it on the main stream, but I run detection on the sub stream, where audio was still inadvertently disabled.
I've fixed my configuration, rebooted the same switch again, and perfect performance! This bug appears to be fully fixed. Thank you again!
Interesting, the ffmpeg process should be setting the audio rate, but maybe it doesn't like a higher rate. My cameras are set to PCMU 8000 and it works without issues with audio events
Anyway, thanks for reporting and glad this could be solved.
I'm just curious, is that a better codec/config than the AAC/16kHz rate?
It's also possible that the Hikvision is buggy with AAC/48kHz, I've found a few other settings that ffmpeg doesn't like on the video side.
On the cameras where I did not mess up the configuration, the audio events have been working perfectly with the AAC codec for several days now, as long as I didn't restart network equipment! I'm really hoping the detections for glass/crack/shatter/smash/breaking work well, that would make getting timely alerts a LOT easier. I've already found out "smoke_detector" picks up vehicles with a backup alarm
I just use it because it's compatible with MSE and WebRTC.
And yeah I'm just using speech and bark on a couple cameras right now mainly to get notifications when my dog is barking or people are talking outside my door.
Ah, ok, you know, I was wondering if it was possible within Frigate to get audio on the live streams, I guess that is why I haven't been able to?
Yup, the audio detections have been super accurate for animals. In fact, the animal audio detectors have already detected someone leaving a door open and my chickens getting inside my warehouse, along with my goats escaping their section to go join the chickens in their area lol
I'm gonna have to find some old pieces of glass and break them in front of the cameras to see if that works...
Describe the problem you are having
I'm experiencing a memory leak in Audio Manager when I have the new audio events enabled. As best I can tell, the leak starts when something causes a disconnect from a camera that has audio monitoring enabled, and it continues even if the camera reconnects and keeps working.
My most severe memory leak was last night, when I applied a firmware update to my network switches, gateways, etc along with some of my cameras, causing all of my cameras to go offline multiple times for short periods. Frigate eventually crashed, seemingly with nothing in the logs, after the frigate.audio_manager process had consumed all system memory.
Today I'm trying to configure another camera, and while the camera is misbehaving and disconnecting, AudioManager's memory consumption keeps growing and growing. It's currently consuming 38.8055GB of memory. CPU time from audio_manager also seems to stay around 110% once the leak starts happening, which is way higher than normal.
I have not seen any memory leak without an associated camera network disconnect, and from my testing, it does not appear to be associated with any particular camera model (although the camera must have audio enabled obviously).
AudioDetector
P-ID | CPU % | Avg CPU % | Memory %
1580 | 108.0% | 109% | 36.1%
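For anyone who wants to track growth like this over time instead of taking a single snapshot, a minimal sketch follows. It assumes a Linux host, where `/proc/<pid>/status` reports resident memory on a `VmRSS` line. The function names are hypothetical, and the PID to pass in would be whatever the stats table shows (1580 above).

```python
import re
from pathlib import Path
from typing import Optional

# Matches a line such as "VmRSS:\t   12345 kB" in /proc/<pid>/status.
_VMRSS = re.compile(r"^VmRSS:\s+(\d+)\s+kB$", re.MULTILINE)


def parse_vmrss_kib(status_text: str) -> Optional[int]:
    """Extract resident set size (in kB as reported by the kernel) from status text."""
    match = _VMRSS.search(status_text)
    return int(match.group(1)) if match else None


def read_rss_kib(pid: int) -> Optional[int]:
    """Read resident memory for `pid`, or None if the process or /proc is unavailable."""
    try:
        return parse_vmrss_kib(Path(f"/proc/{pid}/status").read_text())
    except OSError:
        return None
```

Calling `read_rss_kib` every few seconds while a switch is rebooting, and logging the values, would show whether memory climbs steadily during the outage or jumps at reconnect.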
Version
0.13.0-0FD1EAF
Frigate config file
Relevant log output
FFprobe output from your camera
Frigate stats
Operating system
Debian
Install method
Docker Compose
Coral version
PCIe
Network connection
Wired
Camera make and model
NA
Any other information that may be helpful
No response