Closed: miczlo closed this issue 1 year ago
I would recommend having frigate connect to the restream. The way you have it setup now, frigate will connect to the same stream multiple times: https://deploy-preview-4055--frigate-docs.netlify.app/configuration/restream#reduce-connections-to-camera
Also, you said that office_tapo wasn't set to record, but your config indicates otherwise. Turning record off in the UI does not stop the record process from running in frigate; it just prevents the segments from being stored.
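For reference, a minimal sketch of that restream pattern in a 0.12-style config (the camera name, address and credentials below are placeholders, and the exact syntax in the linked preview docs may differ slightly):

go2rtc:
  streams:
    back_gate:                                   # one upstream connection per camera
      - rtsp://user:password@192.168.1.10:554/h264Preview_01_main   # placeholder camera URL
cameras:
  back_gate:
    ffmpeg:
      inputs:
        - path: rtsp://127.0.0.1:8554/back_gate  # pull from the local restream, not the camera
          input_args: preset-rtsp-restream
          roles:
            - record
            - detect

With this layout go2rtc holds the single connection to the camera, and the record and detect roles both read from the local restream, so the camera only ever serves one client.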
I tried the restream, but while the stream works fine for me in VLC, following the docs and pointing Frigate at the localhost address gave me plenty of errors, so I gave up on that. In terms of office_tapo, I guess you're correct, but I'm more worried about the remaining cameras, which are missing bits. I just grabbed a backup of my config from when I think only the 'highres' cameras had blanks, and I'll see how it behaves with 0.11. I have also disabled WiFi completely to make sure strange things are not happening in the background with it.
Segments will go missing when there are ffmpeg errors. Many of the errors you are seeing could be due to the extra connection and bandwidth that is currently being made. That's why I suggested using restream. Go2rtc also has better error handling for stream processing.
Ok, so to test I moved back (fresh db) to v0.11. Similar behaviour was observed, with some periods missing. I removed all the cameras that were showing any issues in the logs. The situation improved and I'd say all the data was getting recorded. This is problematic, as it indicates that one problematic camera can cause other cameras not to record correctly.
I upgraded to v0.12 and again removed the cameras that were mentioned in the logs, but I'm still seeing segments missing with the logs relatively healthy.
2023-01-13 16:14:37.277451027 [2023-01-13 16:14:37] frigate.app INFO : Starting Frigate (0.12.0-cf2466c)
2023-01-13 16:14:37.359917289 [2023-01-13 16:14:37] peewee_migrate INFO : Starting migrations
2023-01-13 16:14:37.364437633 [2023-01-13 16:14:37] peewee_migrate INFO : There is nothing to migrate
2023-01-13 16:14:37.372669770 [2023-01-13 16:14:37] ws4py INFO : Using epoll
2023-01-13 16:14:37.414967560 [2023-01-13 16:14:37] frigate.app INFO : Output process started: 163
2023-01-13 16:14:37.427043790 [2023-01-13 16:14:37] ws4py INFO : Using epoll
2023-01-13 16:14:37.443755606 [2023-01-13 16:14:37] frigate.app INFO : Camera processor started for back_right_bottom: 167
2023-01-13 16:14:37.444364217 [2023-01-13 16:14:37] frigate.comms.dispatcher INFO : Turning off recordings for living_room_tapo
2023-01-13 16:14:37.444366980 [2023-01-13 16:14:37] frigate.comms.dispatcher INFO : Turning off snapshots for living_room_tapo
2023-01-13 16:14:37.444370402 [2023-01-13 16:14:37] frigate.comms.dispatcher INFO : Turning off detection for living_room_tapo
2023-01-13 16:14:37.444372106 [2023-01-13 16:14:37] frigate.comms.dispatcher INFO : Turning off recordings for living_room_2_tapo
2023-01-13 16:14:37.448795973 [2023-01-13 16:14:37] frigate.app INFO : Camera processor started for back_gazebo: 173
2023-01-13 16:14:37.449987465 [2023-01-13 16:14:37] frigate.comms.dispatcher INFO : Turning off snapshots for living_room_2_tapo
2023-01-13 16:14:37.451982004 [2023-01-13 16:14:37] frigate.comms.dispatcher INFO : Turning off detection for living_room_2_tapo
2023-01-13 16:14:37.477593856 [2023-01-13 16:14:37] frigate.app INFO : Camera processor started for back_left: 175
2023-01-13 16:14:37.499605489 [2023-01-13 16:14:37] frigate.app INFO : Camera processor started for back_tree: 177
2023-01-13 16:14:37.531859957 [2023-01-13 16:14:37] frigate.app INFO : Camera processor started for back_gate: 178
2023-01-13 16:14:37.537621767 [2023-01-13 16:14:37] frigate.app INFO : Camera processor started for front_porch: 181
2023-01-13 16:14:37.568082025 [2023-01-13 16:14:37] frigate.app INFO : Camera processor started for front_right_HighRes: 182
2023-01-13 16:14:37.592968022 [2023-01-13 16:14:37] frigate.app INFO : Camera processor started for front_left_low: 185
2023-01-13 16:14:37.614102132 [2023-01-13 16:14:37] frigate.app INFO : Camera processor started for living_room_2_tapo: 187
2023-01-13 16:14:37.631893031 [2023-01-13 16:14:37] frigate.app INFO : Camera processor started for living_room_tapo: 190
2023-01-13 16:14:37.649236418 [2023-01-13 16:14:37] frigate.app INFO : Capture process started for back_right_bottom: 192
2023-01-13 16:14:37.660575432 [2023-01-13 16:14:37] frigate.app INFO : Capture process started for back_gazebo: 195
2023-01-13 16:14:37.674285908 [2023-01-13 16:14:37] frigate.app INFO : Capture process started for back_left: 200
2023-01-13 16:14:37.687885531 [2023-01-13 16:14:37] frigate.app INFO : Capture process started for back_tree: 209
2023-01-13 16:14:37.707875080 [2023-01-13 16:14:37] frigate.app INFO : Capture process started for back_gate: 220
2023-01-13 16:14:37.739880871 [2023-01-13 16:14:37] frigate.app INFO : Capture process started for front_porch: 225
2023-01-13 16:14:37.778521326 [2023-01-13 16:14:37] frigate.app INFO : Capture process started for front_right_HighRes: 234
2023-01-13 16:14:37.809428152 [2023-01-13 16:14:37] frigate.app INFO : Capture process started for front_left_low: 239
2023-01-13 16:14:37.840132218 [2023-01-13 16:14:37] frigate.app INFO : Capture process started for living_room_2_tapo: 247
2023-01-13 16:14:37.867389730 [2023-01-13 16:14:37] frigate.app INFO : Capture process started for living_room_tapo: 250
2023-01-13 16:14:40.089327642 [2023-01-13 16:14:37] detector.coral INFO : Starting detection process: 162
2023-01-13 16:14:40.089331216 [2023-01-13 16:14:37] frigate.detectors.plugins.edgetpu_tfl INFO : Attempting to load TPU as usb
2023-01-13 16:14:40.104978770 [2023-01-13 16:14:40] frigate.detectors.plugins.edgetpu_tfl INFO : TPU found
2023-01-13 16:23:16.758330856 [2023-01-13 16:23:16] ws4py INFO : Managing websocket [Local => 127.0.0.1:5002 | Remote => 127.0.0.1:55316]
2023-01-13 16:34:29.507937623 [2023-01-13 16:34:29] frigate.object_processing WARNING : Unable to create jpg because frame 1673627665.784194 is not in the cache
2023-01-13 16:34:30.820867098 [2023-01-13 16:34:30] frigate.object_processing WARNING : Unable to create jpg because frame 1673627665.784194 is not in the cache
2023-01-13 16:34:31.019906781 [2023-01-13 16:34:31] frigate.object_processing WARNING : Unable to create jpg because frame 1673627665.784194 is not in the cache
2023-01-13 16:34:31.117005940 [2023-01-13 16:34:31] frigate.object_processing WARNING : Unable to create jpg because frame 1673627665.784194 is not in the cache
2023-01-13 16:34:31.296003331 [2023-01-13 16:34:31] frigate.object_processing WARNING : Unable to create jpg because frame 1673627665.784194 is not in the cache
2023-01-13 16:34:34.972922394 [2023-01-13 16:34:34] frigate.object_processing WARNING : Unable to create jpg because frame 1673627665.784194 is not in the cache
2023-01-13 16:34:34.974877907 [2023-01-13 16:34:34] frigate.object_processing WARNING : Unable to create jpg because frame 1673627665.784194 is not in the cache
2023-01-13 16:34:37.609119151 [2023-01-13 16:34:37] frigate.object_processing WARNING : Unable to create jpg because frame 1673627665.784194 is not in the cache
2023-01-13 16:34:39.143018291 [2023-01-13 16:34:39] frigate.object_processing WARNING : Unable to create jpg because frame 1673627665.784194 is not in the cache
2023-01-13 16:34:39.840756551 [2023-01-13 16:34:39] frigate.object_processing WARNING : Unable to create jpg because frame 1673627665.784194 is not in the cache
2023-01-13 16:34:39.840760092 [2023-01-13 16:34:39] frigate.object_processing WARNING : Unable to save snapshot for 1673627639.044998-syevxt.
2023-01-13 16:34:39.840761841 [2023-01-13 16:34:39] frigate.object_processing WARNING : Unable to create clean png because frame 1673627665.784194 is not in the cache
2023-01-13 16:34:39.840763447 [2023-01-13 16:34:39] frigate.object_processing WARNING : Unable to save clean snapshot for 1673627639.044998-syevxt.
2023-01-13 16:34:39.842305517 [2023-01-13 16:34:39] frigate.object_processing WARNING : Unable to create jpg because frame 1673627665.784194 is not in the cache
2023-01-13 16:37:14.758232893 [2023-01-13 16:37:14] frigate.object_processing WARNING : Unable to create jpg because frame 1673627833.43635 is not in the cache
2023-01-13 16:37:22.614245799 [2023-01-13 16:37:22] frigate.object_processing WARNING : Unable to create jpg because frame 1673627840.211069 is not in the cache
2023-01-13 16:37:22.670448333 [2023-01-13 16:37:22] frigate.object_processing WARNING : Unable to create jpg because frame 1673627833.43635 is not in the cache
2023-01-13 16:37:22.670452757 [2023-01-13 16:37:22] frigate.object_processing WARNING : Unable to save snapshot for 1673626919.048279-5rhhay.
2023-01-13 16:37:22.670454511 [2023-01-13 16:37:22] frigate.object_processing WARNING : Unable to create clean png because frame 1673627833.43635 is not in the cache
2023-01-13 16:37:22.670456251 [2023-01-13 16:37:22] frigate.object_processing WARNING : Unable to save clean snapshot for 1673626919.048279-5rhhay.
2023-01-13 16:37:22.770799069 [2023-01-13 16:37:22] frigate.object_processing WARNING : Unable to create jpg because frame 1673627833.43635 is not in the cache
2023-01-13 16:37:33.158554890 [2023-01-13 16:37:33] ws4py INFO : Terminating websocket [Local => 127.0.0.1:5002 | Remote => 127.0.0.1:55316]
2023-01-13 16:37:33.389364832 [2023-01-13 16:37:33] ws4py INFO : Managing websocket [Local => 127.0.0.1:5002 | Remote => 127.0.0.1:34090]
I then removed the highres camera completely (before that I had removed detect, which was all over the logs) and it seems better, in that the other cameras don't seem to be losing segments. But if one camera is causing issues for the others, I'd at least expect some indication in the logs that something is going on. I can live with detection not working, but I'd like to be able to at least try to record that camera, and if there are frames causing issues I'd like them to affect only that one camera, not the others. Any suggestions? The only thing I can think of for now is another instance of frigate for the troublesome ones.
Edit: Ok, I tried that on the same hardware, but it drove the Coral inference on the first container from around 12ms to about 100ms. I guess I'm maybe running out of CPU on this machine.
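For what it's worth, a rough docker-compose sketch of that "second instance" idea (purely illustrative; the service names, paths and ports are made up, and it doesn't apply directly to the HassOS addon). Both containers still compete for the same CPU and the single USB Coral, which lines up with the inference-time jump described above:

services:
  frigate_main:
    image: ghcr.io/blakeblackshear/frigate:stable
    shm_size: "128mb"
    volumes:
      - ./main/config.yml:/config/config.yml      # cameras that behave
      - ./media/main:/media/frigate
    ports:
      - "5000:5000"
  frigate_problem:
    image: ghcr.io/blakeblackshear/frigate:stable
    shm_size: "128mb"
    volumes:
      - ./problem/config.yml:/config/config.yml   # the troublesome cameras only
      - ./media/problem:/media/frigate
    ports:
      - "5001:5000"                                # different host port for the second UI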
I tried overnight and it seemed fine without those few cameras. I added one of the Reolink 810s back and, even though I didn't see any log entries regarding it (there were some regarding another Tapo), I was missing data again. I have removed the Reolink again to see if this stabilizes things. Currently experimenting with the beta, but if I still struggle I will go back to see if it's better on v0.11, as I don't recall having these issues before.
Ok, I think this might all be related to high CPU usage that I noticed, which wasn't there in the past, with two things contributing.
Some time ago I started implementing face recognition on a couple of cameras, since I had moved frigate to more powerful hardware (6th gen i5 to 8th gen i7). That was fine at the time and didn't cause problems (I think 65% usage or so on power-saver settings), but looking at the CPU usage today, switching to 5MP detection on just one camera took me from the region of 70% to constant 100% usage (according to psutil). Here I noticed that only one core was being reported, and the same went for any other tool that checks the CPU, which is the other part of the issue. I had some issue with my Linux install and, following advice, I added apm=off and acpi=off. That resolved the problem I had (no clue what it was), but my system then started detecting only one core, pushing the usage higher. Removing these again got me to the 15-20% area on the performance setting. I'll re-add the cameras and check if this is resolved.
Yup, that seems to have been it. The missing segments were caused by high CPU load, so not much to do with frigate. It would be nice if there was a notification somewhere in frigate to inform the user that this might be happening, though, if possible.
I will have to do some testing to try and reproduce this. We already notify in the logs when recording segments are being discarded and any time the record process crashes. I'm not sure how segments could be missing without these error messages.
It's because Frigate silently discards valid recording segments when there are more than 5 segments in the cache!
It's the issue addressed in this PR: https://github.com/blakeblackshear/frigate/pull/3439, which unfortunately had to be reverted (see the comment): https://github.com/blakeblackshear/frigate/commit/3c46a33992470f8b5e418f297d5011a692dd66ed
Even if discarded segments were logged, the unpredictable frequency of when it happens means users who do not regularly inspect the logs won't be aware they have periods of lost video until they try to view the recordings with discarded segments. So there probably needs to be a robust way to ensure valid recording segments are never discarded as well as logging.
Can we re-open this issue?
I see. Now I remember why it's not possible to differentiate between when segments are discarded because they didn't have any motion or active objects and when they are discarded because the CPU is behind and unable to process the frame queue fast enough. I will have to see if I can think of a way. I think we would have to only report this when record mode is all.
I too am experiencing this, sometimes as much as 15 minutes at a time. I have 10 days of 24/7 recording retention and 15 on events.
I just looked back and it's actually worse: each hour is represented by about a 5 minute video. I am using mode: all. And correct me if I am wrong, but if frigate records in ~10 sec segments, each hour folder should contain ~360 (6/min) clips, no? I only have about 60-65 per hour. Most are sequential, but they either jump around or just cover a 4-6 min period.
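For reference, a config expressing that retention would look roughly like this (a sketch based on the numbers above, not the poster's actual file):

record:
  enabled: True
  retain:
    days: 10        # 24/7 recording retention in days
    mode: all
  events:
    retain:
      default: 15   # event retention in days

The arithmetic above is right for mode: all: with ~10 second segments that is 6 per minute, or roughly 360 files per hour folder.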
That's exactly what I was experiencing - what's your CPU usage like?
Total about 50% (per the system screen) with 5 HD cameras, and unless that's 50% of my entire CPU (which is insane and unlikely), that's only half of one of my 6 cores / 12 threads.
When I pull up htop, frigate is barely using anything; hell, my overall usage is barely anything.
My SHM is set to 256M for giggles, but the 64M was never a problem until 0.12.
My cache is also 1000MB if that makes any difference.
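For anyone comparing, those two values normally live in the container definition rather than in Frigate's config; a minimal docker-compose sketch using the numbers mentioned above (paths and image tag are illustrative):

services:
  frigate:
    image: ghcr.io/blakeblackshear/frigate:stable
    shm_size: "256mb"              # shared memory for raw decoded frames
    volumes:
      - type: tmpfs                # recording segments land here before being moved to disk
        target: /tmp/cache
        tmpfs:
          size: 1000000000         # ~1000MB cache
      - /path/to/storage:/media/frigate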
It's only partially about CPU usage. Each python process can only use a single CPU. The recording maintenance is a thread that competes for execution time with a bunch of other processes too. In future versions we are planning to rearchitect parts of frigate so the load is spread across more independent processes. Another variable could be disk speed. If it takes too long to move the current segments out of cache, then it isn't able to keep up.
What changed between 0.11 and 0.12 to cause this, though? Recordings were working just fine until I updated. I assume I don't need to change to an SSD (not at all desirable) to fix this?
Can you try turning on debug logging for the record process?
logger:
  logs:
    frigate.record: debug
You should see messages about the time it takes to copy each segment.
@miczlo I would recommend considering motion as your retain mode.

record:
  enabled: True
  expire_interval: 60
  retain:
    days: 14
    mode: motion

This should significantly reduce the amount of work the record process needs to perform and reduce the chance of missing segments. If Frigate detects a single frame within the 10 second segment that has even a small amount of change in pixel values, the entire 10 seconds will be stored. With all mode, you are storing a bunch of video files that are essentially just a static image.
@blakeblackshear I wasn't sure how much change constitutes motion and I'd rather have more than miss the important bit, but I might give it a go.
I have a suspicion that my drive is dying. I did some testing yesterday, part of which was deleting my recordings, and my file manager claimed it would take several days. So I started fresh and wiped the drive, changed which slot it's in on my backplane, and no change. With a "clean" drive, a fresh db and only three 10s recordings, it took about 90 secs to load the drive contents. (I'd say it's dead 😁) I am going to swap out drives after work and report back.
The 'good' news: after a short testing period it appears the issue IS the drive. The bad news: the replacement is about 1/4 the size (640GB vs 2TB) and was a 'rescue' from an old DVR, so it likely won't last much longer itself, lol. Thanks for the pointer @blakeblackshear
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Describe the problem you are having
I just noticed I have missing bits from my 24/7 recordings. It might record for 3 minutes and then 5 minutes are missing. This is on multiple cameras, and seemingly the same periods are missing (differences seem to be caused by different times set on the cameras):
12:03 -> 12:07
12:08 -> 12:12
12:13 -> 12:16
There are no events at that time. Many of the errors in the logs below are caused by office_tapo, but that camera is not even set to record anything. The connection between Frigate, the cameras and the NAS is hardwired. Only internal cameras are wireless (the Tapos included), but the problems seem to be with the other cameras.
I'm not sure if I have had this problem before. I spotted it with two RLC-810As, but looking at the others it seemed the data was there. I started troubleshooting and moved to v0.12, and that's the result for now.
I also noticed that the Front Porch Dahua didn't record anything at all after 12
Version
0.12.0-CF2466C
Frigate config file
Relevant log output
FFprobe output from your camera
Frigate stats
Operating system
HassOS
Install method
HassOS Addon
Coral version
USB
Network connection
Wired
Camera make and model
Reolink 810A, Annke NC-400, Dahua
Any other information that may be helpful
No response