blakeblackshear / frigate

NVR with realtime local object detection for IP cameras
https://frigate.video
MIT License

Save snapshot from different stream than detection stream. #2199

Open rsteckler opened 2 years ago

rsteckler commented 2 years ago

Describe what you are trying to accomplish and why in non-technical terms
I want to be able to run detection on a lower, more efficient stream, but then use a higher resolution stream for the image passed over MQTT and for the saved snapshot. This would let me stay efficient on detection, but get high-res images via MQTT to other apps like HA, DoubleTake, and custom scripts.

Describe the solution you'd like
A "snapshots" role in the camera section would allow separating the detection stream (which is currently used for snapshots) from the stream used to capture snapshots. Further, an "mqtt_snapshot" role could be specified for the images passed over MQTT on detection events.

Describe alternatives you've considered
Running the detection on the high-res stream, but that's inefficient with many cameras.

blakeblackshear commented 2 years ago

This will require decoding the high resolution stream. The only alternative would be to try to decode a single frame from the high resolution stream on demand. This would be possible, but there is no way to ensure the frame selected from the high resolution stream would match the low resolution stream. It might be a few seconds later.
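(For illustration, an on-demand grab would amount to spinning up a one-shot ffmpeg process against the camera's high resolution RTSP URL; the URL here is a placeholder.)

    # Decode until the first frame arrives, write a single JPEG, then exit.
    ffmpeg -rtsp_transport tcp -i "rtsp://user:pass@camera-ip/main" -frames:v 1 -q:v 2 snapshot.jpg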

rsteckler commented 2 years ago

That makes sense. Doing 24/7 decoding of all the high-res streams doesn't sound like a great idea.

Is it possible to use whatever method you're already using for motion recording pre-capture to start decoding the high resolution stream 5 seconds before the event, then grab the correct frame?

blakeblackshear commented 2 years ago

It's not. Motion detection requires decoding the video stream too.

rsteckler commented 2 years ago

Right - I get that motion detection requires decoding the stream. But my understanding is that Frigate can do motion detection on the low-res stream, then save a recording of the high-res stream when motion is detected, and that the recording can specify a "pre-recording" duration (which defaults to 5 seconds).

So it sounds like Frigate is already keeping a buffered stream of the high-res video so that it can "go back in time" to handle the pre-recording?

blakeblackshear commented 2 years ago

The best I could do is go back and grab a frame from recording segments that is approximately the same time as the low resolution frame. With different frame rates and different resolutions, it is almost guaranteed to be different from the original frame from the low resolution stream.

rsteckler commented 2 years ago

Makes sense. I'll go ahead and document my use case clearly, then please feel free to close as "decline to add". No hard feelings here as I don't have time to submit a PR. For those who come later:

Right now, I'm doing detection with Frigate on the high resolution stream because that's the stream used to store snapshots and publish to the MQTT bridge, and I want those snapshots to be high resolution both for face recognition and for seeing high-res images in email/Home Assistant. Detecting on multiple 4MP streams is expensive, so ideally I could detect on the lower-res substream but still get those high-res images for snapshots. This would obviously require Frigate to decode the low-res stream for detection AND the high-res stream for snapshots. That leads to two options:

  1. Always be decoding the high-res stream in case a snapshot is needed. This seems wasteful, but it's probably less expensive than doing actual detection on the high-res stream (although detection happens on the Coral and the decoding would happen on the CPU or QuickSync, so it's not apples-to-apples).
  2. Detect on the low-res stream, then start decoding the high-res stream to pull snapshots and latest.jpg images once an object is detected. There will be latency as ffmpeg starts decoding the high-res stream, and a different FPS could yield different frames than those where detection actually happened. This is probably OK for the facial recognition use case, because latest.jpg would be sent in high-res multiple times and the timing doesn't need to be perfect. It's less ideal for the "best frame" snapshot, which may not match the actual best frame.

The best workaround I've found is doing detection on the low-res stream in Frigate, then picking up the MQTT /events message with my own script. The script turns around and grabs a native snapshot from the camera (for Dahua cameras, this is http://user:pass@ip_address/cgi-bin/snapshot.cgi?1 ). The frame timing isn't perfect, but it turns around quickly enough to be a patch solution that works well enough for my use cases.
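Roughly, the script looks like this (a sketch using paho-mqtt and requests; the broker address, credentials, and output path are placeholders):

    import json

    import paho.mqtt.client as mqtt
    import requests
    from requests.auth import HTTPDigestAuth

    # Placeholders: Dahua-style snapshot endpoint and an output directory.
    SNAPSHOT_URL = "http://192.168.1.10/cgi-bin/snapshot.cgi?1"
    OUT_DIR = "/snapshots"

    def on_connect(client, userdata, flags, rc):
        # Frigate publishes object events as JSON on this topic.
        client.subscribe("frigate/events")

    def on_message(client, userdata, msg):
        event = json.loads(msg.payload)
        if event.get("type") != "new":
            return  # grab once per event, not on every update
        # Pull a full-resolution JPEG straight from the camera.
        resp = requests.get(SNAPSHOT_URL, auth=HTTPDigestAuth("user", "pass"), timeout=5)
        resp.raise_for_status()
        with open(f"{OUT_DIR}/{event['after']['id']}.jpg", "wb") as f:
            f.write(resp.content)

    client = mqtt.Client()  # paho-mqtt 1.x callback style
    client.on_connect = on_connect
    client.on_message = on_message
    client.connect("mqtt-broker.local", 1883)
    client.loop_forever()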

blakeblackshear commented 2 years ago

If this were to be implemented, it would be option 2. I would think of it as creating an endpoint that allows a single frame grab from the record stream. I can't guarantee that it would be the right frame, but it would be in the ballpark.

rsteckler commented 2 years ago

Closing to keep your issues list svelte

lesourcil commented 2 years ago

Was option 2 implemented?

NickM-27 commented 2 years ago

> Was option 2 implemented?

Not yet

messnerdev commented 2 years ago

I use a single 4K stream for detect and record, but I set my detect resolution down to 720p. In this case does Frigate have easy access to a decoded 4K frame for snapshot images? Or is resizing somehow done before decoding?

NickM-27 commented 2 years ago

> I use a single 4K stream for detect and record, but I set my detect resolution down to 720p. In this case does Frigate have easy access to a decoded 4K frame for snapshot images? Or is resizing somehow done before decoding?

It will resize the stream for the detection process, so both the detection logic and the snapshots will be done on the 720p stream.

bagobones commented 1 year ago

To add my two cents to this thread: I was disappointed with the snapshots, and one of the key things that bothers me is that on some cameras the aspect ratio isn't even the same as the primary feed. This causes some odd behaviours if you have Home Assistant widgets switching between the last still from the 4:3 LQ feed and then back to the live 16:9 HQ feed.

For bounding boxes and overlays I can totally see this as a major processing issue. However, I would really like a best-effort HQ clean option for several use cases where I think it would be the best fit and which aren't that sensitive to mismatched frames:

  1. HA camera object still frames. By default HA is always showing a delayed still, and the Frigate HACS widget also has a still-while-loading option. In both cases things get strange when switching aspect ratios.
  2. Doorbell snapshots. I used to grab the latest still from the camera on button press. Someone is ALWAYS reaching out to the camera in that case, and I would rather have the HQ version than the POTATO-quality detection stream.
  3. Snapshots for events already offer a clean and unclean option. Having a third HQ clean option would be nice for notifications. For cameras with wider detection areas it is unlikely the detected object will get out of frame if the delay is less than, say, a second and only a few frames off.

NickM-27 commented 1 year ago

Still a valid issue

ndbroadbent commented 1 year ago

This would be really nice. I set up the Double Take add-on and CompreFace to run facial recognition on my Frigate snapshot images. The quality is really poor and not good enough for this, so I would like to source the snapshots from the high quality stream.

> The best I could do is go back and grab a frame from recording segments that is approximately the same time as the low resolution frame. With different frame rates and different resolutions, it is almost guaranteed to be different from the original frame from the low resolution stream.

I think this could be solved by adding a calibration option for +/- x milliseconds. This should be pretty consistent, and not too hard to figure out manually.

eddyg commented 1 year ago

I searched (and found this issue) because I would also like to see snapshot.jpg come from the "record" stream (thumbnail.jpg could remain from the "detect" stream?), so consider this a +1.

> To add my two cents to this thread: I was disappointed with the snapshots, and one of the key things that bothers me is that on some cameras the aspect ratio isn't even the same as the primary feed. This causes some odd behaviours if you have Home Assistant widgets switching between the last still from the 4:3 LQ feed and then back to the live 16:9 HQ feed.

FWIW, I worked around the mismatched aspect ratio on my cameras by using ffmpeg to scale the low-res detection stream to 16:9. This also seemed to improve accuracy, since the video is no longer being vertically stretched. (If there's a better way to do this, I'd be happy to update my config!)

  driveway:
    ffmpeg:
      inputs:
        - path: rtsp://front-cam/cam/realmonitor?channel=1&subtype=1
          roles:
            - detect
            - rtmp
        - path: rtsp://front-cam/cam/realmonitor?channel=1&subtype=0
          roles:
            - record
      output_args:
        detect: '-vf scale=704:396 -f rawvideo -pix_fmt yuv420p'
        rtmp: '-aspect 704:396 -c copy -f flv'
    detect:
      width: 704
      height: 396
      fps: 7
      stationary:
        interval: 70

asaworld commented 1 year ago

I would like to see this too. The slight variation in images does not matter to me either. I would like to feed it into face and number-plate recognition. It would be nice to have an endpoint to grab a full-resolution snapshot from. At the moment my choices are RTMP (which is processed and scaled), the snapshot, or the camera directly. Am I missing any other options?

lordratner commented 1 year ago

Chiming in to add my support. High resolution snapshots are almost more important than the recordings, depending on the use case (automations, for example). Would definitely like to see this improved.

toddstar commented 1 year ago

Another +1 of support for saving the image from the higher-res feed.

As a workaround for anyone looking for this now whose camera doesn't have a still-image URL (so you can't use that method): my dirty workaround has been to create a duplicate camera that uses the main stream for detect but has detect turned off by default, and then create some automations so that when the original camera stream finds an object (all_count goes above 1) it turns detect on for the duplicate camera, and turns it back off when the count is 0. It does create some extra work for your hardware, but at least you get an image output that's big enough to get a facial match for Double Take.
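Roughly, the automation pair looks like this in Home Assistant YAML (a sketch using a person-count trigger; the entity ids are illustrative and depend on how the Frigate integration names your entities):

    # Illustrative entity ids; adjust to match your Frigate integration.
    - alias: Turn on detect for the high-res duplicate camera
      trigger:
        - platform: numeric_state
          entity_id: sensor.front_cam_person_count
          above: 0
      action:
        - service: switch.turn_on
          target:
            entity_id: switch.front_cam_hq_detect

    - alias: Turn detect back off when the low-res camera is clear
      trigger:
        - platform: numeric_state
          entity_id: sensor.front_cam_person_count
          below: 1
      action:
        - service: switch.turn_off
          target:
            entity_id: switch.front_cam_hq_detect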

NickM-27 commented 1 year ago

> Another +1 of support for saving the image from the higher-res feed.
>
> As a workaround for anyone looking for this now whose camera doesn't have a still-image URL (so you can't use that method): my dirty workaround has been to create a duplicate camera that uses the main stream for detect but has detect turned off by default, and then create some automations so that when the original camera stream finds an object (all_count goes above 1) it turns detect on for the duplicate camera, and turns it back off when the count is 0. It does create some extra work for your hardware, but at least you get an image output that's big enough to get a facial match for Double Take.

If you're going to do all of that work, you might as well just use a higher resolution stream for detect. Whatever resolution you set for

detect:
  width:
  height:

will resize the detect stream to that. So for example, on my doorbell camera I just use my main stream (2560x1920) and set detect to quarter size, which is plenty for Double Take while also not being too much work:

detect:
  width: 1280
  height: 960

toddstar commented 1 year ago

> Another +1 of support for saving the image from the higher-res feed. As a workaround for anyone looking for this now whose camera doesn't have a still-image URL (so you can't use that method): my dirty workaround has been to create a duplicate camera that uses the main stream for detect but has detect turned off by default, and then create some automations so that when the original camera stream finds an object (all_count goes above 1) it turns detect on for the duplicate camera, and turns it back off when the count is 0. It does create some extra work for your hardware, but at least you get an image output that's big enough to get a facial match for Double Take.
>
> If you're going to do all of that work, you might as well just use a higher resolution stream for detect. Whatever resolution you set for
>
>     detect:
>       width:
>       height:
>
> will resize the detect stream to that. So for example, on my doorbell camera I just use my main stream (2560x1920) and set detect to quarter size, which is plenty for Double Take while also not being too much work:
>
>     detect:
>       width: 1280
>       height: 960

Probably not the best example/wording, as the camera I've been testing on just has person object detection rather than a group of objects, but the duplicate-camera setup allows you to pick and choose which object type triggers reprocessing on the better stream to get a better image (I really shouldn't have used all_count, lol). So for example, a doorbell cam could process vehicles, bikes, animals, people, etc. at say 640x480, and the automation that turns on detection on the higher-res duplicate camera is only triggered by the person count going above 0.

Maybe I'm overthinking it and should just detect everything at a higher res, but I would have thought it's much easier on your hardware to detect on a really low-res image by default and then call for the higher res as and when required.

NickM-27 commented 1 year ago

> Maybe I'm overthinking it and should just detect everything at a higher res, but I would have thought it's much easier on your hardware to detect on a really low-res image by default and then call for the higher res as and when required.

In general, yes, it is more work than without, but it also just works without any complications, and it depends on your hardware. With Frigate 0.12 (currently in beta), using the hwaccel presets will have the GPU scale the stream instead of the CPU, so even with this setup my CPU use is only 3% (process usage for a single core) for each camera (it used to be 50% of a single core).

simondsmason commented 1 year ago

So is the preferred solution to perform detection on the main higher resolution stream and reduce the detect size? Does the reduction in detect size proportionally reduce the load on the system? Thanks

NickM-27 commented 1 year ago

> So is the preferred solution to perform detection on the main higher resolution stream and reduce the detect size? Does the reduction in detect size proportionally reduce the load on the system? Thanks

Yes, that would be the preferred solution. In 0.12, if you use the hwaccel presets, Frigate will use the GPU to do the downscaling, which further reduces the CPU impact.
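For example, a minimal sketch assuming an Intel iGPU (the VAAPI preset); other presets exist for other hardware, and the camera name and URL are placeholders:

    cameras:
      front_cam:
        ffmpeg:
          # 0.12 preset: decode and scale on the GPU via VAAPI instead of the CPU
          hwaccel_args: preset-vaapi
          inputs:
            - path: rtsp://user:pass@camera-ip/main
              roles:
                - detect
                - record
        detect:
          # downscaled from the main stream before detection
          width: 1280
          height: 960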

DrSpaldo commented 1 year ago

Looks like a fairly popular and wanted feature request. Hopefully the team gets a chance to come up with some type of solution for the situation. Even grabbing a still shot on demand would be a good start?

guim31 commented 1 year ago

+1 here, as I just installed DoubleTake and the snapshot resolution is really low. I would love a solution where the snapshots are bigger and better quality!

Thanks to every developer here for their work ;)

bagobones commented 1 year ago

If you are on beta/RC1 of 0.12, you can try just using the main feed for detect and see how well hardware acceleration deals with resizing. It may not be a good idea for 4K cameras, but for 1080p I am not seeing too much overhead on a 7th-gen i7 CPU.

infernix commented 1 year ago

Can someone clarify this for me, because it isn't that clear from the conversation.

Using a high-res stream (2688x1520) as your only stream, and setting detect to resize it to e.g. 1280x720, will send 720p frames to the detection process, and snapshots will also be in 720p. If this is the case, where does the higher-res stream get used (apart from record)?

In other words, what's the actual difference between one high-res stream that is used for record and detect with detect resizing it to 720p versus using a main high-res stream off the camera for record and a 720p substream off the camera for the detect stream?

Because at this stage it seems there's no actual difference apart from the resizing overhead, given that the detect stream is used for snapshots as well?

NickM-27 commented 1 year ago

> Because at this stage it seems there's no actual difference apart from the resizing overhead, given that the detect stream is used for snapshots as well?

The difference is that the vast majority of users have cameras that only offer a high-res main stream and a 640x480 substream. So the only way to have detect use a better resolution is the approach described above.

thewan056 commented 1 year ago

I would also like to +1 the above suggestion, I believe it is a good compromise.

One use case I have encountered is not mentioned here. I would also like higher-res, higher-quality snapshots, but to continue running detect on a lower-res, lower-quality stream. My cameras use H.265/HEVC for the main stream and H.264 for the substream. I am using an old repurposed PC with no hardware-accelerated H.265/HEVC decoding, although it is powerful enough to do it on the CPU. It would be nice if I could keep using this setup a while longer, reducing e-waste in the process, before I consider an upgrade with hardware-accelerated decoding for newer codecs.

Assuming you only need to start CPU decoding when the event starts and can stop once the snapshot grab is complete, it might be more power-efficient to do that than to have a high-res H.265/HEVC detect stream running on the CPU all the time, using a lot of power and generating a lot of heat.

edit: Yes, I could change the main stream to H.264, but I have found that keeping H.265/HEVC helps reduce my storage and bandwidth use, so I would like to keep using it for my record stream.

Someguitarist commented 1 year ago

For what it's worth, plus one more. I've read through the conversation here, and it's worth mentioning that it's okay if it's a few seconds off, just so long as it's a clear enough image!

Jensilein commented 1 year ago

I would also very much appreciate having high-res snapshots available. That would be really great and another big improvement to the already very good Frigate.

NickM-27 commented 1 year ago

To be clear, you can already use go2rtc to retrieve a high res snapshot. It just won't be saved as part of the event

CoMPaTech commented 1 year ago

> To be clear, you can already use go2rtc to retrieve a high res snapshot. It just won't be saved as part of the event

Sounds like something to tinker with, to publish it alongside on a different MQTT topic? Or would that be DIY-stretching it?

NickM-27 commented 1 year ago

I don't see the reason for that versus just having a user use the go2rtc api directly

CoMPaTech commented 1 year ago

> I don't see the reason for that versus just having a user use the go2rtc api directly

I would (ab)use that for Double Take or HA or similar situations

NickM-27 commented 1 year ago

You can already configure doubletake to pull from a different url and use this

DrSpaldo commented 1 year ago

> I don't see the reason for that versus just having a user use the go2rtc api directly

So, would you recommend something along the lines of: setting up an automation in Home Assistant that triggers grabbing the screenshot from the go2rtc instance running within Frigate?
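Something like this, perhaps (a sketch using HA's downloader integration, which has to be configured separately; the URL, topic handling, and camera name are placeholders):

    - alias: Save a high-res go2rtc still on new Frigate events
      trigger:
        - platform: mqtt
          topic: frigate/events
      condition:
        - condition: template
          value_template: "{{ trigger.payload_json['type'] == 'new' }}"
      action:
        - service: downloader.download_file
          data:
            url: "http://frigate-ip:1984/api/frame.jpeg?src=front_cam"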

Jensilein commented 1 year ago

> To be clear, you can already use go2rtc to retrieve a high res snapshot. It just won't be saved as part of the event

How can I best do that? Is there a description I could use? So far I have struggled to get go2rtc working with my Reolink RLC-810A cameras. I have used the Frigate documentation but was not yet able to make it work. Btw, my cameras can take high-res snapshots via HTTP; in a browser it works perfectly. My assumption was that it should also be possible to use this in Frigate, or am I wrong? Thanks a lot.

NickM-27 commented 1 year ago

> My assumption was that it should also be possible to use this in Frigate, or am I wrong? Thanks a lot.

No, otherwise this issue would be closed. If your camera offers a high res snapshot then there's no reason to use go2rtc for that

Someguitarist commented 1 year ago

Thanks for all your quick responses, Nick! Just FYI, and this might be a 'me' problem or something I have to work out, but I tried to use go2rtc and it appeared to be way too resource-intensive for my little i3 NUC. If I enable go2rtc it ends up locking up with 100% CPU usage after an hour or so.

If you don't mind me asking, is go2rtc particularly heavy, or is there something in my configuration I need to work out? I don't want to derail the discussion here, but I'm trying it in order to get a hi-res snapshot for DoubleTake, so it felt appropriate to the topic.

NickM-27 commented 1 year ago

> Thanks for all your quick responses, Nick! Just FYI, and this might be a 'me' problem or something I have to work out, but I tried to use go2rtc and it appeared to be way too resource-intensive for my little i3 NUC. If I enable go2rtc it ends up locking up with 100% CPU usage after an hour or so.
>
> If you don't mind me asking, is go2rtc particularly heavy, or is there something in my configuration I need to work out? I don't want to derail the discussion here, but I'm trying it in order to get a hi-res snapshot for DoubleTake, so it felt appropriate to the topic.

The go2rtc version bundled in Frigate 0.12 is known to have issues with some types of cameras. https://docs.frigate.video/configuration/advanced#custom-go2rtc-version can be used to download the most recent version, which has not had any issues for me. It also depends on your config, which can include things that increase the load for no reason.

Someguitarist commented 1 year ago

Okay, I'll give that a shot when I get home. Once I get that running without issue, how do I point DoubleTake to use go2rtc instead of Frigate? Really, my only goal is to get a single hi-res MQTT image or snapshot out of it!

Thanks again for your help!

dopeytree commented 12 months ago

Did anyone get a solution working? Even just a simple manual button to take a snapshot from the high-res recorded video would be a nice and useful feature. But I'm sure it can be automated...

NickM-27 commented 12 months ago

There are multiple solutions discussed above. I currently just use go2rtc to pull a snapshot from the camera's ONVIF snapshot endpoint.

Someguitarist commented 12 months ago

I just wanted to share that none of the solutions provided above worked for me. Using go2rtc causes 100% CPU usage and a server crash in about 2-3 minutes. It may be something particular to my setup, as it's an old 7th-gen i3 processor, but everything in Frigate works great until I enable go2rtc. I've tried the latest dev version of Frigate and a different version of go2rtc inside the container, with the same results.

It's unfortunate, but I haven't been able to find a way to get a hi res snapshot with any of the solutions above. That being said though, I do want to stress that Frigate is working great for everything else! Thanks for the software!

NickM-27 commented 12 months ago

> I just wanted to share that none of the solutions provided above worked for me. Using go2rtc causes 100% CPU usage and a server crash in about 2-3 minutes. It may be something particular to my setup, as it's an old 7th-gen i3 processor, but everything in Frigate works great until I enable go2rtc. I've tried the latest dev version of Frigate and a different version of go2rtc inside the container, with the same results.

Sounds like something is set up incorrectly. Especially if you use the ONVIF integration in go2rtc to pull a snapshot, the CPU usage is minimal because there's no transcoding or anything.

DrSpaldo commented 12 months ago

What command do you use to grab it?

I had a look at this page from the go2rtc wiki - https://github.com/AlexxIT/go2rtc/wiki/Snapshot-to-Telegram but it doesn't look like it works for Frigate using those details

NickM-27 commented 12 months ago

http://frigate_ip:1984/api/frame.jpeg?src=front_cam_snapshot

And the onvif stream was just retrieved from the go2rtc dashboard and added as a separate stream:

    front_cam_snapshot:
      - onvif://{FRIGATE_RTSP_USER}:{FRIGATE_FRONT_PW}@IP?subtype=MediaProfile00000&snapshot
DrSpaldo commented 12 months ago

> http://frigate_ip:1984/api/frame.jpeg?src=front_cam_snapshot
>
> And the onvif stream was just retrieved from the go2rtc dashboard and added as a separate stream:
>
>     front_cam_snapshot:
>       - onvif://{FRIGATE_RTSP_USER}:{FRIGATE_FRONT_PW}@IP?subtype=MediaProfile00000&snapshot

Thanks @NickM-27. Turns out mine wasn't working because I didn't have port 1984 set in my docker compose. Once I added it, it started working. The _snapshot bit didn't work for me, but this link did:

http://192.168.1.22:1984/api/frame.jpeg?src=frontdoor