blakeblackshear / frigate

NVR with realtime local object detection for IP cameras
https://frigate.video
MIT License

[Support]: OpenVINO Does not work with YOLOX model #5184

Closed · aeozyalcin closed 1 year ago

aeozyalcin commented 1 year ago

Describe the problem you are having

Awesome job getting OpenVINO into Frigate @NateMeyer! I just got a thin client with a J5005, and OpenVINO support was perfectly timed!

I tried using the yolox-tiny model with OpenVINO, and got the following error:

2023-01-21 20:32:53.946348978  [2023-01-21 12:32:53] detector.ov                    INFO    : Starting detection process: 376
2023-01-21 20:32:55.017048607  [2023-01-21 12:32:55] frigate.detectors.plugins.openvino INFO    : Model Input Shape: {1, 3, 416, 416}
2023-01-21 20:32:55.017056288  [2023-01-21 12:32:55] frigate.detectors.plugins.openvino INFO    : Model Output-0 Shape: {1, 3549, 85}
2023-01-21 20:32:55.017058803  [2023-01-21 12:32:55] frigate.detectors.plugins.openvino INFO    : Model has 1 Output Tensors
2023-01-21 20:32:55.019078340  Process detector:ov:
2023-01-21 20:32:55.042640498  Traceback (most recent call last):
2023-01-21 20:32:55.042646939    File "/usr/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
2023-01-21 20:32:55.042648687      self.run()
2023-01-21 20:32:55.042650805    File "/usr/lib/python3.9/multiprocessing/process.py", line 108, in run
2023-01-21 20:32:55.042652584      self._target(*self._args, **self._kwargs)
2023-01-21 20:32:55.042654630    File "/opt/frigate/frigate/object_detection.py", line 120, in run_detector
2023-01-21 20:32:55.042656489      detections = object_detector.detect_raw(input_frame)
2023-01-21 20:32:55.042658500    File "/opt/frigate/frigate/object_detection.py", line 71, in detect_raw
2023-01-21 20:32:55.042660363      return self.detect_api.detect_raw(tensor_input=tensor_input)
2023-01-21 20:32:55.042662445    File "/opt/frigate/frigate/detectors/plugins/openvino.py", line 45, in detect_raw
2023-01-21 20:32:55.042664053      infer_request.infer([tensor_input])
2023-01-21 20:32:55.042666257    File "/usr/local/lib/python3.9/dist-packages/openvino/runtime/ie_api.py", line 151, in infer
2023-01-21 20:32:55.042667807      return super().infer(
2023-01-21 20:32:55.042670418  RuntimeError: Can't set input blob with name: images, because model input (shape={1,3,416,416}) and blob (shape=(1.416.416.3)) are incompatible

Looks like there is an incompatibility between how Frigate sends frames to the YOLO model and what the model expects as its input. When I use the included ssdlite_mobilenet_v2 model, object detection works as expected.

P.S.: I also tried Intel's own person-detection-0201 model. Frigate ran without errors, but no detections were showing up. I don't think that's related, but just a data point.

Version

0.12.0-12D51D3

Frigate config file

mqtt:
  enabled: False

detectors:
  ov:
    type: openvino
    device: GPU
    model:
      path: /media/frigate/openvino-model/yolox-tiny.xml   

model:
  width: 416
  height: 416
  input_tensor: nhwc
  input_pixel_format: bgr
  labelmap_path: /media/frigate/openvino-model/coco_80cl.txt 

cameras:
  front_driveway:
    objects:
      track:
        - person
    ffmpeg:
      inputs:
        - path: http://192.168.0.241/flv?port=1935&app=bcs&stream=channel0_ext.bcs&user=admin&password=password
          roles:
            - detect
            - restream
          hwaccel_args: preset-intel-qsv-h264
          input_args:
            - -strict
            - experimental
            - -analyzeduration
            - 1000M
            - -probesize
            - 1000M
            - -rw_timeout
            - "5000000"

    detect:
      enabled: True
      width: 896
      height: 512
      fps: 10

    zones:

      lower_driveway:
        coordinates: 480,191,486,81,320,92,191,136,185,178,0,242,0,377
        objects:
          - person
        filters:
          person:
            min_area: 1000
            max_area: 10000
      middle_driveway:
        coordinates: 83,400,704,326,784,228,798,92,478,69,474,182,0,372
        objects:
          - person
        filters:
          person:
            min_area: 2000
            max_area: 20000          
      upper_driveway:
        coordinates: 896,512,896,123,783,89,776,227,703,326,83,400,0,372,0,512
        objects:
          - person
        filters:
          person:
            min_area: 4500
            max_area: 100000   

    motion:
      mask:
        - 201,113,398,36,896,38,896,36,896,0,0,0,0,165
      improve_contrast: True
      contour_area: 25

    snapshots:
      # Optional: Enable writing jpg snapshot to /media/frigate/clips (default: shown below)
      # This value can be set via MQTT and will be updated in startup based on retained value
      enabled: True
      # Optional: save a clean PNG copy of the snapshot image (default: shown below)
      clean_copy: True
      # Optional: print a timestamp on the snapshots (default: shown below)
      timestamp: False
      # Optional: draw bounding box on the snapshots (default: shown below)
      bounding_box: True
      # Optional: crop the snapshot (default: shown below)
      crop: False
      required_zones:
        - upper_driveway
        - middle_driveway
        - lower_driveway
      # Optional: height to resize the snapshot to (default: original size)
      # height: 175
      # Optional: Restrict snapshots to objects that entered any of the listed zones (default: no required zones)
        # - driveway_entrance
      # Optional: Camera override for retention settings (default: global values)
      retain:
        # Required: Default retention days (default: shown below)
        default: 2
        # objects:
        #   car: 7
    record:
      # Optional: Enable recording (default: global setting)
      enabled: True
      # Optional: Number of days to retain (default: global setting)
      retain:
        days: 1
      events:
        objects:
          - person
        required_zones:
          - upper_driveway
          - middle_driveway
          - lower_driveway
        retain:
          default: 2

Relevant log output

2023-01-21 20:32:53.946348978  [2023-01-21 12:32:53] detector.ov                    INFO    : Starting detection process: 376
2023-01-21 20:32:55.017048607  [2023-01-21 12:32:55] frigate.detectors.plugins.openvino INFO    : Model Input Shape: {1, 3, 416, 416}
2023-01-21 20:32:55.017056288  [2023-01-21 12:32:55] frigate.detectors.plugins.openvino INFO    : Model Output-0 Shape: {1, 3549, 85}
2023-01-21 20:32:55.017058803  [2023-01-21 12:32:55] frigate.detectors.plugins.openvino INFO    : Model has 1 Output Tensors
2023-01-21 20:32:55.019078340  Process detector:ov:
2023-01-21 20:32:55.042640498  Traceback (most recent call last):
2023-01-21 20:32:55.042646939    File "/usr/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
2023-01-21 20:32:55.042648687      self.run()
2023-01-21 20:32:55.042650805    File "/usr/lib/python3.9/multiprocessing/process.py", line 108, in run
2023-01-21 20:32:55.042652584      self._target(*self._args, **self._kwargs)
2023-01-21 20:32:55.042654630    File "/opt/frigate/frigate/object_detection.py", line 120, in run_detector
2023-01-21 20:32:55.042656489      detections = object_detector.detect_raw(input_frame)
2023-01-21 20:32:55.042658500    File "/opt/frigate/frigate/object_detection.py", line 71, in detect_raw
2023-01-21 20:32:55.042660363      return self.detect_api.detect_raw(tensor_input=tensor_input)
2023-01-21 20:32:55.042662445    File "/opt/frigate/frigate/detectors/plugins/openvino.py", line 45, in detect_raw
2023-01-21 20:32:55.042664053      infer_request.infer([tensor_input])
2023-01-21 20:32:55.042666257    File "/usr/local/lib/python3.9/dist-packages/openvino/runtime/ie_api.py", line 151, in infer
2023-01-21 20:32:55.042667807      return super().infer(
2023-01-21 20:32:55.042670418  RuntimeError: Can't set input blob with name: images, because model input (shape={1,3,416,416}) and blob (shape=(1.416.416.3)) are incompatible

FFprobe output from your camera

"[\n  {\n    \"return_code\": 0,\n    \"stderr\": {},\n    \"stdout\": {\n      \"programs\": [],\n      \"streams\": [\n        {\n          \"avg_frame_rate\": \"19/1\",\n          \"codec_long_name\": \"H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10\",\n          \"height\": 512,\n          \"width\": 896\n        },\n        {\n          \"avg_frame_rate\": \"0/0\",\n          \"codec_long_name\": \"AAC (Advanced Audio Coding)\"\n        }\n      ]\n    }\n  }\n]"

Frigate stats

No response

Operating system

Other Linux

Install method

HassOS Addon

Coral version

Other

Network connection

Wired

Camera make and model

Reolink RLC-810A

Any other information that may be helpful

No response

NateMeyer commented 1 year ago

The model input can be fixed by setting input_tensor: nchw, but the OpenVINO detector expects the output tensor to be in SSD format and will error out on the YOLOX output shape.
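
In the config above, that change would look something like this (a minimal sketch; only input_tensor differs from the original):

model:
  width: 416
  height: 416
  input_tensor: nchw
  input_pixel_format: bgr
  labelmap_path: /media/frigate/openvino-model/coco_80cl.txt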

aeozyalcin commented 1 year ago

Awesome, thank you for the reply! Are there any plans to add support for formats other than SSD?

Side question, any idea why person-detection-0201 would not work, since that's also an SSD based model?

NateMeyer commented 1 year ago

I noticed the person-detect model always has a label of 0 for person. Try passing in a new label map with just one line:

person

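Wiring that up in the config might look like this (a sketch; the person_labels.txt filename is hypothetical — any path readable by Frigate works):

model:
  labelmap_path: /media/frigate/openvino-model/person_labels.txt
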
aeozyalcin commented 1 year ago

OK, I have some updates. It turns out that the person-detection-0201 model also requires nchw format. Applying this change made that model work!

As for yolox, your suggestion to change to nchw got us further. As expected, the detection process now fails due to the YOLO output format not matching the expected SSD output format.

This is now the error:

2023-01-22 17:54:46.406977407  [2023-01-22 09:54:46] frigate.detectors.plugins.openvino INFO    : Model Input Shape: {1, 3, 416, 416}
2023-01-22 17:54:46.407103750  [2023-01-22 09:54:46] frigate.detectors.plugins.openvino INFO    : Model Output-0 Shape: {1, 3549, 85}
2023-01-22 17:54:46.407195458  [2023-01-22 09:54:46] frigate.detectors.plugins.openvino INFO    : Model has 1 Output Tensors
2023-01-22 17:54:46.467231402  Process detector:ov:
2023-01-22 17:54:46.468384337  Traceback (most recent call last):
2023-01-22 17:54:46.468405707    File "/usr/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
2023-01-22 17:54:46.468407912      self.run()
2023-01-22 17:54:46.468409626    File "/usr/lib/python3.9/multiprocessing/process.py", line 108, in run
2023-01-22 17:54:46.468411126      self._target(*self._args, **self._kwargs)
2023-01-22 17:54:46.468412847    File "/opt/frigate/frigate/object_detection.py", line 120, in run_detector
2023-01-22 17:54:46.468414375      detections = object_detector.detect_raw(input_frame)
2023-01-22 17:54:46.468428917    File "/opt/frigate/frigate/object_detection.py", line 71, in detect_raw
2023-01-22 17:54:46.468431362      return self.detect_api.detect_raw(tensor_input=tensor_input)
2023-01-22 17:54:46.468433102    File "/opt/frigate/frigate/detectors/plugins/openvino.py", line 52, in detect_raw
2023-01-22 17:54:46.468434371      if object_detected[0] != -1:
2023-01-22 17:54:46.468449474  IndexError: invalid index to scalar variable.
2023-01-22 17:54:47.976514363  [2023-01-22 09:54:47] frigate.object_processing      INFO    : Exiting object processor...

Are there any plans to add the ability to pass a model type to the OV detector in Frigate, similar to how the OpenVINO object_detection.py demo does?

-at, --architecture_type  Required. Specify the model's architecture type. Valid values are {centernet,detr,ctpn,faceboxes,nanodet,nanodet-plus,retinaface,retinaface-pytorch,ssd,ultra_lightweight_face_detection,yolo,yolov4,yolof,yolox,yolov3-onnx}.

NateMeyer commented 1 year ago

I'm glad to hear you got the person-detect model working.

I have no plans right now to work on adding other model types to the detector, but that is certainly something that could be done.

aeozyalcin commented 1 year ago

I modified openvino.py in the plugins folder to accommodate a YOLO model, but I am unfortunately a noob when it comes to building/deploying Docker containers. I committed my changes to the Frigate image and saved it as a local Docker image, but was unable to run it. I get this error and would love some pointers:

sudo docker run -d   --name frigate-yolo   --restart=unless-stopped   --mount type=tmpfs,target=/tmp/cache,tmpfs-size=1000000000   --device /dev/dri/renderD128   --shm-size=64m   -v /usr/share/hassio/media:/media   -v /usr/share/hassio/homeassistant:/config   -v /etc/localtime:/etc/localtime:ro   -e CONFIG_FILE='/config/frigate.yml'   -p 5000:5000   -p 8554:8554   -p 8555:8555/tcp   -p 8555:8555/udp frigate_yolo
8b0f60aa2621ca6c86f3901c0c44b5263ee33158f4241a78bb8dfb84ea15bc9b
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: unable to setup user: stat /dev/null: no such file or directory: unknown.

alez007 commented 1 year ago

@aeozyalcin mind sharing your config for person-detection-0201 ? I can't get it to work myself

NateMeyer commented 1 year ago

@aeozyalcin whenever I got an error like that, it was because I already had a frigate container that wasn't removed.

The docker image is fairly painless to build. Fork/clone the dev branch of the repo and run make local to generate a frigate:latest image you can use on your machine.
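
In other words, roughly this (assuming a working Docker and make setup):

git clone -b dev https://github.com/blakeblackshear/frigate.git
cd frigate
make local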

aeozyalcin commented 1 year ago

I have successfully implemented the yolox_tiny model! It was a bit of a PIA but it works haha. I would love to make a PR to check this in, but have never done so on Github before. Any tips appreciated.

@alez007 like Nate said, all I had to do was to change the input_tensor type to nchw. You will also need to change the image size to 384x384. It's 300x300 by default for the included ssd model.
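
A minimal sketch of the full person-detection-0201 setup (the .xml and labelmap paths are examples — point them at wherever your converted model and the one-line label file live):

detectors:
  ov:
    type: openvino
    device: GPU
    model:
      path: /media/frigate/openvino-model/person-detection-0201.xml

model:
  width: 384
  height: 384
  input_tensor: nchw
  input_pixel_format: bgr
  labelmap_path: /media/frigate/openvino-model/person_labels.txt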

Caros2017 commented 1 year ago

I have successfully implemented the yolox_tiny model! It was a bit of a PIA but it works haha. I would love to make a PR to check this in, but have never done so on Github before. Any tips appreciated.

@alez007 like Nate said, all I had to do was to change the input_tensor type to nchw. You will also need to change the image size to 384x384. It's 300x300 by default for the included ssd model.

Can you share your experience of the 'default' ssdlite_mobilenet_v2 model versus person-detection-0201? If you only use person detection, is that model better? And especially in dark environments with IR enabled?

aeozyalcin commented 1 year ago

I have successfully implemented the yolox_tiny model! It was a bit of a PIA but it works haha. I would love to make a PR to check this in, but have never done so on Github before. Any tips appreciated.

@alez007 like Nate said, all I had to do was to change the input_tensor type to nchw. You will also need to change the image size to 384x384. It's 300x300 by default for the included ssd model.

Can you share your experience of the 'default' ssdlite_mobilenet_v2 model versus person-detection-0201? If you only use person detection, is that model better? And especially in dark environments with IR enabled?

I am actually positively surprised at how well the ssdlite_mobilenet_v2 model is performing with OpenVINO. I am running it on the Intel iGPU, and coming from a Pi+Coral setup, it definitely performs better in terms of accuracy.

Regarding the person-detection-0201 model, I briefly ran it and found that it was a bit 'trigger happy'. It was falsely picking up non-person subjects. I'm sure it could be tuned out, but I didn't bother.

I'm currently running YOLOx-tiny inside Frigate as of last night. I'll report back on how it performs. It should be better than ssdlite_mobilenet_v2 in theory based on the mAP.

aeozyalcin commented 1 year ago

For those interested, I am working on a PR to integrate yolox support into Frigate. If anyone wants to try it, give it a go. It's 99% there. https://github.com/aeozyalcin/frigate/tree/openvino_yolox

aeozyalcin commented 1 year ago

https://github.com/blakeblackshear/frigate/pull/5285

Caros2017 commented 1 year ago

For those interested, I am working on a PR to integrate yolox support into Frigate. If anyone wants to try it, give it a go. It's 99% there. https://github.com/aeozyalcin/frigate/tree/openvino_yolox

Wow, really nice. Appreciate the work. Will test it once merged and released in a beta. Also wondering what your experience is between the models :) But that will of course take some time.

nickp27 commented 1 year ago

Any guide/details on what omz_downloader config you used for processing the model? I'm keen to use it to solve/test some of the false positives I was having with ssdlite.

aeozyalcin commented 1 year ago

https://github.com/blakeblackshear/frigate/pull/5285 All merged and good to go. Closing the issue. Now I am taking a crack at adding support for YOLOv5, but no promises.

aeozyalcin commented 1 year ago

Any guide/details on what omz_downloader config you used for processing the model? I'm keen to use it to solve/test some of the false positives I was having with ssdlite.

Try these:

omz_downloader --name yolox-tiny
omz_converter --name yolox-tiny
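
The converter drops the IR under public/yolox-tiny/<precision>/ in the working directory, so the detector config would then point at something like this (device and base path are examples):

detectors:
  ov:
    type: openvino
    device: GPU
    model:
      path: /media/frigate/openvino-model/public/yolox-tiny/FP16/yolox-tiny.xml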

Caros2017 commented 1 year ago

Very nice work @aeozyalcin. Looking forward to the new beta to try this.

How is your experience so far between yolox_tiny and ssdlite_mobilenet_v2?

aeozyalcin commented 1 year ago

Very nice work @aeozyalcin. Looking forward to the new beta to try this.

How is your experience so far between yolox_tiny and ssdlite_mobilenet_v2?

I am liking YOLOx so far! Because the model input is 416*416, it's able to detect smaller objects. But of course that means that it can also have more false positives depending on the scene.

I have an issue open on the YOLOx repo to see if anyone will help me with transfer learning/fine tuning the model to make it even more effective in my application.

nickp27 commented 1 year ago

Very nice work @aeozyalcin. Looking forward to the new beta to try this. How is your experience so far between yolox_tiny and ssdlite_mobilenet_v2?

I am liking YOLOx so far! Because the model input is 416*416, it's able to detect smaller objects. But of course that means that it can also have more false positives depending on the scene.

I have an issue open on the YOLOx repo to see if anyone will help me with transfer learning/fine tuning the model to make it even more effective in my application.

Doesn't the bigger input size make it harder to detect objects? I have found YoloX to be more reliable at picking up my dog, but it's been running for 2 days and hasn't picked up a person once (on my 720x576 substream, my guess had been that the 416x416 input, rather than mobilenet's 300x300, was the cause).

Caros2017 commented 1 year ago

Very nice work @aeozyalcin. Looking forward to the new beta to try this. How is your experience so far between yolox_tiny and ssdlite_mobilenet_v2?

I am liking YOLOx so far! Because the model input is 416*416, it's able to detect smaller objects. But of course that means that it can also have more false positives depending on the scene. I have an issue open on the YOLOx repo to see if anyone will help me with transfer learning/fine tuning the model to make it even more effective in my application.

Doesn't the bigger input size make it harder to detect objects? I have found YoloX to be more reliable at picking up my dog, but it's been running for 2 days and hasn't picked up a person once (on my 720x576 substream, my guess had been that the 416x416 input, rather than mobilenet's 300x300, was the cause).

I think 416x416 is pretty large compared to the whole area of your substream. Have you tried comparing with a higher resolution in the substream, or, for the test, the main stream?

aeozyalcin commented 1 year ago

Exactly. I am running a 1080p stream through the model. I found that Frigate will not "zoom in" on the section of motion and upscale it to the model input dimensions. Instead, the minimum region it will send is the model input dimensions. For example, for a 720p stream with this 416x416 model, each region sent to the model would be roughly 1/3 of the whole frame.
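
A quick sanity check of that ratio, as plain arithmetic (illustrative only; assumes a 1280x720 detect stream):

# The minimum region sent to the detector is the model input size (416x416 here).
frame_w, frame_h = 1280, 720
region = 416
print(region / frame_w)                       # ~0.33 -> about 1/3 of the frame width
print(region * region / (frame_w * frame_h))  # ~0.19 of the total frame area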

alez007 commented 1 year ago

First of all, thanks for the PR @aeozyalcin , good stuff 🦾

I've been running yolox_tiny for a couple of days now and comparing it with ssdlite (default one). Basically my setup is detect at 1080p 4/3 in both scenarios and what I noticed is that yolox_tiny is returning smaller percentages than ssdlite for objects in the same positions with pretty much the same luminosity and setup.

Can provide examples but in general if ssdlite thinks there's 90% chance to have a person, yolox_tiny is around 7-8% less than that plus it gives me more false positives.

Will continue testing but wondering if there is a logical explanation for these smaller detect percentages, taking into account that the model is bigger for yolox but the detect resolution remained the same in both scenarios. Would it be worth it to increase the detect resolution ?

alez007 commented 1 year ago

Any guide/details on what omz_downloader config you used for processing the model? I'm keen to use it to solve/test some of the false positives I was having with ssdlite.

Try these:

omz_downloader --name yolox-tiny
omz_converter --name yolox-tiny

@nickp27 a few more packages needed to be installed for me (slackware 15) for omz_converter to work, one of them being torch; just pip install all of them and it will work.

NickM-27 commented 1 year ago

First of all, thanks for the PR @aeozyalcin , good stuff 🦾

I've been running yolox_tiny for a couple of days now and comparing it with ssdlite (default one). Basically my setup is detect at 1080p 4/3 in both scenarios and what I noticed is that yolox_tiny is returning smaller percentages than ssdlite for objects in the same positions with pretty much the same luminosity and setup.

Can provide examples but in general if ssdlite thinks there's 90% chance to have a person, yolox_tiny is around 7-8% less than that plus it gives me more false positives.

Will continue testing but wondering if there is a logical explanation for these smaller detect percentages, taking into account that the model is bigger for yolox but the detect resolution remained the same in both scenarios. Would it be worth it to increase the detect resolution ?

The scores depend on how the model is trained, the weights, etc.; I don't think the actual size of the image sent to the model affects that.

For ANY different model you will want to adjust the threshold and min_score values, as Frigate has those tuned just for the default Coral model. That will greatly help in working with the new model to get fewer false positives.
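
For example (the numbers here are illustrative starting points, not tuned recommendations):

objects:
  filters:
    person:
      min_score: 0.6
      threshold: 0.75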

aeozyalcin commented 1 year ago

Any guide/details on what omz_downloader config you used for processing the model? I'm keen to use it to solve/test some of the false positives I was having with ssdlite.

Try these:

omz_downloader --name yolox-tiny
omz_converter --name yolox-tiny

@nickp27 a few more packages needed to be installed for me (slackware 15) for omz_converter to work, one of them being torch; just pip install all of them and it will work.

Installing openvino-dev should also install the omz* tools.

First of all, thanks for the PR @aeozyalcin , good stuff 🦾

I've been running yolox_tiny for a couple of days now and comparing it with ssdlite (default one). Basically my setup is detect at 1080p 4/3 in both scenarios and what I noticed is that yolox_tiny is returning smaller percentages than ssdlite for objects in the same positions with pretty much the same luminosity and setup.

Can provide examples but in general if ssdlite thinks there's 90% chance to have a person, yolox_tiny is around 7-8% less than that plus it gives me more false positives.

Will continue testing but wondering if there is a logical explanation for these smaller detect percentages, taking into account that the model is bigger for yolox but the detect resolution remained the same in both scenarios. Would it be worth it to increase the detect resolution ?

If you would like to share your config, I can take a look at it. I want to see the config of the camera in question, and how you have the openvino detector defined in your yaml.

alez007 commented 1 year ago

Any guide/details on what omz_downloader config you used for processing the model? I'm keen to use it to solve/test some of the false positives I was having with ssdlite.

Try these:

omz_downloader --name yolox-tiny
omz_converter --name yolox-tiny

@nickp27 a few more packages needed to be installed for me (slackware 15) for omz_converter to work, one of them being torch; just pip install all of them and it will work.

Installing openvino-dev should also install the omz* tools.

First of all, thanks for the PR @aeozyalcin , good stuff 🦾

I've been running yolox_tiny for a couple of days now and comparing it with ssdlite (default one). Basically my setup is detect at 1080p 4/3 in both scenarios and what I noticed is that yolox_tiny is returning smaller percentages than ssdlite for objects in the same positions with pretty much the same luminosity and setup.

Can provide examples but in general if ssdlite thinks there's 90% chance to have a person, yolox_tiny is around 7-8% less than that plus it gives me more false positives.

Will continue testing but wondering if there is a logical explanation for these smaller detect percentages, taking into account that the model is bigger for yolox but the detect resolution remained the same in both scenarios. Would it be worth it to increase the detect resolution ?

If you would like to share your config, I can take a look at it. I want to see the config of the camera in question, and how you have the openvino detector defined in your yaml.

hey, yes, sorry for the late reply, here it is:

mqtt:
  host: mqtt
  port: 1883
  topic_prefix: frigate012
  client_id: frigate012
database:
  path: /config/frigate.db
logger:
  default: info
  logs:
    frigate.event: debug
detectors:
  ov:
    type: openvino
    device: CPU
    model:
      path: /frigate_models/yolox_tiny/public/yolox-tiny/FP32/yolox-tiny.xml

model:
  width: 416
  height: 416
  input_tensor: nchw
  input_pixel_format: bgr
  model_type: yolox
  labelmap_path: /frigate_models/dataset/coco_80cl.txt
ffmpeg:
  hwaccel_args: preset-vaapi
  input_args: preset-http-reolink
  output_args:
    record: preset-record-generic-audio-aac
detect:
  width: 1440
  height: 1080
  fps: 7
  enabled: True
motion:
  improve_contrast: True

objects:
  track:
    - person
    - cat
  filters:
    person:
      min_ratio: 0.3
      max_ratio: 0.8
      min_score: 0.75
      threshold: 0.85
    cat:
      min_score: 0.70
      threshold: 0.80
      max_area: 70000

record:
  enabled: True
  expire_interval: 60
  retain:
    days: 10
    mode: all
  events:
    pre_capture: 5
    post_capture: 5
    objects:
      - person
      - cat
    required_zones: []
    retain:
      default: 15
      mode: active_objects
      objects:
        person: 20

snapshots:
  enabled: True
  clean_copy: True
  timestamp: False
  bounding_box: True
  crop: False
  #height: 175
  required_zones: []
  retain:
    default: 15
    objects:
      person: 20

go2rtc:
  streams:
    camera1:
      - "ffmpeg:http://192.168.1.9/flv?port=1935&app=bcs&stream=channel0_main.bcs&user=admin&password="
      - ffmpeg:camera1#audio=opus

cameras:
  camera1:
    detect:
      enabled: True
    record:
      enabled: True
    motion:
      mask:
        - 828,104,1440,149,1440,0,0,0,0,367
    ffmpeg:
      inputs:
        - path: rtsp://127.0.0.1:8554/camera1?video=copy&audio=aac
          input_args: preset-rtsp-restream
          roles:
            - record
            - detect
            - restream
    objects:
      filters:
        person:
          min_area: 15000

aeozyalcin commented 1 year ago

Very interesting. Nothing stands out to me as something that could be causing the poor performance. I am running the detector on a 720p/1080p stream, so it's in a similar resolution range to what you are running yours at. The only thing I can think of is that maybe the subjects are too far away, but I wouldn't be able to compare to mine without seeing a picture from your camera.

A few unrelated remarks: I see you are using the CPU detector with FP32 precision. Is there a reason why you aren't using FP16 with the GPU detector instead? I assume you have an Intel CPU, so you probably also have an Intel integrated GPU?

PS: I just got yolov5 working in Frigate with OpenVino, and it's looking really promising so far. I will create a PR when I feel it's ready. I love how flexible Yolov5 is.

alez007 commented 1 year ago

Very interesting. Nothing stands out to me as something that could be causing the poor performance. I am running the detector on a 720p/1080p stream, so it's in a similar resolution range to what you are running yours at. The only thing I can think of is that maybe the subjects are too far away, but I wouldn't be able to compare to mine without seeing a picture from your camera.

thanks for checking 👍

A few unrelated remarks: I see you are using the CPU detector with FP32 precision. Is there a reason why you aren't using FP16 with the GPU detector instead? I assume you have an Intel CPU, so you probably also have an Intel integrated GPU?

It's unrelated to Frigate; I have a problem with kernel 5.19 where the GPU sometimes hangs, so until I have time to revert to 5.15, I use the CPU. Regarding FP32 vs FP16 precision, I'm not sure why I chose FP32, tbh; probably a lack of knowledge about what to use in which circumstances, and laziness to google it. Do you think I should make it FP16?

PS: I just got yolov5 working in Frigate with OpenVino, and it's looking really promising so far. I will create a PR when I feel it's ready. I love how flexible Yolov5 is.

fantastic work 🥳 looking forward to trying it out 🦾

Caros2017 commented 1 year ago

Very interesting. Nothing stands out to me as something that could be causing the poor performance. I am running the detector on a 720p/1080p stream, so it's in a similar resolution range to what you are running yours at. The only thing I can think of is that maybe the subjects are too far away, but I wouldn't be able to compare to mine without seeing a picture from your camera.

A few unrelated remarks: I see you are using the CPU detector with FP32 precision. Is there a reason why you aren't using FP16 with the GPU detector instead? I assume you have an Intel CPU, so you probably also have an Intel integrated GPU?

PS: I just got yolov5 working in Frigate with OpenVino, and it's looking really promising so far. I will create a PR when I feel it's ready. I love how flexible Yolov5 is.

Really wondering what the promising part is about yolov5 versus yolox-tiny. Does it perform better? Is it more accurate?

PS: Have you seen that they just released yolov8? ;)

aeozyalcin commented 1 year ago

@alez007 as an FYI, running quantized versions of the models on the CPU will give you the most throughput, at a minimal cost in accuracy. At the very least, I'd recommend running the FP16 model on your CPU for increased throughput. If you are adventurous, that's where quantization comes in; it would give you an order of magnitude higher throughput on your CPU. I run the FP16 model on my Intel integrated GPU, as my GPU won't natively support quantized models.
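
For example, to get an FP16 IR out of the Open Model Zoo tools (omz_converter's --precisions flag selects the precision; your model path would then end in FP16/ instead of FP32/):

omz_converter --name yolox-tiny --precisions FP16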

@Caros2017 So far, Yolov5 is performing better than Yolox-tiny in terms of accuracy. Its person detections come in with 90+% confidence, and I have not yet experienced any false positives. The thing I love about Yolov5 is how easy it was to export a 416x416 version of the yolov5s model, which runs at a similar speed to yolox_tiny on my particular system. I have seen yolov8, but have yet to look into it in detail :)

nickp27 commented 1 year ago

@alez007 as an FYI, running quantized versions of the models on the CPU will give you the most throughput, at a minimal cost in accuracy. At the very least, I'd recommend running the FP16 model on your CPU for increased throughput. If you are adventurous, that's where quantization comes in; it would give you an order of magnitude higher throughput on your CPU. I run the FP16 model on my Intel integrated GPU, as my GPU won't natively support quantized models.

@Caros2017 So far, Yolov5 is performing better than Yolox-tiny in terms of accuracy. Its person detections come in with 90+% confidence, and I have not yet experienced any false positives. The thing I love about Yolov5 is how easy it was to export a 416x416 version of the yolov5s model, which runs at a similar speed to yolox_tiny on my particular system. I have seen yolov8, but have yet to look into it in detail :)

What are you using to quantize the models? Do you have another dataset collected?

aeozyalcin commented 1 year ago

@alez007 as an FYI, running quantized versions of the models on the CPU will give you the most throughput, at a minimal cost in accuracy. At the very least, I'd recommend running the FP16 model on your CPU for increased throughput. If you are adventurous, that's where quantization comes in; it would give you an order of magnitude higher throughput on your CPU. I run the FP16 model on my Intel integrated GPU, as my GPU won't natively support quantized models.

@Caros2017 So far, Yolov5 is performing better than Yolox-tiny in terms of accuracy. Its person detections come in with 90+% confidence, and I have not yet experienced any false positives. The thing I love about Yolov5 is how easy it was to export a 416x416 version of the yolov5s model, which runs at a similar speed to yolox_tiny on my particular system. I have seen yolov8, but have yet to look into it in detail :)

What are you using to quantize the models? Do you have another dataset collected?

No, I am using the regular COCO trained models. I downloaded a condensed version of the COCO dataset and used that for quantization. Like I said, because my GPU doesn't natively support uint8 precision, I didn't play around with the quantized model much.

@Caros2017 I just got yolov8 working as well ;) I will report back on my thoughts.

Caros2017 commented 1 year ago

@Caros2017 I just got yolov8 working as well ;) I will report back on my thoughts.

That is so awesome! On paper it's better than yolov5. Looking forward to your experience, pull request, and merge so I can test it myself ;)

Again: great work!

aeozyalcin commented 1 year ago

https://github.com/aeozyalcin/frigate/tree/openvino_yolov5_8

Here is the branch for those of you who want to try out yolov5 and/or yolov8 in Frigate. I confirmed both v5 and v8 are working. I will have a PR by the end of this weekend.

aeozyalcin commented 1 year ago

https://github.com/blakeblackshear/frigate/pull/5523

Caros2017 commented 1 year ago

@aeozyalcin haven't had the time yet to build the dev container myself, so I am waiting for the next release.

How is your experience so far between all the different versions?

NickM-27 commented 1 year ago

There is no need to build the dev container. All commits to the dev branch automatically have builds made, which can be pulled and used.

aeozyalcin commented 1 year ago

@aeozyalcin haven't had the time yet to build the dev container myself, so I am waiting for the next release.

How is your experience so far between all the different versions?

Here is how you can get the latest dev container, which should already have all the YOLO goodness:

docker pull ghcr.io/blakeblackshear/frigate:dev-318240c

I have been using a custom-trained Yolov8 nano model on my J5005 Pentium Silver machine. I get ~30ms inference times, which is great. I preferred Yolov8 over Yolov5 because it trained so much quicker. Otherwise, the reported theoretical accuracy of the stock Yolov5 small model is comparable to the Yolov8 nano model, except Yolov8 nano inference takes about half the time. That was the other reason why I went with yolov8 over v5.

nickp27 commented 1 year ago

@aeozyalcin haven't had the time yet to build the dev container myself, so I am waiting for the next release. How is your experience so far between all the different versions?

Here is how you can get the latest dev container, which should already have all the YOLO goodness:

docker pull ghcr.io/blakeblackshear/frigate:dev-318240c

I have been using a custom-trained Yolov8 nano model on my J5005 Pentium Silver machine. I get ~30ms inference times, which is great. I preferred Yolov8 over Yolov5 because it trained so much quicker. Otherwise, the reported theoretical accuracy of the stock Yolov5 small model is comparable to the Yolov8 nano model, except Yolov8 nano inference takes about half the time. That was the other reason why I went with yolov8 over v5.

Any instructions or guidance on how you downloaded, prepared and trained your model?

aeozyalcin commented 1 year ago

Training is a fairly advanced topic, especially if you haven't done it before. But my Yolov8 training flow revolves around this notebook. It walks you through the steps required to train your own model. Afterwards, it spits out a checkpoint with a .pt extension.

If you want to skip training your own model and start with the stock COCO-trained Yolov8 models, you can skip the previous paragraph and start here. You can then use this notebook to convert the Yolov8 model of your choice to ONNX format. You can also change the input dimensions of the model using the sliders in the notebook. I am running 416x416 models.

Technically, you can go straight to openvino format using this converter, by changing the following line:

model.export(format="onnx", imgsz=[input_height,input_width], optimize=optimize_cpu)

to

model.export(format="openvino", imgsz=[input_height,input_width])

I haven't looked into whether the output model is FP16 or FP32 precision. For my system, my iGPU supports FP16, so I usually convert my models manually using the model_optimizer that comes with the OpenVino package, which allows me to specify FP16 precision.

Here is how I use the model_optimizer to convert onnx models to OpenVino models. This step is unnecessary if you changed the YoloV8 model exporter notebook to output OpenVino models already.

python3 /usr/local/lib/python3.8/dist-packages/openvino/tools/mo/mo.py --input_model /content/last.onnx --model_name yolov8n_custom10 -s 255 --reverse_input_channels --compress_to_fp16 --input_shape [1,3,416,416] --output_dir /content/yolov8n_custom10

nickp27 commented 1 year ago

Training is a fairly advanced topic, especially if you haven't done it before. But my Yolov8 training flow revolves around this notebook. It walks you through the steps required to train your own model. Afterwards, it spits out a checkpoint with a .pt extension.

If you want to skip training your own model and start with the stock COCO-trained Yolov8 models, you can skip the previous paragraph and start here. You can then use this notebook to convert the Yolov8 model of your choice to ONNX format. You can also change the input dimensions of the model using the sliders in the notebook. I am running 416x416 models.

Technically, you can go straight to openvino format using this converter, by changing the following line:

model.export(format="onnx", imgsz=[input_height,input_width], optimize=optimize_cpu)

to

model.export(format="openvino", imgsz=[input_height,input_width])

I haven't looked into whether the output model is FP16 or FP32 precision. For my system, my iGPU supports FP16, so I usually convert my models manually using the model_optimizer that comes with the OpenVino package, which allows me to specify FP16 precision.

Here is how I use the model_optimizer to convert onnx models to OpenVino models. This step is unnecessary if you changed the YoloV8 model exporter notebook to output OpenVino models already.

python3 /usr/local/lib/python3.8/dist-packages/openvino/tools/mo/mo.py --input_model /content/last.onnx --model_name yolov8n_custom10 -s 255 --reverse_input_channels --compress_to_fp16 --input_shape [1,3,416,416] --output_dir /content/yolov8n_custom10

That's super helpful! For the pretrained model, do you know if coco_80cl.txt is the right labelmap? I can't seem to find any confirmation on Google (maybe because of how new the model is).

aeozyalcin commented 1 year ago

Yep!

Caros2017 commented 1 year ago

Very nice work! I now run frigate:dev-318240c with a default COCO-trained yolov8n model at 416x416 resolution, as described here.

The only thing I have noticed is that if I use input_tensor: nchw, as described in the newest docs, instead of input_tensor: nhwc, I get a ZeroDivisionError:

File "/opt/frigate/frigate/video.py", line 562, in detect
    ratio = width / height
ZeroDivisionError: division by zero

Everything is running now with nhwc. Not sure if detection is working fine yet; the coming days will tell :)

If it works, I want to test it against the 640x640 yolov8x.

aeozyalcin commented 1 year ago

Hmm that's interesting. I also wonder if that version already has the input normalized, and the color order reversed.

Let's take the guessing out of the game. I have made this Colab for you guys. Give this a try and let me know. Convert YOLOv8 to OpenVino for Frigate.ipynb

nickp27 commented 1 year ago

Hmm that's interesting. I also wonder if that version already has the input normalized, and the color order reversed.

Let's take the guessing out of the game. I have made this Colab for you guys. Give this a try and let me know. Convert YOLOv8 to OpenVino for Frigate.ipynb

Once again, amazing help. Seems to be working perfectly. I'll continue my journey to try and build a pruned COCO-trained model that only has people, dogs, and cats in it, but that's for another day.

Caros2017 commented 1 year ago

Hmm that's interesting. I also wonder if that version already has the input normalized, and the color order reversed.

Let's take the guessing out of the game. I have made this Colab for you guys. Give this a try and let me know. Convert YOLOv8 to OpenVino for Frigate.ipynb

Very nice! I did the same yesterday with local Python, but instead of exporting the .onnx I exported directly to openvino as stated above. That was probably not good: the model was accepted by Frigate, but it didn't show any hits, and my GPU was drawing a lot more power (like 10 watts very often). With your Colab it's working, also with nchw.

One question though. If I run it on my onboard Intel GPU, do I need "optimize_cpu" to be True or False?

aeozyalcin commented 1 year ago

Leave that True. It's for Nvidia GPUs. FP16 with optimize_cpu set to True is most likely the most optimal for your integrated Intel GPU.

aeozyalcin commented 1 year ago

Out of curiosity, what inference times are people getting, and what hardware are you running on? And how do you feel YOLOv8's accuracy/performance compares to the stock SSD OpenVINO model?

nickp27 commented 1 year ago

Out of curiosity, what inference times are people getting, and what hardware are you running on? And how do you feel YOLOv8's accuracy/performance compares to the stock SSD OpenVINO model?

I am on a J5005 getting 32-34ms on yolov8n 320x320, but I need to tweak detection thresholds. What is your config.yml?