blakeblackshear / frigate

NVR with realtime local object detection for IP cameras
https://frigate.video
MIT License

[Config Support]: Can't get Nvidia hardware acceleration to work #3619

Closed ehn closed 2 years ago

ehn commented 2 years ago

Describe the problem you are having

Adding hwaccel_args: -c:v h264_cuvid to a camera's config breaks FFmpeg for that camera. Without it, everything works fine, but all CPU cores are at 100% and the load average on the system is very high.

Version

v0.11.0-rc1

Frigate config file

mqtt:
  host: redacted
  user: mosquitto
  password: redacted

detectors:
  coral0:
    type: edgetpu
    device: "pci:0"
  coral1:
    type: edgetpu
    device: "pci:1"
  coral2:
    type: edgetpu
    device: "pci:2"
  coral3:
    type: edgetpu
    device: "pci:3"
  coral4:
    type: edgetpu
    device: "pci:4"
  coral5:
    type: edgetpu
    device: "pci:5"
  coral6:
    type: edgetpu
    device: "pci:6"
  coral7:
    type: edgetpu
    device: "pci:7"

birdseye:
  mode: continuous

objects:
  track:
    - person
    - bicycle
    - car
    - motorcycle
    - bus
    - boat
    - bird
    - cat
    - dog
    - frisbee
    - sports ball
    - kite
    - skateboard
    - bottle
    - cell phone

cameras:
  redacted:
    ffmpeg:
      inputs:
        - path: rtsp://admin:{FRIGATE_RTSP_PASSWORD}@redacted:554/profile2/media.smp
          roles:
            - detect
            - record
            - rtmp
      hwaccel_args: -c:v h264_cuvid
    detect:
      width: 2592
      height: 1944
      fps: 25
  redacted:
    ffmpeg:
      inputs:
        - path: rtsp://admin:{FRIGATE_RTSP_PASSWORD}@redacted:554/profile2/media.smp
          roles:
            - detect
            - record
            - rtmp
    detect:
      width: 2592
      height: 1944
      fps: 25
  redacted:
    ffmpeg:
      inputs:
        - path: rtsp://admin:{FRIGATE_RTSP_PASSWORD}@redacted:554/profile2/media.smp
          roles:
            - detect
            - record
            - rtmp
    detect:
      width: 2592
      height: 1944
      fps: 25
  redacted:
    ffmpeg:
      inputs:
        - path: rtsp://admin:{FRIGATE_RTSP_PASSWORD}@redacted:554/profile2/media.smp
          roles:
            - detect
            - record
            - rtmp
    detect:
      width: 2592
      height: 1944
      fps: 25
  redacted:
    ffmpeg:
      inputs:
        - path: rtsp://admin:{FRIGATE_RTSP_PASSWORD}@redacted:554/profile2/media.smp
          roles:
            - detect
            - record
            - rtmp
    detect:
      width: 2592
      height: 1944
      fps: 25
  redacted:
    ffmpeg:
      inputs:
        - path: rtsp://admin:{FRIGATE_RTSP_PASSWORD}@redacted:554/profile2/media.smp
          roles:
            - detect
            - record
            - rtmp
    detect:
      width: 2592
      height: 1944
      fps: 25
  redacted:
    ffmpeg:
      inputs:
        - path: rtsp://admin:{FRIGATE_RTSP_PASSWORD}@redacted:554/profile2/media.smp
          roles:
            - detect
            - record
            - rtmp
    detect:
      width: 2592
      height: 1944
      fps: 25
  redacted:
    ffmpeg:
      inputs:
        - path: rtsp://admin:{FRIGATE_RTSP_PASSWORD}@redacted:554/profile2/media.smp
          roles:
            - detect
            - record
            - rtmp
    detect:
      width: 2592
      height: 1944
      fps: 25
  redacted:
    ffmpeg:
      inputs:
        - path: rtsp://admin:{FRIGATE_RTSP_PASSWORD}@redacted:554/profile2/media.smp
          roles:
            - detect
            - record
            - rtmp
    detect:
      width: 2592
      height: 1944
      fps: 25
  redacted:
    ffmpeg:
      inputs:
        - path: rtsp://admin:{FRIGATE_RTSP_PASSWORD}@redacted:554/profile2/media.smp
          roles:
            - detect
            - record
            - rtmp
    detect:
      width: 2592
      height: 1944
      fps: 25
  redacted:
    ffmpeg:
      inputs:
        - path: rtsp://admin:{FRIGATE_RTSP_PASSWORD}@redacted:554/profile2/media.smp
          roles:
            - detect
            - record
            - rtmp
    detect:
      width: 2592
      height: 1944
      fps: 25
  redacted:
    ffmpeg:
      inputs:
        - path: rtsp://admin:{FRIGATE_RTSP_PASSWORD}@redacted:554/profile2/media.smp
          roles:
            - detect
            - record
            - rtmp
    detect:
      width: 1920
      height: 1080
      fps: 25
  redacted:
    ffmpeg:
      inputs:
        - path: rtsp://redacted:redacted@redacted:554/stream1
          roles:
            - detect
            - record
            - rtmp
    detect:
      width: 1920
      height: 1080
      fps: 15

record:
  enabled: True
  retain_days: 14
  events:
    retain:
      default: 365

snapshots:
  enabled: True
  timestamp: True
  bounding_box: True

timestamp_style:
  format: "%Y-%m-%d %H:%M:%S"

Relevant log output

[2022-08-09 12:11:12] frigate.video                  ERROR   : redacted: Unable to read frames from ffmpeg process.
[2022-08-09 12:11:12] frigate.video                  ERROR   : redacted: ffmpeg process is not running. exiting capture thread...
[2022-08-09 12:11:29] watchdog.redacted       ERROR   : Ffmpeg process crashed unexpectedly for redacted.
[2022-08-09 12:11:29] watchdog.redacted       ERROR   : The following ffmpeg logs include the last 100 lines prior to exit.
[2022-08-09 12:11:29] ffmpeg.redacted.detect  ERROR   : [h264_cuvid @ 0x55dbb6278300] Cannot load libnvcuvid.so.1
[2022-08-09 12:11:29] ffmpeg.redacted.detect  ERROR   : [h264_cuvid @ 0x55dbb6278300] Failed loading nvcuvid.
[2022-08-09 12:11:31] frigate.video                  ERROR   : redacted: Unable to read frames from ffmpeg process.
[2022-08-09 12:11:31] frigate.video                  ERROR   : redacted: ffmpeg process is not running. exiting capture thread...

Frigate stats

No response

Operating system

Other Linux

Install method

Docker Compose

Coral version

PCIe

Any other information that may be helpful

docker-compose.yml:

version: "3.9"
services:
  frigate:
    container_name: frigate
    privileged: true # this may not be necessary for all setups
    restart: unless-stopped
    image: blakeblackshear/frigate:0.11.0-rc1
    shm_size: "1536mb"
    devices:
      - /dev/apex_0:/dev/apex_0
      - /dev/apex_1:/dev/apex_1
      - /dev/apex_2:/dev/apex_2
      - /dev/apex_3:/dev/apex_3
      - /dev/apex_4:/dev/apex_4
      - /dev/apex_5:/dev/apex_5
      - /dev/apex_6:/dev/apex_6
      - /dev/apex_7:/dev/apex_7
      - /dev/dri/card0
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /home/ehn/frigate/config.yml:/config/config.yml:ro
      - /media/datastore/frigate:/media/frigate
      - type: tmpfs # Optional: 1GB of memory, reduces SSD/SD Card wear
        target: /tmp/cache
        tmpfs:
          size: 1000000000
    ports:
      - "5000:5000"
      - "1935:1935" # RTMP feeds
    environment:
      FRIGATE_RTSP_PASSWORD: "redacted"
      PLUS_API_KEY: "redacted"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

Output from nvidia-smi (in the container):

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A2000    Off  | 00000000:65:00.0 Off |                  Off |
| 30%   41C    P0    24W /  70W |      0MiB /  6138MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
NickM-27 commented 2 years ago

This seems like a somewhat related issue, perhaps: https://github.com/jellyfin/jellyfin/issues/2395

Your setup looks correct as far as I can tell; you might need to open an issue with the jellyfin-ffmpeg folks.

ehn commented 2 years ago

I'm assuming Nvidia hardware acceleration works for other Frigate users. So if my Frigate setup is correct, do you know what's different with my system that might be causing the problem? I'm happy to file an issue with jellyfin-ffmpeg, but at the moment, I'm not sure what to put in it.

NickM-27 commented 2 years ago

It might be worth waiting a bit and gathering more info. I would think it is specific to the GPU. I have no idea why that .so would not be available, but we saw a similar issue when a user had an old, unsupported GPU. This one is new and does support decode, but like you said, we have many other users who have this working as expected.

NickM-27 commented 2 years ago

Does nvidia-smi work on the host?

ehn commented 2 years ago

Yes, output from nvidia-smi on the host:

Tue Aug  9 13:12:49 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A2000    Off  | 00000000:65:00.0 Off |                  Off |
| 30%   41C    P0    26W /  70W |      0MiB /  6138MiB |      1%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
ehn commented 2 years ago

I have two graphics cards in the machine. Might that be causing the problem? Just trying to think of something...

NickM-27 commented 2 years ago

Yes, it definitely could be. I might also try removing - /dev/dri/card0, as I don't think that was in the docs and it could also be confusing things. I am not 100% sure whether it is possible to specify which Nvidia GPU to use in the deploy section.

Minglarn commented 2 years ago

Question... Isn't nvidia-smi included in 0.11.0-rc1? Or do I have to install nvidia-smi first?

Getting the same error no matter what I do...

root@len-ubuntu:/home/minglarn/frigate# docker-compose up
Creating network "frigate_default" with the default driver
Creating frigate ... error

ERROR: for frigate  Cannot start service frigate: could not select device driver "nvidia" with capabilities: [[gpu]]

ERROR: for frigate  Cannot start service frigate: could not select device driver "nvidia" with capabilities: [[gpu]]
ERROR: Encountered errors while bringing up the project.
NickM-27 commented 2 years ago

Your host needs to have nvidia drivers installed
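A minimal sketch of a host-side check for the two pieces that usually have to be present before Docker can hand a GPU to a container. The helper name `tool_status` is made up for illustration, and `nvidia-container-runtime-hook` is assumed to be the binary installed by the NVIDIA container toolkit on this kind of setup.

```shell
#!/usr/bin/env bash
# Sketch: report whether a command is available on PATH.
# tool_status NAME prints "NAME: found" or "NAME: missing".
# (tool_status is a hypothetical helper, not part of Frigate or Docker.)
tool_status() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

# Ships with the NVIDIA driver utilities.
tool_status nvidia-smi
# Assumed to ship with the NVIDIA container toolkit.
tool_status nvidia-container-runtime-hook
```

If either one is reported as missing, installing the driver or the container toolkit and then restarting the Docker daemon (or rebooting) is probably the next thing to try.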

Minglarn commented 2 years ago

Suspected that... Recommended NVIDIA drivers to install? It's a jungle out there... :)

Ubuntu with a GeForce GTX 1060 card.

ehn commented 2 years ago

Figured it out. The library was simply missing from the host. Installing the libnvidia-decode-515-server package resolved the issue. Thanks, @NickM-27, for linking to https://github.com/jellyfin/jellyfin/issues/2395! That helped me find the underlying problem.

This is on Ubuntu 22.04 btw, in case someone else has a similar problem.

Based on the output from nvidia-smi, it looks like 13 video streams are being successfully decoded on the GPU. All 20 CPU cores are still at close to 100%, and the system load is 40–60, however.
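For anyone hitting the same "Cannot load libnvcuvid.so.1" error, the missing-library check described above could be sketched roughly like this. The helper name `nvcuvid_status` is made up for illustration, and the package name in the comments is the one from this thread, so the `515` suffix is an assumption that has to match the installed driver branch.

```shell
#!/usr/bin/env bash
# Sketch: decide whether libnvcuvid.so.1 is known to the dynamic linker.
# Reads `ldconfig -p` output on stdin, so the parsing can be tried without a GPU.
# (nvcuvid_status is a hypothetical helper.)
nvcuvid_status() {
  if grep -q 'libnvcuvid\.so\.1'; then
    echo "found"
  else
    echo "missing"
  fi
}

# On the host (prints "missing" if ldconfig is unavailable or the library is absent):
ldconfig -p 2>/dev/null | nvcuvid_status

# If this prints "missing", the fix reported in this thread
# (Ubuntu 22.04, 515 server driver branch) was:
#   sudo apt install libnvidia-decode-515-server
# The same check inside the container, assuming it is named "frigate"
# as in the compose file above:
#   docker exec frigate ldconfig -p | grep libnvcuvid
```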

NickM-27 commented 2 years ago

> Suspected that... Recommended NVIDIA drivers to install? It's a jungle out there... :)
>
> Ubuntu with a GeForce GTX 1060 card.

Not really sure. I believe Nvidia has official drivers released now, so it might be worth trying those first.

ehn commented 2 years ago

> Yes, it definitely could be. I might also try removing - /dev/dri/card0, as I don't think that was in the docs and it could also be confusing things. I am not 100% sure whether it is possible to specify which Nvidia GPU to use in the deploy section.

Removed the - /dev/dri/card0 line. Everything still works. If nothing else, it was useless cruft. :)

I only have one Nvidia GPU, btw. The other graphics card is something really basic built into the motherboard.

Minglarn commented 2 years ago

Which drivers did you install? I'm running Ubuntu Server... Nvidia recommends:

vendor   : NVIDIA Corporation
model    : GP106 [GeForce GTX 1060 6GB]
driver   : nvidia-driver-390 - distro non-free
driver   : nvidia-driver-515-server - distro non-free
driver   : nvidia-driver-515 - distro non-free recommended
driver   : nvidia-driver-418-server - distro non-free
driver   : nvidia-driver-510-server - distro non-free
driver   : nvidia-driver-470-server - distro non-free
driver   : nvidia-driver-510 - distro non-free
driver   : nvidia-driver-470 - distro non-free
driver   : nvidia-driver-450-server - distro non-free
driver   : xserver-xorg-video-nouveau - distro free builtin

But as I'm running a server, wouldn't it be better to install the server drivers?

Minglarn commented 2 years ago

> Installing the libnvidia-decode-515-server

How did you install the missing decode package? I'm running the official Nvidia drivers, 515.65.01, with CUDA 11.7.

Frigate throws an error: ERROR: for frigate Cannot start service frigate: could not select device driver "nvidia" with capabilities: [[gpu]]

nvidia-smi:

Tue Aug  9 22:57:42 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 20%   29C    P8     4W / 120W |    212MiB /  6144MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1415      G   /usr/lib/xorg/Xorg                 90MiB |
|    0   N/A  N/A      1671      G   /usr/bin/gnome-shell              119MiB |
+-----------------------------------------------------------------------------+
ehn commented 2 years ago

These are the packages I've installed:

$ apt list --installed | grep nvidia | grep -v automatic

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

libnvidia-decode-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed]
nvidia-docker2/bionic,now 2.11.0-1 all [installed]
nvidia-headless-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed]
nvidia-utils-515-server/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed]
Minglarn commented 2 years ago

That's odd... It seems that everything is in place. However, it's still not working.


minglarn@len-ubuntu:~$ apt list --installed | grep nvidia | grep -v automatic

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

libnvidia-cfg1-515/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,auto-removable]
libnvidia-common-515/jammy-updates,jammy-updates,jammy-security,jammy-security,now 515.65.01-0ubuntu0.22.04.1 all [installed,auto-removable]
libnvidia-decode-515/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,auto-removable]
libnvidia-egl-wayland1/jammy,now 1:1.1.9-1.1 amd64 [installed,auto-removable]
libnvidia-encode-515/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,auto-removable]
libnvidia-extra-515/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,auto-removable]
libnvidia-fbc1-515/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,auto-removable]
libnvidia-gl-515/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,auto-removable]
linux-modules-nvidia-515-generic-hwe-22.04/jammy-security,now 5.15.0-43.46+1 amd64 [installed,upgradable to: 5.15.0-46.49]
nvidia-compute-utils-515/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,auto-removable]
nvidia-kernel-source-515/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,auto-removable]
nvidia-prime/jammy,jammy,now 0.8.17.1 all [installed]
nvidia-settings/jammy,now 510.47.03-0ubuntu1 amd64 [installed,auto-removable]
nvidia-utils-515/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,auto-removable]
xserver-xorg-video-nvidia-515/jammy-updates,jammy-security,now 515.65.01-0ubuntu0.22.04.1 amd64 [installed,auto-removable]
Minglarn commented 2 years ago

Ok, it's working now... A full reboot and it seems that the drivers are OK..

| NVIDIA-SMI 515.65.01    Driver Version: 515.65.01    CUDA Version: 11.7     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| 20%   44C    P2    24W / 120W |    911MiB /  6144MiB |      3%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2055      G   /usr/lib/xorg/Xorg                 90MiB |
|    0   N/A  N/A      2612      G   /usr/bin/gnome-shell               55MiB |
|    0   N/A  N/A     19271      C   ffmpeg                             80MiB |
|    0   N/A  N/A     19282      C   ffmpeg                             80MiB |
|    0   N/A  N/A     19284      C   ffmpeg                             80MiB |
|    0   N/A  N/A     19296      C   ffmpeg                            279MiB |
|    0   N/A  N/A     19307      C   ffmpeg                             80MiB |
|    0   N/A  N/A     19308      C   ffmpeg                             80MiB |
|    0   N/A  N/A     19312      C   ffmpeg                             80MiB |
+-----------------------------------------------------------------------------+
NickM-27 commented 2 years ago

@ehn I think this should be closable now 👍

ehn commented 2 years ago

Yes, thank you for the help!

techbits00 commented 2 years ago

@ehn any chance you could mention what cameras are you using?

I have Reolink cameras and I cannot get ffmpeg to work with my Nvidia 980 Ti. I am experimenting with it on Docker Desktop, though, so that's a bit different.

Issue:
In my case there are no errors but the process does not seem to be doing anything.

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A       252      G   /ffmpeg                              N/A |
+-----------------------------------------------------------------------------+

NickM-27 commented 2 years ago

@techbits00 If you don't see any errors then it is most likely working. If you run nvidia-smi on the host (not in frigate) do you see the processes?
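A rough sketch of that host-side check, counting the ffmpeg entries in the process table that nvidia-smi prints. The helper name `count_gpu_ffmpeg` is made up for illustration; it only does a plain text match on the table, so it is a quick sanity check rather than a precise measurement.

```shell
#!/usr/bin/env bash
# Sketch: count lines mentioning ffmpeg in nvidia-smi output read from stdin.
# (count_gpu_ffmpeg is a hypothetical helper.)
count_gpu_ffmpeg() {
  # grep -c prints 0 but exits non-zero when nothing matches; "|| true" hides that.
  grep -c 'ffmpeg' || true
}

# On the host (not inside the Frigate container):
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi | count_gpu_ffmpeg
fi
```

A count greater than zero suggests that at least one ffmpeg process has been picked up by the GPU. A count of zero while cameras are streaming would point back at the hwaccel_args or driver setup.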

techbits00 commented 2 years ago

@NickM-27 Yes, I got it to work. I had to restart my Docker daemon. Not sure how a restart helped, but it worked. I am now going to try it on a Linux host and see how it works over there.

Thanks for the quick response, appreciate it.

ehn commented 2 years ago

> @ehn any chance you could mention what cameras are you using?

I'm using Hanwha QNV-8080R, XNP-6550RH and TNO-4030T.

dragozet commented 1 year ago

@ehn I'm curious, what "Inference Speed" do you get with your A2000 GPU? I get around 10ms on my Coral USB, wondering if getting an A2000 would be better as I could also get H264 encode/decode.

techbits00 commented 1 year ago

I have Frigate running on Docker inside a Proxmox VM. The VM has 2 CPU units of a Ryzen 2700. No HW acceleration, and I am getting an inference speed of 6.18 with three cameras. I have a Mini PCIe Coral hooked up through an adapter. Avg CPU load (with plenty of other containers) is about 10-15%.

@dragozet

dragozet commented 1 year ago

@techbits00 nice to know. Is the Mini PCIe Coral one of the "Dual" Corals? Wondering why it's faster. I'm running a Ryzen 4650G and amd-vaapi is only helping by around 3% with 4 cameras. It would be nice if the A2000 helped out on the H264 side a bit more, and if it could compete on the detector side of things too.

techbits00 commented 1 year ago

@dragozet nah, it's the single one. I wanted to use the 980 Ti for this, but with virtualization and all the other stuff related to flags I couldn't get it to work, so I gave up on it. I do not have any special configuration; in fact, in most places I am using the defaults.

I have two Reolink cameras and one Amcrest doorbell. You may want to play around with the camera configs to see how it looks. I am using the optimal configurations defined for Reolink cameras on the Frigate wiki.

Edit: Additional info: My inference speed stays in the same range even if I only use one camera. I do not know if this is going to be helpful for you, but I thought I would put it here.

ehn commented 1 year ago

> @ehn I'm curious, what "Inference Speed" do you get with your A2000 GPU? I get around 10ms on my Coral USB, wondering if getting an A2000 would be better as I could also get H264 encode/decode.

@dragozet I'm not using the GPU for object detection, only video decoding. I have 8 TPUs in this machine.