asmirnou / watsor

Object detection for video surveillance

ffmpeg error while decoding rtsp stream #21

Closed. firefly2442 closed this issue 2 years ago.

firefly2442 commented 2 years ago

Hello. I have three cameras, all serving h264 streams via RTSP, and an NVIDIA GTX 970 GPU. I'd like to use the GPU both for decoding the h264 streams and for running the object detection, but I'm seeing decoding errors in the logs.

Building TensorRT engine. This may take few minutes.
TensorRT engine saved to /usr/share/watsor/model/gpu.buf
MainThread       werkzeug                 INFO    : Listening on ('0.0.0.0', 8080)
MainThread       root                     INFO    : Starting Watsor on 61057dbf1f8b with PID 1
inside           FFmpegDecoder            INFO    : [h264 @ 0x55a411e1dfe0] error while decoding MB 35 1, bytestream -27
backdoor         FFmpegDecoder            INFO    : [h264 @ 0x563a6b961340] error while decoding MB 11 29, bytestream -15
inside           FFmpegDecoder            INFO    : [h264 @ 0x55a411e1dfe0] error while decoding MB 8 4, bytestream -7
frontdoor        FFmpegDecoder            INFO    : [h264 @ 0x55770bab6a40] error while decoding MB 4 23, bytestream -7

Shelling into the container, I see cuvid listed as an option:

watsor@57b132249a48:/$ ffmpeg -hwaccels
ffmpeg version 3.4.8-0ubuntu0.2 Copyright (c) 2000-2020 the FFmpeg developers
  built with gcc 7 (Ubuntu 7.5.0-3ubuntu1~18.04)
  configuration: --prefix=/usr --extra-version=0ubuntu0.2 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --enable-gpl --disable-stripping --enable-avresample --enable-avisynth --enable-gnutls --enable-ladspa --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librubberband --enable-librsvg --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvorbis --enable-libvpx --enable-libwavpack --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzmq --enable-libzvbi --enable-omx --enable-openal --enable-opengl --enable-sdl2 --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libopencv --enable-libx264 --enable-shared
  libavutil      55. 78.100 / 55. 78.100
  libavcodec     57.107.100 / 57.107.100
  libavformat    57. 83.100 / 57. 83.100
  libavdevice    57. 10.100 / 57. 10.100
  libavfilter     6.107.100 /  6.107.100
  libavresample   3.  7.  0 /  3.  7.  0
  libswscale      4.  8.100 /  4.  8.100
  libswresample   2.  9.100 /  2.  9.100
  libpostproc    54.  7.100 / 54.  7.100
Hardware acceleration methods:
vdpau
vaapi
cuvid

nvidia-smi and nvtop on the host OS show Watsor using the GPU.
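A quick way to watch the video-decode load specifically (assuming the driver is recent enough to support it) is:

nvidia-smi dmon -s u

where the dec column reports NVDEC (video decode) utilization and sm reports compute utilization.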

The health endpoint shows the following:

{
    "cameras": [
        {
            "name": "inside",
            "fps": {
                "decoder": 5.1,
                "sieve": 5.1,
                "visual_effects": 0.0,
                "snapshot": 5.1,
                "mqtt": 5.1
            },
            "buffer_in": 0,
            "buffer_out": 0
        },
        {
            "name": "frontdoor",
            "fps": {
                "decoder": 5.1,
                "sieve": 5.1,
                "visual_effects": 0.0,
                "snapshot": 5.1,
                "mqtt": 5.1
            },
            "buffer_in": 0,
            "buffer_out": 0
        },
        {
            "name": "backdoor",
            "fps": {
                "decoder": 122.1,
                "sieve": 11.9,
                "visual_effects": 0.0,
                "snapshot": 11.9,
                "mqtt": 11.9
            },
            "buffer_in": 0,
            "buffer_out": 0
        }
    ],
    "detectors": [
        {
            "name": "GeForce GTX 970",
            "fps": 24.9,
            "fps_max": 191,
            "inference_time": 5.2
        }
    ]
}

My guess is that it has something to do with the parameters I'm passing to ffmpeg. Here is my config.yaml:

# Optional HTTP server configuration and authentication.
http:
  port: 8080

# Optional MQTT client configuration and authentication.
mqtt:
  host: 192.168.1.226
  port: 1883

# Default FFmpeg arguments for decoding video stream before detection and encoding back afterwards.
# Optional, can be overwritten per camera.
ffmpeg:
  decoder:
    - -hide_banner              # hide build options and library versions
    - -loglevel
    -  error
    - -nostdin
    - -hwaccel                   # These options enable hardware acceleration, check what's available with: ffmpeg -hwaccels
    -  cuvid
    - -hwaccel_output_format
    -  yuv420p
    - -fflags
    -  nobuffer
    - -flags
    -  low_delay
    - -fflags
    -  +genpts+discardcorrupt
    - -i                          # camera input field will follow '-i' ffmpeg argument automatically
    - -f
    -  rawvideo
    - -pix_fmt
    -  rgb24
    - -rtsp_transport             # try to prevent lost packets/frames via TCP
    -  tcp
  # encoder:                        # Encoder is optional, remove the entire list to disable.
  #   - -hide_banner
  #   - -loglevel
  #   -  error
  #   - -f
  #   -  rawvideo
  #   - -pix_fmt
  #   -  rgb24
  #   - -i                          # detection output stream will follow '-i' ffmpeg argument automatically
  #   - -an
  #   - -f
  #   -  mpegts
  #   - -vcodec
  #   -  libx264
  #   - -pix_fmt
  #   -  yuv420p
  #   - -vf
  #   - "drawtext='text=%{localtime\\:%c}': x=w-tw-lh: y=h-2*lh: fontcolor=white: box=1: boxcolor=black@0.55"

# Detect the following labels of the object detection model.
# Optional, can be overwritten per camera.
detect:
  - person:
      area: 10                    # Minimum area of the bounding box an object should have in
                                  # order to be detected. Defaults to 10% of entire video resolution.
      confidence: 50              # Confidence threshold that a detection is what it's guessed to be,
                                  # otherwise it's ruled out. 50% if not set.

# List of cameras and their configurations.
cameras:
  - inside:
      width: 640
      height: 480
      input: rtsp://admin:secret@192.168.1.111:554/cam/realmonitor?channel=1&subtype=1
  - frontdoor:
      width: 704
      height: 480
      input: rtsp://admin:secret@192.168.1.110:554/cam/realmonitor?channel=1&subtype=1
  - backdoor:
      width: 704
      height: 480
      input: rtsp://admin:secret@192.168.1.112:554/cam/realmonitor?channel=1&subtype=1
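For reference, here is roughly what those decoder options look like as a standalone command against the first camera, writing raw frames to stdout the way I understand Watsor consumes them (the pipe output is my assumption). Note that -rtsp_transport is moved ahead of -i here, since as an input option it only takes effect when it precedes the input it applies to:

ffmpeg -hide_banner -loglevel error -nostdin -rtsp_transport tcp \
  -hwaccel cuvid -hwaccel_output_format yuv420p \
  -fflags nobuffer -flags low_delay -fflags +genpts+discardcorrupt \
  -i "rtsp://admin:secret@192.168.1.111:554/cam/realmonitor?channel=1&subtype=1" \
  -f rawvideo -pix_fmt rgb24 pipe: > /dev/null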
asmirnou commented 2 years ago

This is something between ffmpeg and the camera. It should be reproducible by running just ffmpeg from the host machine (without Watsor). Is the error the same if only one camera is connected?
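Something like this, using one of the camera URLs from your config, decodes a single stream for a minute and discards the frames; if the MB/bytestream errors show up here too, Watsor is not involved:

ffmpeg -hide_banner -rtsp_transport tcp -i "rtsp://admin:secret@192.168.1.110:554/cam/realmonitor?channel=1&subtype=1" -t 60 -f null -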

With regard to cuvid, I remember that I could activate it by using only the following option and no -hwaccel:

ffmpeg -hide_banner -c:v h264_cuvid -i sample.mp4 -f null /dev/null
firefly2442 commented 2 years ago

Thanks, this seems to work:

ffmpeg:
  decoder:
    - -hide_banner              # hide build options and library versions
    - -loglevel
    -  error
    - -nostdin
    #- -hwaccel                   # These options enable hardware acceleration, check what's available with: ffmpeg -hwaccels
    #-  cuvid
    #- -hwaccel_output_format
    #-  yuv420p
    - -c:v
    -  h264_cuvid                 # use GPU for h264 decoding
    - -fflags
    -  nobuffer
    - -flags
    -  low_delay
    - -fflags
    -  +genpts+discardcorrupt
    - -i                          # camera input field will follow '-i' ffmpeg argument automatically
    - -f
    -  rawvideo
    - -pix_fmt
    -  rgb24
    #- -rtsp_transport             # try to prevent lost packets/frames via TCP
    #-  tcp
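For reference, the equivalent standalone check against one of the cameras (same idea as the sample.mp4 command above, just pointed at an RTSP URL) would be something like:

ffmpeg -hide_banner -c:v h264_cuvid -i "rtsp://admin:secret@192.168.1.111:554/cam/realmonitor?channel=1&subtype=1" -f null /dev/null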