nyanmisaka / ffmpeg-rockchip

FFmpeg with async and zero-copy Rockchip MPP & RGA support

RK3566 performance issues on debian #20

Closed colorhacker closed 5 months ago

colorhacker commented 5 months ago

Hello, I built ffmpeg-rockchip and ran a performance test on an RK3566 board running Debian, then compared the result with GStreamer. The performance gap is fairly large. What is the source of this problem?

OS: Debian
HW: RK3566
Test file duration (HH:MM:SS): 00:03:12
File encoding format: 3840x2160, 28.72 fps, VP90

ffmpeg result:

root@debian:~# time ffmpeg -hwaccel rkmpp -i TCL4K.mkv -an -benchmark -f null -
ffmpeg version 4c1997e5a8 Copyright (c) 2000-2023 the FFmpeg developers
  built with gcc 12 (Debian 12.2.0-14)
  configuration: --enable-gpl --enable-version3 --disable-doc --disable-shared --enable-static --enable-libdrm --enable-rkmpp --enable-rkrga
  libavutil      58. 29.100 / 58. 29.100
  libavcodec     60. 31.102 / 60. 31.102
  libavformat    60. 16.100 / 60. 16.100
  libavdevice    60.  3.100 / 60.  3.100
  libavfilter     9. 12.100 /  9. 12.100
  libswscale      7.  5.100 /  7.  5.100
  libswresample   4. 12.100 /  4. 12.100
  libpostproc    57.  3.100 / 57.  3.100
Input #0, matroska,webm, from 'TCL4K.mkv':
  Metadata:
    COMPATIBLE_BRANDS: isomiso2avc1mp41
    MAJOR_BRAND     : isom
    MINOR_VERSION   : 512
    ENCODER         : IDMmkvlib0.1
    LANGUAGE        : und
    HANDLER_NAME    : AudioHandler
  Duration: 00:03:13.28, start: 0.000000, bitrate: 13694 kb/s
  Stream #0:0: Video: vp9 (Profile 0), yuv420p(tv, bt709/unknown/unknown), 3840x2160, SAR 1:1 DAR 16:9, 30 fps, 30 tbr, 1k tbn (default)
  Stream #0:1: Audio: opus, 48000 Hz, stereo, fltp (default)
Stream mapping:
  Stream #0:0 -> #0:0 (vp9 (vp9_rkmpp) -> wrapped_avframe (native))
Press [q] to stop, [?] for help
Output #0, null, to 'pipe:':
  Metadata:
    COMPATIBLE_BRANDS: isomiso2avc1mp41
    MAJOR_BRAND     : isom
    MINOR_VERSION   : 512
    HANDLER_NAME    : AudioHandler
    LANGUAGE        : und
    encoder         : Lavf60.16.100
  Stream #0:0: Video: wrapped_avframe, nv12(tv, bt709/unknown/unknown, progressive), 3840x2160 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 30 fps, 30 tbn (default)
    Metadata:
      encoder         : Lavc60.31.102 wrapped_avframe
[out#0/null @ 0x55bd821bb0] video:2718kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
frame= 5798 fps= 36 q=-0.0 Lsize=N/A time=00:03:13.23 bitrate=N/A speed= 1.2x
bench: utime=98.383s stime=37.910s rtime=160.763s
bench: maxrss=79848kB

real    2m41.066s
user    1m38.532s
sys     0m38.053s

gstreamer result:

root@debian:~# time gst-launch-1.0 filesrc location=TCL4K.mkv ! decodebin ! videoconvert ! videoscale ! fakesink
Setting pipeline to PAUSED ...
Pipeline is PREROLLING ...
Pipeline is PREROLLED ...
Setting pipeline to PLAYING ...
Redistribute latency...
New clock: GstSystemClock
Got EOS from element "pipeline0".
Execution ended after 0:01:47.828021557
Setting pipeline to NULL ...
Freeing pipeline ...

real    1m48.100s
user    0m35.405s
sys     0m10.839s

That is almost a fourfold difference.
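
For reference, the ratios can be worked out from the `time` output of the two runs above. This is a minimal sketch; `ratio` is a hypothetical helper written for this example, and the numbers are the reported timings converted to seconds.

```shell
# Ratio of two durations in seconds, rounded to two decimals.
# ratio() is a hypothetical helper for this sketch, not part of any tool.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", a / b }'
}

# ffmpeg timing first, gstreamer timing second.
echo "real: $(ratio 161.066 108.100)x"   # 2m41.066s vs 1m48.100s
echo "user: $(ratio 98.532 35.405)x"     # 1m38.532s vs 0m35.405s
echo "sys:  $(ratio 38.053 10.839)x"     # 0m38.053s vs 0m10.839s
```

With these figures the wall-clock gap is about 1.5x, while the CPU-time gap is larger: about 2.8x for user time and about 3.5x for sys time.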

nyanmisaka commented 5 months ago

-hwaccel_output_format drm_prime is required to avoid unnecessary copies.

Nvidia has a document that explains it very well. https://developer.nvidia.com/blog/nvidia-ffmpeg-transcoding-guide/
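
As a sketch, the benchmark command from the report with that option added would look like the following. Only `-hwaccel_output_format drm_prime` is new; the other flags and the file name `TCL4K.mkv` are taken from the original command. The `command -v` guard and the `INPUT` and `FFMPEG_ARGS` variables are illustrative additions, not part of the original report.

```shell
# Test file from the report.
INPUT="TCL4K.mkv"

# Same flags as the original run, plus -hwaccel_output_format drm_prime so that
# decoded frames stay in DRM PRIME buffers instead of being copied to system memory.
FFMPEG_ARGS="-hwaccel rkmpp -hwaccel_output_format drm_prime -i $INPUT -an -benchmark -f null -"

# Illustrative guard: only run when ffmpeg and the input file are available.
if command -v ffmpeg >/dev/null 2>&1 && [ -f "$INPUT" ]; then
    # Intentionally unquoted so the shell splits the string into separate arguments.
    ffmpeg $FFMPEG_ARGS
else
    echo "would run: ffmpeg $FFMPEG_ARGS"
fi
```

The `-benchmark` flag already prints `utime`, `stime`, and `rtime`, so the result can be compared directly with the `bench:` lines in the report.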

colorhacker commented 5 months ago

Thanks, it's exactly what you said.