Open idontcare99999 opened 1 year ago
Yeah - the h265 path is just that bit more efficient. Pretty sure that nearly all the CPU time is external to ffmpeg for the M2M case.
I ran another test where I compiled [dev/5.1.2/rpi_import_1] and compared hwaccel against CPU-only decoding. Only the -hwaccel drm (h265) path appears to benefit:
.39 is h264 without hwaccel at 40 seconds CPU
.30 is h264 with hwaccel (-c:v h264_v4l2m2m) at 44 seconds CPU (slower?)
.26 is h265 without hwaccel at 51 seconds CPU
.25 is h265 with hwaccel (-hwaccel drm) at 16 seconds CPU (much faster!)
root 5052 5042 5 16:01 ? 00:00:40 ffmpeg -hide_banner -loglevel warning -avoid_negative_ts make_zero -fflags +genpts+discardcorrupt -rtsp_transport tcp -timeout 5000000 -use_wallclock_as_timestamps 1 -i rtsp://admin:scr1mag3@192.168.254.39:554/cam/realmonitor?channel=1&subtype=1 -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c copy -an /tmp/cache/driveway-%Y%m%d%H%M%S.mp4 -c copy -f flv rtmp://127.0.0.1/live/driveway -r 5 -s 704x480 -f rawvideo -pix_fmt yuv420p pipe:
root 5059 5049 2 16:01 ? 00:00:16 ffmpeg -hide_banner -loglevel warning -hwaccel drm -avoid_negative_ts make_zero -fflags +genpts+discardcorrupt -rtsp_transport tcp -timeout 5000000 -use_wallclock_as_timestamps 1 -i rtsp://admin:scr1mag3@192.168.254.25:554/cam/realmonitor?channel=1&subtype=1 -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c copy -an /tmp/cache/pasture-%Y%m%d%H%M%S.mp4 -c copy -f flv rtmp://127.0.0.1/live/pasture -r 5 -s 704x480 -f rawvideo -pix_fmt yuv420p pipe:
root 5068 5046 6 16:01 ? 00:00:44 ffmpeg -hide_banner -loglevel warning -c:v h264_v4l2m2m -avoid_negative_ts make_zero -fflags +genpts+discardcorrupt -rtsp_transport tcp -timeout 5000000 -use_wallclock_as_timestamps 1 -i rtsp://admin:scr1mag3@192.168.254.30:554/cam/realmonitor?channel=1&subtype=1 -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c copy -an /tmp/cache/barn-%Y%m%d%H%M%S.mp4 -c copy -f flv rtmp://127.0.0.1/live/barn -r 5 -s 704x480 -f rawvideo -pix_fmt yuv420p pipe:
root 5072 5057 7 16:01 ? 00:00:51 ffmpeg -hide_banner -loglevel warning -avoid_negative_ts make_zero -fflags +genpts+discardcorrupt -rtsp_transport tcp -timeout 5000000 -use_wallclock_as_timestamps 1 -i rtsp://admin:scr1mag3@192.168.254.26:554/cam/realmonitor?channel=1&subtype=1 -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c copy -an /tmp/cache/frontyard-%Y%m%d%H%M%S.mp4 -c copy -f flv rtmp://127.0.0.1/live/frontyard -r 5 -s 704x480 -f rawvideo -pix_fmt yuv420p pipe:
And here's test/4.3.4/rpi_main, which looks much more reasonable: hwaccel h265 uses the least CPU, followed by hwaccel h264, then software h264, then software h265:
root 671594 671586 6 18:54 ? 00:02:37 ffmpeg -hide_banner -loglevel warning -avoid_negative_ts make_zero -fflags +genpts+discardcorrupt -rtsp_transport tcp -stimeout 5000000 -use_wallclock_as_timestamps 1 -i rtsp://admin:scr1mag3@192.168.254.30:554/cam/realmonitor?channel=1&subtype=1 -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c copy -an /tmp/cache/barn-%Y%m%d%H%M%S.mp4 -c copy -f flv rtmp://127.0.0.1/live/barn -r 5 -s 704x480 -f rawvideo -pix_fmt yuv420p pipe:
root 671596 671585 4 18:54 ? 00:01:48 ffmpeg -hide_banner -loglevel warning -c:v h264_v4l2m2m -avoid_negative_ts make_zero -fflags +genpts+discardcorrupt -rtsp_transport tcp -stimeout 5000000 -use_wallclock_as_timestamps 1 -i rtsp://admin:scr1mag3@192.168.254.39:554/cam/realmonitor?channel=1&subtype=1 -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c copy -an /tmp/cache/driveway-%Y%m%d%H%M%S.mp4 -c copy -f flv rtmp://127.0.0.1/live/driveway -r 5 -s 704x480 -f rawvideo -pix_fmt yuv420p pipe:
root 671604 671595 2 18:54 ? 00:01:00 ffmpeg -hide_banner -loglevel warning -hwaccel drm -avoid_negative_ts make_zero -fflags +genpts+discardcorrupt -rtsp_transport tcp -stimeout 5000000 -use_wallclock_as_timestamps 1 -i rtsp://admin:scr1mag3@192.168.254.26:554/cam/realmonitor?channel=1&subtype=1 -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c copy -an /tmp/cache/frontyard-%Y%m%d%H%M%S.mp4 -c copy -f flv rtmp://127.0.0.1/live/frontyard -r 5 -s 704x480 -f rawvideo -pix_fmt yuv420p pipe:
root 671605 671590 8 18:54 ? 00:03:29 ffmpeg -hide_banner -loglevel warning -avoid_negative_ts make_zero -fflags +genpts+discardcorrupt -rtsp_transport tcp -stimeout 5000000 -use_wallclock_as_timestamps 1 -i rtsp://admin:scr1mag3@192.168.254.25:554/cam/realmonitor?channel=1&subtype=1 -f segment -segment_time 10 -segment_format mp4 -reset_timestamps 1 -strftime 1 -c copy -an /tmp/cache/pasture-%Y%m%d%H%M%S.mp4 -c copy -f flv rtmp://127.0.0.1/live/pasture -r 5 -s 704x480 -f rawvideo -pix_fmt yuv420p pipe:
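For anyone who wants to compare listings like these without reading the TIME column by eye, here is a rough Python sketch. It assumes the default `ps -ef` column layout (UID, PID, PPID, C, STIME, TTY, TIME, CMD) and guesses the decode path purely from the flags visible in the command line. The function names (`cpu_seconds`, `decode_mode`, `summarize`) are made up for this example and are not part of ffmpeg or frigate.

```python
import re

def cpu_seconds(time_field):
    """Convert a ps TIME field ([DD-]HH:MM:SS) to whole seconds."""
    days = 0
    if "-" in time_field:
        day_part, time_field = time_field.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(p) for p in time_field.split(":"))
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds

def decode_mode(cmdline):
    """Guess the decode path from the ffmpeg flags (assumption: only these two hwaccel styles are in use)."""
    if "-hwaccel drm" in cmdline:
        return "hwaccel drm"
    match = re.search(r"-c:v (\S+_v4l2m2m)", cmdline)
    if match:
        return match.group(1)
    return "software"

def summarize(ps_lines):
    """Return (camera_ip, decode_mode, cpu_seconds) tuples, lowest CPU time first."""
    rows = []
    for line in ps_lines:
        fields = line.split(None, 7)  # the 8th field is the whole command line
        if len(fields) < 8 or not fields[7].startswith("ffmpeg"):
            continue
        cam = re.search(r"@([\d.]+):", fields[7])
        ip = cam.group(1) if cam else "?"
        rows.append((ip, decode_mode(fields[7]), cpu_seconds(fields[6])))
    return sorted(rows, key=lambda row: row[2])

if __name__ == "__main__":
    import sys
    # Usage: ps -ef | grep ffmpeg | python3 this_script.py
    for ip, mode, secs in summarize(sys.stdin.read().splitlines()):
        print(f"{ip:<18}{mode:<16}{secs:>6}s")
```

CPU time is cumulative, so the numbers are only comparable between processes that started at the same time, as they did in each listing above.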
I've been compiling various versions of your code to test the performance of hardware acceleration with frigate. It's probably something I've done to myself, but just in case it's of interest I thought I'd share that I consistently measure HEVC at about 1/3 the CPU usage of h264.
I have four identical cameras: two output h264 to ffmpeg -c:v h264_v4l2m2m and two output h265 to ffmpeg -hwaccel drm. The process listing always shows the h264 streams using three times as much CPU time.
I appreciate your efforts improving our platform. Let me know if there's any testing I can help with.