Closed adeelabbas closed 1 year ago
Hi,
Thank you for bringing this issue to our attention. This seems to be due to the CPU filtering portion of the pipeline. For example, running
ffmpeg -y -f lavfi -i testsrc=duration=60:size=480x270:rate=60 -c:v mpsoc_vcu_h264 -f mp4 /dev/null
gives upwards of 840 fps:
Output #0, mp4, to '/dev/null':
Metadata:
encoder : Lavf58.76.100
Stream #0:0: Video: h264 (avc1 / 0x31637661), nv12(tv, progressive), 480x270 [SAR 1:1 DAR 16:9], q=2-31, 5000 kb/s, 60 fps, 15360 tbn
Metadata:
encoder : Lavc58.134.100 mpsoc_vcu_h264
frame= 3600 fps=847 q=-0.0 Lsize= 3132kB time=00:00:59.96 bitrate= 427.9kbits/s speed=14.1x
video:3098kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 1.099865
Adding the filtering, however, seems to limit the encoder's input throughput for both libx264 and mpsoc_vcu_h264. Running the following simplified pipeline:
ffmpeg -y -f lavfi -i testsrc=duration=60:size=2560x1440:rate=60 -f lavfi -i testsrc=duration=60:size=2560x1440:rate=60 -f lavfi -i testsrc=duration=60:size=2560x1440:rate=60 -filter_complex "[0:v]crop=854:1440:573:0,scale=160:270:flags=fast_bilinear,setpts=PTS-STARTPTS[vatom0_0];[1:v]crop=1822:1080:0:98,scale=160:270:flags=fast_bilinear,setpts=PTS-STARTPTS[vatom0_1];[2:v]crop=854:1440:691:0,scale=160:270:flags=fast_bilinear,setpts=PTS-STARTPTS[vatom0_2]; [vatom0_0][vatom0_1][vatom0_2]hstack=inputs=3[vout0]" -map "[vout0]" -filter_complex_threads 32 -r $RATE -c:v $ENC -f mp4 /dev/null
where RATE=60 and ENC is set to either libx264 or mpsoc_vcu_h264, yields roughly 120 fps:
Output #0, mp4, to '/dev/null':
Metadata:
encoder : Lavf59.16.100
Stream #0:0: Video: h264 (avc1 / 0x31637661), yuv444p(progressive), 480x270 [SAR 1281:1280 DAR 427:240], q=2-31, 60 fps, 15360 tbn
Metadata:
encoder : Lavc59.18.100 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
frame= 896 fps=126 q=-1.0 Lsize= 112kB time=00:00:14.88 bitrate= 61.5kbits/s speed=2.09x
and
EXE: /opt/xilinx/ffmpeg/bin/ffmpeg
[XMA] WARNING: ffmpeg xma-vcu-encoder device warning: !! The specified Level is too low and will be adjusted !!
Output #0, mp4, to '/dev/null':
Metadata:
encoder : Lavf58.76.100
Stream #0:0: Video: h264 (avc1 / 0x31637661), nv12(progressive), 480x270 [SAR 1281:1280 DAR 427:240], q=2-31, 5000 kb/s, 60 fps, 15360 tbn (default)
Metadata:
encoder : Lavc58.134.100 mpsoc_vcu_h264
frame= 655 fps=112 q=-0.0 Lsize= 675kB time=00:00:10.88 bitrate= 508.2kbits/s speed=1.86x
Raising the output rate by setting -r to a higher value, e.g., 240, results in ~480 fps in both cases:
Output #0, mp4, to '/dev/null':
Metadata:
encoder : Lavf59.16.100
Stream #0:0: Video: h264 (avc1 / 0x31637661), yuv444p(progressive), 480x270 [SAR 1281:1280 DAR 427:240], q=2-31, 240 fps, 15360 tbn
Metadata:
encoder : Lavc59.18.100 libx264
Side data:
cpb: bitrate max/min/avg: 0/0/0 buffer size: 0 vbv_delay: N/A
More than 1000 frames duplicated 0kB time=00:00:04.63 bitrate= 0.1kbits/s dup=876 drop=0 speed=1.85x
frame= 3100 fps=489 q=-1.0 Lsize= 274kB time=00:00:12.90 bitrate= 173.9kbits/s dup=2325 drop=0 speed=2.04x
and
Output #0, mp4, to '/dev/null':
Metadata:
encoder : Lavf58.76.100
Stream #0:0: Video: h264 (avc1 / 0x31637661), nv12(progressive), 480x270 [SAR 1281:1280 DAR 427:240], q=2-31, 5000 kb/s, 240 fps, 15360 tbn (default)
Metadata:
encoder : Lavc58.134.100 mpsoc_vcu_h264
More than 1000 frames duplicated 512kB time=00:00:04.80 bitrate= 872.4kbits/s dup=870 drop=0 speed=1.91x
frame= 8960 fps=468 q=-0.0 Lsize= 4261kB time=00:00:37.32 bitrate= 935.3kbits/s dup=6720 drop=0 speed=1.95x
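As a rough check on the two -r 240 logs above, the dup= counters suggest that most of the reported frames are duplicates inserted to reach the higher output rate, so the rate of unique frames leaving the filter graph may still be close to 120 fps. The sketch below only redoes that arithmetic; the helper name unique_fps is hypothetical, the inputs are copied from the final progress lines, and it assumes the fps and dup counters describe the same point in the run.

```python
def unique_fps(frames: int, dup: int, fps: float) -> float:
    """Estimate the rate of non-duplicated frames from an ffmpeg progress line."""
    unique_fraction = (frames - dup) / frames
    return fps * unique_fraction


# Values copied from the final progress lines of the two -r 240 runs.
libx264_unique = unique_fps(frames=3100, dup=2325, fps=489)
vcu_unique = unique_fps(frames=8960, dup=6720, fps=468)

print(f"libx264:        {libx264_unique:.0f} unique fps")
print(f"mpsoc_vcu_h264: {vcu_unique:.0f} unique fps")
```

If this reading is right, both encoders are still being fed at about 2x real time for a 60 fps source, which is consistent with the filter graph, rather than the encoder, being the limit.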
If possible, could you kindly share:
your libx264 pipeline, as I was not able to reproduce your fps rate
the frame rate and resolution of your inputs
Cheers,
Hi, further to the above, one of our goals with U30 has been high-density, high-VQ real-time processing. Our current suggestion for as-fast-as-possible encoding of VOD assets is to split the file and transcode all segments in parallel (similar to https://github.com/Xilinx/video-sdk/blob/v2.0/examples/ffmpeg/tutorials/13_ffmpeg_transcode_only_split_stitch.py). Hope this helps. Cheers,
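The split-and-stitch idea can be sketched roughly as follows. This is not the linked tutorial script, only a minimal illustration that builds the ffmpeg command lines without running them. The function names, the segment file naming, and the equal-length time split are assumptions made for illustration; device assignment on the U30, audio handling, and any encoder-specific options are left out, and the linked script is the reference for those.

```python
def segment_commands(src, duration_s, n_segments, encoder="mpsoc_vcu_h264"):
    """Build one ffmpeg command per equal-length segment (nothing is executed)."""
    seg_len = duration_s / n_segments
    cmds = []
    for i in range(n_segments):
        cmds.append([
            "ffmpeg", "-y",
            "-ss", f"{i * seg_len:.3f}",   # seek to the segment start (input option)
            "-t", f"{seg_len:.3f}",        # read only one segment's worth of input
            "-i", src,
            "-c:v", encoder,
            f"segment_{i:03d}.mp4",        # hypothetical output naming
        ])
    return cmds


def concat_list(n_segments):
    """Text for a concat-demuxer list file naming the encoded segments in order."""
    return "".join(f"file 'segment_{i:03d}.mp4'\n" for i in range(n_segments))


def stitch_command(list_path, out_path):
    """Build the command that joins the segments without re-encoding."""
    return ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
            "-i", list_path, "-c", "copy", out_path]


cmds = segment_commands("input.mp4", duration_s=60, n_segments=4)
for cmd in cmds:
    print(" ".join(cmd))
print(concat_list(4), end="")
print(" ".join(stitch_command("segments.txt", "output.mp4")))
```

The per-segment commands would then be launched concurrently (for example with subprocess and a process pool), and the stitch command run once all of them have finished.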
Hi, I am trying to encode very low-resolution (270p) video using HW encoding, and I want it to be faster than software encoding with x264 on a c6a.8xlarge. My command is as follows:
I am getting about 180 fps on a vt1.6xlarge instance. On the c6a.8xlarge, I get about 360 fps using the ultrafast encoding preset.
Is there any way to run the Xilinx HW encoder much faster? (I don't care about quality; I just want it to run as fast as possible.)