OpenVisualCloud / SVT-HEVC

SVT HEVC encoder. Scalable Video Technology (SVT) is a software-based video coding technology that is highly optimized for Intel® Xeon® processors. Using the open source SVT-HEVC encoder, it is possible to spread video encoding processing across multiple Intel® Xeon® processors to achieve a real advantage of processing efficiency.

num_reorder_pics is not set correctly #469

Closed rxtsolar closed 4 years ago

rxtsolar commented 4 years ago

In EbEntropyCoding.c, the following fields are hardcoded according to maxDpbSize, which is defined in function ComputeMaxDpbBuffer() by frameSize and levelIdc.

vps_max_dec_pic_buffering_minus1
vps_num_reorder_pics
vps_max_latency_increase_plus1
sps_max_dec_pic_buffering_minus1
sps_num_reorder_pics
sps_max_latency_increase_plus1

The consequence is that even when the stream is encoded in flat low-delay P mode (IPPPP...), the decoder still needs to fill a certain number of DPB slots before a decoded picture can be dequeued. This causes problems for applications that need low-latency decoding.

In my case, I was using Android MediaCodec as the decoder. The delay is 16 frames in the worst case (frameSize <= maxLumaPicSize >> 2). For a 30 fps video, that increases the decoding delay by roughly 0.5 s.
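
For reference, the worst case above matches the MaxDpbSize derivation in the HEVC specification (Annex A, general tier and level limits). The following is a minimal sketch of that derivation as I understand it, not a copy of SVT-HEVC's ComputeMaxDpbBuffer(), which may differ in detail. The names max_dpb_size, min_u32 and MAX_DPB_PIC_BUF are hypothetical, and max_luma_ps stands for the level's MaxLumaPs limit (2228224 for levels 4/4.1, 8912896 for levels 5/5.1/5.2).

```c
#include <stdint.h>

/* maxDpbPicBuf from the HEVC level limits (assumed to be 6 for the
 * main profiles). */
enum { MAX_DPB_PIC_BUF = 6 };

static uint32_t min_u32(uint32_t a, uint32_t b)
{
    return a < b ? a : b;
}

/* Hypothetical helper: derive MaxDpbSize from the luma picture size and the
 * level's MaxLumaPs limit. The smaller the picture is relative to the level
 * limit, the more pictures the DPB is allowed to hold, capped at 16. */
static uint32_t max_dpb_size(uint32_t pic_size_in_samples_y, uint32_t max_luma_ps)
{
    if (pic_size_in_samples_y <= (max_luma_ps >> 2))
        return min_u32(4 * MAX_DPB_PIC_BUF, 16);
    if (pic_size_in_samples_y <= (max_luma_ps >> 1))
        return min_u32(2 * MAX_DPB_PIC_BUF, 16);
    if (pic_size_in_samples_y <= ((3 * max_luma_ps) >> 2))
        return min_u32((4 * MAX_DPB_PIC_BUF) / 3, 16);
    return MAX_DPB_PIC_BUF;
}
```

Under these assumptions, a 1920x1080 picture gives max_dpb_size(1920 * 1080, 2228224) == 6 at level 4.1 and max_dpb_size(1920 * 1080, 8912896) == 16 at level 5, so writing maxDpbSize - 1 into the reorder fields yields 5 and 15 respectively.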

I guess this could be fixed along with https://github.com/OpenVisualCloud/SVT-HEVC/issues/318.

tianjunwork commented 4 years ago

Hi @rxtsolar, thank you for posting the issue. We will look into it.

intelmark commented 4 years ago

Hi @rxtsolar - Are you able to provide a sample command line for your encoding? Are you specifying a -level parameter and if so, would decreasing the level help reduce the calculated maxDpbSize?

Do you think the maxDpbSize calculation in EbEntropyCoding.c is incorrect?

Thanks for any clarification

rxtsolar commented 4 years ago

Hi @intelmark - sorry for the confusion.

I meant to say that it is the max_num_reorder_pics fields in the VPS and SPS that are not set correctly. They are too conservative because they are hardcoded from maxDpbSize. Here is an example:

ffmpeg -f lavfi -i testsrc=1920x1080:60:2,format=yuv420p -c:v libsvt_hevc -hielevel 0 -bl_mode 1 -g 30 -sc_detection 0 svt.h265

vs

ffmpeg -f lavfi -i testsrc=1920x1080:60:2,format=yuv420p -c:v libx265 -x265-params bframes=0:keyint=30:min_keyint=30:scenecut=0 x265.h265

Both of them generated 2-second 30-frame-GOP flat IPPPP bitstreams. However:

$ ffmpeg -i svt.h265 -c copy -bsf trace_headers -f null - 2>&1 | grep reorder
[AVBSFContext @ 0x3e52100] 150         vps_max_num_reorder_pics[i]                             00110 = 5
[AVBSFContext @ 0x3e52100] 182         sps_max_num_reorder_pics[i]                             00110 = 5
$ ffmpeg -i x265.h265 -c copy -bsf trace_headers -f null - 2>&1 | grep reorder
[AVBSFContext @ 0x3ba8740] 150         vps_max_num_reorder_pics[i]                                 1 = 0
[AVBSFContext @ 0x3ba8740] 180         sps_max_num_reorder_pics[i]                                 1 = 0
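
The bit strings in the trace are unsigned Exp-Golomb, ue(v), code words, which is how these fields are coded in the VPS and SPS. The sketch below shows how "00110" maps to 5 and "1" maps to 0. The helper decode_ue is hypothetical and works on a string of '0'/'1' characters purely for illustration; a real parser reads from a bit buffer instead.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: decode one ue(v) code word given as a string of
 * '0'/'1' characters. A code word is N leading '0' bits, a '1' marker, then
 * N info bits; its value is 2^N - 1 + info.
 * Returns -1 if the string is not exactly one complete code word. */
static long decode_ue(const char *bits)
{
    size_t len = strlen(bits);
    size_t zeros = 0;
    long info = 0;
    size_t i;

    /* Count the leading zero bits (the prefix). */
    while (zeros < len && bits[zeros] == '0')
        zeros++;

    /* Reject over-long prefixes and strings of the wrong total length. */
    if (zeros > 30 || len != 2 * zeros + 1 || bits[zeros] != '1')
        return -1;

    /* Read the info bits that follow the '1' marker, MSB first. */
    for (i = zeros + 1; i < len; i++) {
        if (bits[i] != '0' && bits[i] != '1')
            return -1;
        info = (info << 1) | (bits[i] - '0');
    }

    return (1L << zeros) - 1 + info;
}
```

With this helper, decode_ue("00110") evaluates to 5 and decode_ue("1") evaluates to 0, matching the values printed by trace_headers for the two streams.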

Some decoders (in my case Android MediaCodec) do not start to output decoded frames until sps_max_num_reorder_pics frames have been fed to them. In this case the delay would be 5 frames. In the worst case it can be 15 if a different level is used. Ideally, these fields should be set to 0, as libx265 does, instead of being derived from the maximum possible DPB size.
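
The delay can be illustrated with a toy model of the output ("bumping") rule from the HEVC specification, under which a picture is output once the number of pictures waiting for output exceeds sps_max_num_reorder_pics. This is only a sketch under that assumption; an actual decoder such as MediaCodec may apply a different policy. The function names are hypothetical.

```c
/* Hypothetical model: feed pictures one at a time and return how many must
 * be fed before the first one is output. A picture is released only when
 * the count of pictures waiting for output exceeds max_num_reorder_pics. */
static int inputs_before_first_output(int max_num_reorder_pics)
{
    int waiting = 0; /* decoded pictures not yet output */
    int fed = 0;     /* pictures handed to the decoder so far */

    for (;;) {
        fed++;
        waiting++;
        if (waiting > max_num_reorder_pics)
            return fed;
    }
}

/* Extra output latency, in milliseconds, compared with a stream that
 * signals zero reorder pictures. */
static double added_latency_ms(int max_num_reorder_pics, double fps)
{
    int extra_frames = inputs_before_first_output(max_num_reorder_pics) - 1;
    return extra_frames * 1000.0 / fps;
}
```

In this model a signalled value of 5 means the sixth picture must arrive before the first is output, while a value of 0 lets each picture out as soon as it is decoded. At 30 fps, a value of 15 corresponds to 500 ms of added latency.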

This won't cause problems for most applications, though. Only some low-latency players/apps would be affected.

tianjunwork commented 4 years ago

Hi @rxtsolar, thanks for pointing this out with a detailed explanation. While we are working on the fix, you can stay unblocked by hard-coding the values as shown below. Note that this only works with flat IPPP...

diff --git a/Source/Lib/Codec/EbEntropyCoding.c b/Source/Lib/Codec/EbEntropyCoding.c
index 06cb580..8f576fa 100644
--- a/Source/Lib/Codec/EbEntropyCoding.c
+++ b/Source/Lib/Codec/EbEntropyCoding.c
@@ -5447,11 +5447,13 @@ static void CodeVPS(
        // Max Dec Pic Buffering [i]
        WriteUvlc(
            bitstreamPtr,
-           scsPtr->maxDpbSize - 1/*4*/);//(1 << (scsPtr->maxTemporalLayers-syntaxItr)));
+            0);
+           //scsPtr->maxDpbSize - 1/*4*/);//(1 << (scsPtr->maxTemporalLayers-syntaxItr)));

        WriteUvlc(
            bitstreamPtr,
-           scsPtr->maxDpbSize - 1/*3*/);//(1 << (scsPtr->maxTemporalLayers-syntaxItr)));
+            0);
+           //scsPtr->maxDpbSize - 1/*3*/);//(1 << (scsPtr->maxTemporalLayers-syntaxItr)));

        WriteUvlc(
            bitstreamPtr,
@@ -6069,11 +6071,13 @@ static void CodeSPS(
        // Supports Main Profile Level 5 (Max Picture resolution 4Kx2K, up to 6 Hierarchy Levels)
        WriteUvlc(
            bitstreamPtr,
-           scsPtr->maxDpbSize - 1/*4*/);//(1 << (scsPtr->maxTemporalLayers-syntaxItr)));
+            0);
+           //scsPtr->maxDpbSize - 1/*4*/);//(1 << (scsPtr->maxTemporalLayers-syntaxItr)));

        WriteUvlc(
            bitstreamPtr,
-           scsPtr->maxDpbSize - 1/*3*/);//(1 << (scsPtr->maxTemporalLayers-syntaxItr)));
+            0);
+           //scsPtr->maxDpbSize - 1/*3*/);//(1 << (scsPtr->maxTemporalLayers-syntaxItr)));

        WriteUvlc(
            bitstreamPtr,
-- 
rxtsolar commented 4 years ago

Thanks @tianjunwork for the temp workaround. That is what I am doing right now ;)

tianjunwork commented 4 years ago

Fixed by #480