Open rogerzhou-github opened 6 years ago
@rogerzhou-github thanks for your details analysis. we will probe this and check whether it is an issue.
@ruil2 Thanks! Please let me know if you need any more information.
Seems fine to me, with two questions:
Agree that the fix is confusing. I tried to fix the "R" in D + lumbdaR, but seems the number of bits is hard coded in assembly code. To avoid modify assembly code, I removed the rlumbda part by
int32_t i16_pred_sad = iBestCost - (iBestMode == 3 ? 4 : iBestMode == 0 ? 0 : 2) * iLambda;
I think we can skip this step: just add some "r*lumbda" on top of current cost. There will be no big difference.
I will provide another fix and rerun the test. I will also cover larger bit rate range.
@rogerzhou-github Next time when you post diffs please use unified diffs diff -u and paste using syntax highlighted code on github like:
```diff <your diff output> ```
to get output like:
--- a/codec/encoder/core/src/svc_base_layer_md.cpp
+++ b/codec/encoder/core/src/svc_base_layer_md.cpp
@@ -390,8 +390,15 @@ int32_t WelsMdI16x16 (SWelsFuncPtrList* pFunc, SDqLayer* pCurDqLayer, SMbCache*
pFunc->pfGetLumaI16x16Pred[iBestMode] (pDst, pDec, iLineSizeDec);
}
iIdx = 1;
+ bla bla
- blabla
It will make it much nicer to look at for all and use as patch for the devs :wink:
any update on this? Like latest BD-rate or BD-PSNR with the modification?
I think the way we estimate the number of bits for I16 mode can be improved. There are 4 prediction modes (Vertical, Horizontal, DC and Plan). For baseline profile, the type is encoded using EG coding. In current implementation, it is assumed the type value (0 ~ 3) is encoded directly into bitstream as “1”, “010”, “011”, and “00100”, which take 1, 3, 3, and 5 bits correspondingly . However, according to “7.4.5 Macroblock layer semantics” in H.264 recommendation, the prediction type is jointly encoded with two other values, the Luma CBP and Chroma CBP. In P Slices, an offset of 5 is added to the mb_type value of Intra to accommodate mb types of P Block, taking even more bits to encode the mb_type of I16x16. Furthermore, when a macroblock is encoded as Intra, chroma prediction mode is encoded, which takes 1~5 bits. The actual number of bits encoded into bitstream varies from 2 ~ 14 bits, while in current implementation it is estimated 1 ~ 4 bits. As a result, the encoder favors intra modes since the number of bits for encoding intra modes is under estimated. It causes problem at low bitrate where P_Skip mode is more efficient.
One of optional fix is provided. Our test shows that with the fix 3.0% bits can be saved and there is 4.0% speed up on arm phones.
The fix can be:
diff --git a/codec/encoder/core/src/svc_base_layer_md.cpp b/codec/encoder/core/src/svc_base_layer_md.cpp index 5acc2d92..8e3fa005 100644 --- a/codec/encoder/core/src/svc_base_layer_md.cpp +++ b/codec/encoder/core/src/svc_base_layer_md.cpp @@ -390,8 +390,15 @@ int32_t WelsMdI16x16 (SWelsFuncPtrList pFunc, SDqLayer pCurDqLayer, SMbCache* pFunc->pfGetLumaI16x16Pred[iBestMode] (pDst, pDec, iLineSizeDec); } iIdx = 1;
ifdef INTRA_16X16_BITS_EST_FIX
else
endif
ifdef INTRA_16X16_BITS_EST_FIX
else
endif
if (iCurCost < iBestCost) { iBestMode = iCurMode; iBestCost = iCurCost;