mannuray / libyuv

Automatically exported from code.google.com/p/libyuv
BSD 3-Clause "New" or "Revised" License

linux top bottlenecks #492

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Investigate top bottlenecks

LIBYUV_DISABLE_AVX2=1 LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=1000 perf record out/Release/libyuv_unittest --gtest_filter=*
perf report

 13.81%  libyuv_unittest  libyuv_unittest      [.] libyuv::libyuvTest_ScaleTestRoundToByte_Test::T
 13.81%  libyuv_unittest  libyuv_unittest      [.] libyuv::libyuvTest_TestRoundToByte_Test::TestBo
  4.94%  libyuv_unittest  libyuv_unittest      [.] ScaleAddRow_C
  4.07%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEvenBox_SSE2
  3.63%  libyuv_unittest  libyuv_unittest      [.] InterpolateRow_SSSE3
  3.57%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBMatrixRow_SSSE3
  3.06%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBFilterCols_SSSE3
  3.02%  libyuv_unittest  libyuv_unittest      [.] ScaleFilterCols_SSSE3
  2.63%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDown2Box_SSE2
  2.58%  libyuv_unittest  libyuv_unittest      [.] libyuv::ScaleARGB(unsigned char const*, int, in
  2.57%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565DitherRow_C
  2.45%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBCols_SSE2
  2.44%  libyuv_unittest  libc-2.19.so         [.] __random_r
  2.23%  libyuv_unittest  libyuv_unittest      [.] CopyRow_ERMS
  1.64%  libyuv_unittest  libyuv_unittest      [.] I422ToABGRMatrixRow_SSSE3
  1.46%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEven_SSE2
  1.29%  libyuv_unittest  libyuv_unittest      [.] FixedDiv_X86
  1.26%  libyuv_unittest  libyuv_unittest      [.] libyuv::ScaleAddCols1_C(int, int, int, int, uns
  1.24%  libyuv_unittest  libyuv_unittest      [.] ARGBShuffleRow_SSSE3
  1.21%  libyuv_unittest  libyuv_unittest      [.] CumulativeSumToAverageRow_SSE2
  1.14%  libyuv_unittest  libc-2.19.so         [.] __random
  1.08%  libyuv_unittest  libyuv_unittest      [.] ScaleCols_C
  0.99%  libyuv_unittest  libyuv_unittest      [.] ARGBToYRow_SSSE3
  0.75%  libyuv_unittest  libyuv_unittest      [.] ComputeCumulativeSumRow_SSE2
  0.75%  libyuv_unittest  libc-2.19.so         [.] _int_malloc
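
To see how the time in a report like the one above splits between C fallbacks and SIMD row functions, the overhead column can be summed per symbol suffix. This is only a rough sketch: the function name `overhead_by_suffix` is hypothetical, the suffix list is assumed from the symbols visible in this report, and the input is assumed to be the plain-text output of `perf report` saved to a file.

```python
import re
from collections import defaultdict

# One line of `perf report` text output:
#   overhead%  command  shared-object  [.] symbol
LINE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)%\s+(\S+)\s+(\S+)\s+\[.\]\s+(.+?)\s*$")

# Implementation suffix of a row function. The list is an assumption taken
# from the symbols that appear in the report, not an exhaustive set.
SUFFIX_RE = re.compile(r"_(C|SSE2|SSSE3|ERMS|X86)(?=\(|$)")


def overhead_by_suffix(report_text):
    """Sum the overhead percentage per implementation suffix."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip headers, blank lines and comments
        overhead = float(m.group(1))
        symbol = m.group(4)
        suffix = SUFFIX_RE.search(symbol)
        # Symbols without a known suffix (tests, libc) go under "other".
        totals[suffix.group(1) if suffix else "other"] += overhead
    return dict(totals)
```

Calling `overhead_by_suffix(open("out.txt").read())` on a saved report would give one total per suffix, which makes it easier to tell whether the remaining C paths are worth optimizing.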

Original issue reported on code.google.com by fbarch...@google.com on 16 Sep 2015 at 11:36

GoogleCodeExporter commented 9 years ago
r1483 removes the redundant scale rounding test.

The rounding test is still the top bottleneck on Linux, though.

 16.52%  libyuv_unittest  libyuv_unittest      [.] libyuv::libyuvTest_TestRoundToByte_Test::TestBody()

Original comment by fbarch...@google.com on 17 Sep 2015 at 5:28

GoogleCodeExporter commented 9 years ago
The following is a complete list of the C functions that show up in the profile (there should be none):

LIBYUV_FLAGS=-1 LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=999 perf record out/Release/libyuv_unittest --gtest_filter=*
perf report >out.txt
grep _C out.txt

     5.88%  libyuv_unittest  libyuv_unittest      [.] ScaleAddRow_C
     3.08%  libyuv_unittest  libyuv_unittest      [.] ARGBToRGB565DitherRow_C
     1.38%  libyuv_unittest  libyuv_unittest      [.] libyuv::ScaleAddCols1_C(int, int, int, int, unsigned short const*, unsigned char*)
     1.28%  libyuv_unittest  libyuv_unittest      [.] ScaleCols_C
     0.52%  libyuv_unittest  libyuv_unittest      [.] ARGBToUV411Row_C
     0.25%  libyuv_unittest  libyuv_unittest      [.] ScaleARGBRowDownEven_C
     0.14%  libyuv_unittest  libyuv_unittest      [.] libyuv::ScaleAddCols2_C(int, int, int, int, unsigned short const*, unsigned char*)
     0.07%  libyuv_unittest  libyuv_unittest      [.] ScaleColsUp2_C
     0.03%  libyuv_unittest  libyuv_unittest      [.] MirrorUVRow_C
     0.01%  libyuv_unittest  libyuv_unittest      [.] TransposeWx8_C
     0.01%  libyuv_unittest  libyuv_unittest      [.] TransposeWxH_C
     0.01%  libyuv_unittest  libyuv_unittest      [.] ScaleRowDown34_0_Box_C
     0.00%  libyuv_unittest  libyuv_unittest      [.] ScaleRowDown34_1_Box_C
     0.00%  libyuv_unittest  libyuv_unittest      [.] TransposeUVWx8_C
     0.00%  libyuv_unittest  libyuv_unittest      [.] ScaleRowDown38_3_Box_C
     0.00%  libyuv_unittest  libyuv_unittest      [.] ScaleRowDown2Box_C
     0.00%  libyuv_unittest  libyuv_unittest      [.] ScaleRowDown34_C
     0.00%  libyuv_unittest  libyuv_unittest      [.] ScaleRowDown38_2_Box_C
     0.00%  libyuv_unittest  libyuv_unittest      [.] libyuv::libyuvTest_CropNV12_Test::TestBody()
     0.00%  libyuv_unittest  libyuv_unittest      [.] ScaleRowDown38_C
     0.00%  libyuv_unittest  libyuv_unittest      [.] ARGBToUVJ422Row_C
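
Note that a plain `grep _C` also matches `libyuvTest_CropNV12_Test`, which is a test body rather than a C row function. A stricter filter might look like the sketch below, which keeps only symbols whose name ends in `_C`, optionally followed by a C++ argument list. The helper name `c_fallback_lines` is hypothetical, and the input is assumed to be the contents of `out.txt`.

```python
import re

# Symbol column of a `perf report` line, after the "[.]" marker.
SYMBOL_RE = re.compile(r"\[\.\]\s+(\S+)")

# A C row function name ends in "_C", either at the end of the symbol
# or directly before a C++ argument list such as "(int, int, ...)".
C_NAME_RE = re.compile(r"_C(\(|$)")


def c_fallback_lines(report_text):
    """Return the report lines whose symbol name ends in _C."""
    hits = []
    for line in report_text.splitlines():
        m = SYMBOL_RE.search(line)
        if m and C_NAME_RE.search(m.group(1)):
            hits.append(line.strip())
    return hits
```

With this filter the `CropNV12` test line drops out, while `ScaleAddRow_C` and `libyuv::ScaleAddCols1_C(...)` are kept.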

Original comment by fbarch...@google.com on 17 Sep 2015 at 6:35