daiyanbao / libyuv

Automatically exported from code.google.com/p/libyuv
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

InterpolateRow rounding inconsistent #535

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
InterpolateRow for SSSE3/AVX2 uses pavgb for some cases, and pmaddubsw for 
general case, but without rounding.
the pmaddubsw version using a 7 bit interpolant.

Using a code similar to BlendPlane, the image could be biased by -128, then use 
full 8 bit interpolant.  After pmaddubsw, add 128 and round.
Make C version similar.

Caveat Previously exactly 100% would specialize to do a memcpy, which avoids 
memory access that cold segfault.

Original issue reported on code.google.com by fbarch...@chromium.org on 16 Dec 2015 at 9:25

GoogleCodeExporter commented 8 years ago
Using signed math requires a sub and an add.

ARGBInterpolate192_Opt (5506 ms)
ARGBInterpolate128_Opt (5377 ms)
ARGBInterpolate0_Opt (4909 ms)
ARGBInterpolate255_Opt (4854 ms)
ARGBInterpolate64_Opt (4776 ms)

Using the existing code but with an add
ARGBInterpolate255_Opt (4916 ms)
ARGBInterpolate64_Opt (4363 ms)
ARGBInterpolate192_Opt (4324 ms)
ARGBInterpolate128_Opt (3500 ms)
ARGBInterpolate0_Opt (2725 ms)

C code adjusted to 7 bit math with rounding.

Original comment by fbarch...@google.com on 17 Dec 2015 at 6:26

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/f4447745ae646749562cff00e9d8fbc3b9043580

commit f4447745ae646749562cff00e9d8fbc3b9043580
Author: Frank Barchard <fbarchard@google.com>
Date: Thu Dec 17 23:24:06 2015

Add rounding to InterpolateRow for improved quality and consistency.

Remove inaccurate specializations for 1/4 and 3/4, since they round
incorrectly.  Specialize for 100% and 50% are kept due to performance.
Make C and ARM code match SSSE3.
Make unittests expect zero difference.

BUG=libyuv:535
R=harryjin@google.com

Review URL: https://codereview.chromium.org/1533643005 .

[modify] 
http://crrev.com/f4447745ae646749562cff00e9d8fbc3b9043580/include/libyuv/row.h
[modify] 
http://crrev.com/f4447745ae646749562cff00e9d8fbc3b9043580/source/planar_function
s.cc
[modify] 
http://crrev.com/f4447745ae646749562cff00e9d8fbc3b9043580/source/row_any.cc
[modify] 
http://crrev.com/f4447745ae646749562cff00e9d8fbc3b9043580/source/row_common.cc
[modify] 
http://crrev.com/f4447745ae646749562cff00e9d8fbc3b9043580/source/row_gcc.cc
[modify] 
http://crrev.com/f4447745ae646749562cff00e9d8fbc3b9043580/source/row_neon.cc
[modify] 
http://crrev.com/f4447745ae646749562cff00e9d8fbc3b9043580/source/row_neon64.cc
[modify] 
http://crrev.com/f4447745ae646749562cff00e9d8fbc3b9043580/source/row_win.cc
[modify] 
http://crrev.com/f4447745ae646749562cff00e9d8fbc3b9043580/source/scale.cc
[modify] 
http://crrev.com/f4447745ae646749562cff00e9d8fbc3b9043580/source/scale_argb.cc
[modify] 
http://crrev.com/f4447745ae646749562cff00e9d8fbc3b9043580/source/scale_common.cc
[modify] 
http://crrev.com/f4447745ae646749562cff00e9d8fbc3b9043580/unit_test/planar_test.
cc

Original comment by bugdroid1@chromium.org on 17 Dec 2015 at 11:24

GoogleCodeExporter commented 8 years ago
math switched to rounding 7 bit fixed point.
interpolant can go 0 to 256 where 256 is 100% second row.
internally shifts to 7 bit, then +64 >> 7.

could use signed 8 bit. 
0 to 256 becomes 256 to 0 but 256 is specialized to memcpy.
so 256 - terp becomes 255 to 0

Original comment by fbarch...@google.com on 17 Dec 2015 at 11:29

GoogleCodeExporter commented 8 years ago
C, SSSE3 and AVX2 all match, but with low precision, and unittests dont test 
for exactness everywhere.

Original comment by fbarch...@google.com on 19 Dec 2015 at 4:44

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/3f4d86053e407a2016c824e40d9bbe2413b73e5c

commit 3f4d86053e407a2016c824e40d9bbe2413b73e5c
Author: Frank Barchard <fbarchard@google.com>
Date: Mon Dec 21 18:57:32 2015

avx2 interpolate use 8 bit

BUG=libyuv:535
R=dhrosa@google.com

Review URL: https://codereview.chromium.org/1535833003 .

[modify] 
http://crrev.com/3f4d86053e407a2016c824e40d9bbe2413b73e5c/README.chromium
[modify] 
http://crrev.com/3f4d86053e407a2016c824e40d9bbe2413b73e5c/include/libyuv/version
.h
[modify] 
http://crrev.com/3f4d86053e407a2016c824e40d9bbe2413b73e5c/source/row_common.cc
[modify] 
http://crrev.com/3f4d86053e407a2016c824e40d9bbe2413b73e5c/source/row_gcc.cc
[modify] 
http://crrev.com/3f4d86053e407a2016c824e40d9bbe2413b73e5c/source/row_neon.cc
[modify] 
http://crrev.com/3f4d86053e407a2016c824e40d9bbe2413b73e5c/source/row_neon64.cc
[modify] 
http://crrev.com/3f4d86053e407a2016c824e40d9bbe2413b73e5c/source/row_win.cc
[modify] 
http://crrev.com/3f4d86053e407a2016c824e40d9bbe2413b73e5c/unit_test/planar_test.
cc

Original comment by bugdroid1@chromium.org on 21 Dec 2015 at 6:57

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/2cb2e9e1ad8df0d2733b1498f1d1083070f210d4

commit 2cb2e9e1ad8df0d2733b1498f1d1083070f210d4
Author: Frank Barchard <fbarchard@google.com>
Date: Tue Dec 22 02:35:12 2015

fix for InterpolateRow_AVX2

TBR=harryjin@google.com
BUG=libyuv:535

Review URL: https://codereview.chromium.org/1543773002 .

[modify] 
http://crrev.com/2cb2e9e1ad8df0d2733b1498f1d1083070f210d4/README.chromium
[modify] 
http://crrev.com/2cb2e9e1ad8df0d2733b1498f1d1083070f210d4/include/libyuv/version
.h
[modify] 
http://crrev.com/2cb2e9e1ad8df0d2733b1498f1d1083070f210d4/source/row_gcc.cc

Original comment by bugdroid1@chromium.org on 22 Dec 2015 at 2:35

GoogleCodeExporter commented 8 years ago

Original comment by fbarch...@google.com on 22 Dec 2015 at 10:46