daiyanbao / libyuv

Automatically exported from code.google.com/p/libyuv
BSD 3-Clause "New" or "Revised" License

YUV alpha blender #527

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
libyuv has ARGB alpha blending, but not YUVA.
Implement YUV alpha blend onto YUV.

Original issue reported on code.google.com by fbarch...@google.com on 1 Dec 2015 at 3:27
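
To make the request concrete, here is a minimal sketch of blending one row of a single 8-bit plane (Y, U, or V) with a separate 8-bit alpha plane. The function name `blend_plane_row_sketch` and the exact rounding are assumptions made for illustration; they are not taken from this thread and may differ from the code that landed in libyuv.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a libyuv API): blend one row of one plane.
 * dst = (fg * a + bg * (255 - a) + 255) >> 8
 * The +255 bias is an assumed rounding choice that makes a == 0 return bg
 * exactly and a == 255 return fg exactly. */
static void blend_plane_row_sketch(const uint8_t* fg, const uint8_t* bg,
                                   const uint8_t* alpha, uint8_t* dst,
                                   int width) {
  assert(width >= 0);
  for (int x = 0; x < width; ++x) {
    unsigned a = alpha[x];
    dst[x] = (uint8_t)((fg[x] * a + bg[x] * (255u - a) + 255u) >> 8);
  }
}
```

Calling this once per row for the Y plane, and once per row for U and V with a matching subsampled alpha, would give a YUV-onto-YUV blend under these assumptions.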

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/8af0ebf8166a141e0d8798cfee1ac6b3b9365511

commit 8af0ebf8166a141e0d8798cfee1ac6b3b9365511
Author: Frank Barchard <fbarchard@google.com>
Date: Wed Dec 02 22:20:17 2015

planar blend use signed images

R=dhrosa@google.com, harryjin@google.com, jzern@chromium.org
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1491533002 .

[modify] 
http://crrev.com/8af0ebf8166a141e0d8798cfee1ac6b3b9365511/README.chromium
[modify] 
http://crrev.com/8af0ebf8166a141e0d8798cfee1ac6b3b9365511/include/libyuv/row.h
[modify] 
http://crrev.com/8af0ebf8166a141e0d8798cfee1ac6b3b9365511/include/libyuv/version.h
[modify] 
http://crrev.com/8af0ebf8166a141e0d8798cfee1ac6b3b9365511/source/row_common.cc
[modify] 
http://crrev.com/8af0ebf8166a141e0d8798cfee1ac6b3b9365511/source/row_win.cc
[modify] 
http://crrev.com/8af0ebf8166a141e0d8798cfee1ac6b3b9365511/unit_test/planar_test.cc

Original comment by bugdroid1@chromium.org on 2 Dec 2015 at 10:21

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/fa2618ee267642719ec51add88d9d60233cf9bfe

commit fa2618ee267642719ec51add88d9d60233cf9bfe
Author: Frank Barchard <fbarchard@google.com>
Date: Fri Dec 04 19:19:41 2015

Port BlendPlaneRow_SSSE3 to GCC

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1490273006 .

[modify] 
http://crrev.com/fa2618ee267642719ec51add88d9d60233cf9bfe/include/libyuv/row.h
[modify] 
http://crrev.com/fa2618ee267642719ec51add88d9d60233cf9bfe/source/row_gcc.cc

Original comment by bugdroid1@chromium.org on 4 Dec 2015 at 7:20

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/bea690b3e03d24f77fea45c9a8592ea480a4acd8

commit bea690b3e03d24f77fea45c9a8592ea480a4acd8
Author: Frank Barchard <fbarchard@google.com>
Date: Sun Dec 06 06:23:29 2015

AVX2 YUV alpha blender and improved unittests

The AVX2 version can process 16 pixels at a time for improved memory bandwidth and fewer instructions.

Unit tests improved to test unaligned memory, and to test exactness when alpha is 0 or 255.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1505433002 .

[modify] 
http://crrev.com/bea690b3e03d24f77fea45c9a8592ea480a4acd8/README.chromium
[modify] 
http://crrev.com/bea690b3e03d24f77fea45c9a8592ea480a4acd8/include/libyuv/planar_functions.h
[modify] 
http://crrev.com/bea690b3e03d24f77fea45c9a8592ea480a4acd8/include/libyuv/row.h
[modify] 
http://crrev.com/bea690b3e03d24f77fea45c9a8592ea480a4acd8/include/libyuv/version.h
[modify] 
http://crrev.com/bea690b3e03d24f77fea45c9a8592ea480a4acd8/source/planar_functions.cc
[modify] 
http://crrev.com/bea690b3e03d24f77fea45c9a8592ea480a4acd8/source/row_gcc.cc
[modify] 
http://crrev.com/bea690b3e03d24f77fea45c9a8592ea480a4acd8/source/row_win.cc
[modify] 
http://crrev.com/bea690b3e03d24f77fea45c9a8592ea480a4acd8/unit_test/planar_test.cc

Original comment by bugdroid1@chromium.org on 6 Dec 2015 at 6:23

GoogleCodeExporter commented 8 years ago
Support for images with odd height is needed. As written, the code will read past the end of the alpha plane.
source/planar_functions.cc:725: for (y = 0; y < height; ++y) {

For example, if the height is 5, then the chroma channel has height 3.

The first iteration will use alpha rows 0 + 1,
the next iteration will use alpha rows 2 + 3,
the final iteration will use alpha rows 4 + 5, but the alpha channel doesn't have a row 5.

Original comment by fbarch...@chromium.org on 6 Dec 2015 at 8:51
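
One way to avoid the out-of-bounds read described above is to clamp the second alpha row (and column) to the last valid one when the dimension is odd. The sketch below illustrates that idea only; `subsample_alpha_2x2_sketch` is a hypothetical helper, and the fix that actually landed may be structured differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a libyuv API): box-filter a full-resolution alpha
 * plane down by 2x2 so it matches I420 chroma dimensions. For odd widths or
 * heights the last column/row is reused instead of reading past the plane. */
static void subsample_alpha_2x2_sketch(const uint8_t* alpha, int alpha_stride,
                                       int width, int height,
                                       uint8_t* dst, int dst_stride) {
  assert(width > 0 && height > 0);
  int half_width = (width + 1) / 2;   /* e.g. height 5 -> 3 chroma rows */
  int half_height = (height + 1) / 2;
  for (int y = 0; y < half_height; ++y) {
    int y0 = 2 * y;
    int y1 = (y0 + 1 < height) ? y0 + 1 : y0; /* clamp on odd height */
    const uint8_t* row0 = alpha + y0 * alpha_stride;
    const uint8_t* row1 = alpha + y1 * alpha_stride;
    for (int x = 0; x < half_width; ++x) {
      int x0 = 2 * x;
      int x1 = (x0 + 1 < width) ? x0 + 1 : x0; /* clamp on odd width */
      dst[y * dst_stride + x] =
          (uint8_t)((row0[x0] + row0[x1] + row1[x0] + row1[x1] + 2) >> 2);
    }
  }
}
```

With a height of 5, the final iteration averages alpha row 4 with itself rather than touching a nonexistent row 5.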

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/b0b22f88b9f1557dd0a82260ce70748c65e54ad1

commit b0b22f88b9f1557dd0a82260ce70748c65e54ad1
Author: Frank Barchard <fbarchard@google.com>
Date: Mon Dec 07 20:02:45 2015

Unroll C version of YUV blender for improved performance.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1502343003 .

[modify] 
http://crrev.com/b0b22f88b9f1557dd0a82260ce70748c65e54ad1/source/row_common.cc

Original comment by bugdroid1@chromium.org on 7 Dec 2015 at 8:03

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/2657688e701709a5af935e6ea27f4f8967208f2d

commit 2657688e701709a5af935e6ea27f4f8967208f2d
Author: Frank Barchard <fbarchard@google.com>
Date: Mon Dec 07 20:03:20 2015

Add support for odd height YUVA alpha blending.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1507683003 .

[modify] 
http://crrev.com/2657688e701709a5af935e6ea27f4f8967208f2d/README.chromium
[modify] 
http://crrev.com/2657688e701709a5af935e6ea27f4f8967208f2d/include/libyuv/version.h
[modify] 
http://crrev.com/2657688e701709a5af935e6ea27f4f8967208f2d/source/planar_functions.cc
[modify] 
http://crrev.com/2657688e701709a5af935e6ea27f4f8967208f2d/unit_test/planar_test.cc

Original comment by bugdroid1@chromium.org on 7 Dec 2015 at 8:03

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/dee77a4ebeaebc781cb3acd80aa6627fd1c7c825

commit dee77a4ebeaebc781cb3acd80aa6627fd1c7c825
Author: Frank Barchard <fbarchard@google.com>
Date: Wed Dec 09 02:20:30 2015

Optimize yuv alpha blend AVX2 code to do 32 pixels at a time.

out/Release/libyuv_unittest --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=9999 --libyuv_flags=-1 --gtest_filter=*I420Blend_Opt

Was LibYUVPlanarTest.I420Blend_Opt (2335 ms)
Now LibYUVPlanarTest.I420Blend_Opt (1937 ms)

vs SSSE3
LibYUVPlanarTest.I420Blend_Opt (2599 ms)

BUG=libyuv:527
R=dhrosa@google.com

Review URL: https://codereview.chromium.org/1505673003 .

[modify] 
http://crrev.com/dee77a4ebeaebc781cb3acd80aa6627fd1c7c825/source/cpu_id.cc
[modify] 
http://crrev.com/dee77a4ebeaebc781cb3acd80aa6627fd1c7c825/source/planar_functions.cc
[modify] 
http://crrev.com/dee77a4ebeaebc781cb3acd80aa6627fd1c7c825/source/row_gcc.cc
[modify] 
http://crrev.com/dee77a4ebeaebc781cb3acd80aa6627fd1c7c825/source/row_win.cc

Original comment by bugdroid1@chromium.org on 9 Dec 2015 at 2:21

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/a2ea90567998b1ab93ce7fe3acc25922862e4c9c

commit a2ea90567998b1ab93ce7fe3acc25922862e4c9c
Author: Frank Barchard <fbarchard@google.com>
Date: Wed Dec 09 02:59:48 2015

BlendPlane any width.

Benchmark
out\release\libyuv_unittest --libyuv_width=1279 --libyuv_height=719 --libyuv_repeat=999 --libyuv_flags=-1 --gtest_filter=*Blend* | sortms

Was
I420Blend_Any (2321 ms)
I420Blend_Unaligned (1684 ms)
I420Blend_Opt (1675 ms)
I420Blend_Invert (1653 ms)
BlendPlane_Invert (1556 ms)
BlendPlane_Any (1552 ms)
BlendPlane_Unaligned (1548 ms)
BlendPlane_Opt (1535 ms)
ARGBBlend_Unaligned (659 ms)
ARGBBlend_Any (596 ms)
ARGBBlend_Invert (591 ms)
ARGBBlend_Opt (508 ms)
BlendPlaneRow_Unaligned (186 ms)
BlendPlaneRow_Opt (171 ms)

Now
ARGBBlend_Any (621 ms)
ARGBBlend_Unaligned (585 ms)
ARGBBlend_Invert (564 ms)
ARGBBlend_Opt (512 ms)
I420Blend_Unaligned (347 ms)
I420Blend_Invert (345 ms)
I420Blend_Any (337 ms)
I420Blend_Opt (327 ms)
BlendPlane_Unaligned (187 ms)
BlendPlaneRow_Unaligned (187 ms)
BlendPlane_Invert (186 ms)
BlendPlane_Any (186 ms)
BlendPlaneRow_Opt (173 ms)
BlendPlane_Opt (171 ms)

which is comparable to the aligned case
out\release\libyuv_unittest --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --gtest_filter=*Blend* | sortms
ARGBBlend_Any (625 ms)
ARGBBlend_Unaligned (602 ms)
ARGBBlend_Invert (508 ms)
ARGBBlend_Opt (506 ms)
I420Blend_Any (353 ms)
I420Blend_Unaligned (322 ms)
I420Blend_Invert (304 ms)
I420Blend_Opt (301 ms)
BlendPlaneRow_Unaligned (188 ms)
BlendPlane_Unaligned (186 ms)
BlendPlane_Invert (185 ms)
BlendPlane_Any (184 ms)
BlendPlaneRow_Opt (173 ms)
BlendPlane_Opt (169 ms)

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1513443002 .

[modify] 
http://crrev.com/a2ea90567998b1ab93ce7fe3acc25922862e4c9c/README.chromium
[modify] 
http://crrev.com/a2ea90567998b1ab93ce7fe3acc25922862e4c9c/include/libyuv/version.h
[modify] 
http://crrev.com/a2ea90567998b1ab93ce7fe3acc25922862e4c9c/source/planar_functions.cc
[modify] 
http://crrev.com/a2ea90567998b1ab93ce7fe3acc25922862e4c9c/source/row_any.cc

Original comment by bugdroid1@chromium.org on 9 Dec 2015 at 3:00
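
The "any width" commit above touches source/row_any.cc. A common way to let a fixed-block kernel handle arbitrary widths is to run it on the largest multiple of the block size, then copy the leftover pixels into a padded temporary buffer and run the kernel once more. The sketch below illustrates that general pattern only; the names `blend_row_block_sketch`, `blend_row_any_sketch`, and `kBlock`, and the blend rounding, are assumptions and not the contents of the commit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { kBlock = 32 }; /* assumed block size of the stand-in kernel */

/* Stand-in for a SIMD kernel that only accepts multiples of kBlock pixels. */
static void blend_row_block_sketch(const uint8_t* fg, const uint8_t* bg,
                                   const uint8_t* alpha, uint8_t* dst,
                                   int width) {
  assert(width % kBlock == 0);
  for (int x = 0; x < width; ++x) {
    unsigned a = alpha[x];
    dst[x] = (uint8_t)((fg[x] * a + bg[x] * (255u - a) + 255u) >> 8);
  }
}

/* Hypothetical wrapper: accepts any width by padding the tail. */
static void blend_row_any_sketch(const uint8_t* fg, const uint8_t* bg,
                                 const uint8_t* alpha, uint8_t* dst,
                                 int width) {
  int n = width - (width % kBlock); /* pixels handled directly */
  int r = width - n;                /* leftover pixels, 0..kBlock-1 */
  if (n > 0) {
    blend_row_block_sketch(fg, bg, alpha, dst, n);
  }
  if (r > 0) {
    uint8_t tmp[4][kBlock]; /* fg, bg, alpha, dst */
    memset(tmp, 0, sizeof(tmp));
    memcpy(tmp[0], fg + n, (size_t)r);
    memcpy(tmp[1], bg + n, (size_t)r);
    memcpy(tmp[2], alpha + n, (size_t)r);
    blend_row_block_sketch(tmp[0], tmp[1], tmp[2], tmp[3], kBlock);
    memcpy(dst + n, tmp[3], (size_t)r); /* write back only valid pixels */
  }
}
```

Because most of the row still goes through the block kernel, a wrapper of this shape would be consistent with the odd-width (1279x719) timings above being close to the aligned (1280x720) ones.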

GoogleCodeExporter commented 8 years ago
Blend fails on the GCC version.

[==========] 986 tests from 6 test cases ran. (54351 ms total)
[  PASSED  ] 976 tests.
[  FAILED  ] 10 tests, listed below:
[  FAILED  ] LibYUVPlanarTest.BlendPlaneRow_Opt
[  FAILED  ] LibYUVPlanarTest.BlendPlaneRow_Unaligned
[  FAILED  ] LibYUVPlanarTest.BlendPlane_Opt
[  FAILED  ] LibYUVPlanarTest.BlendPlane_Unaligned
[  FAILED  ] LibYUVPlanarTest.BlendPlane_Any
[  FAILED  ] LibYUVPlanarTest.BlendPlane_Invert
[  FAILED  ] LibYUVPlanarTest.I420Blend_Opt
[  FAILED  ] LibYUVPlanarTest.I420Blend_Unaligned
[  FAILED  ] LibYUVPlanarTest.I420Blend_Any
[  FAILED  ] LibYUVPlanarTest.I420Blend_Invert

10 FAILED TESTS
  YOU HAVE 74 DISABLED TESTS

Only the AVX2 version fails.

Original comment by fbarch...@chromium.org on 9 Dec 2015 at 6:35

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/cb44936403fbc72e200ad966ae8e087f30dd535d

commit cb44936403fbc72e200ad966ae8e087f30dd535d
Author: Frank Barchard <fbarchard@google.com>
Date: Wed Dec 09 18:38:46 2015

Fix typo in AVX2 GCC blend.

The 32 pixel version was using the wrong register.

R=harryjin@google.com, dhrosa@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1511433006 .

[modify] 
http://crrev.com/cb44936403fbc72e200ad966ae8e087f30dd535d/source/row_gcc.cc

Original comment by bugdroid1@chromium.org on 9 Dec 2015 at 6:39

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/ae55e418517651548638b27be31d1b2abaed22bb

commit ae55e418517651548638b27be31d1b2abaed22bb
Author: Frank Barchard <fbarchard@google.com>
Date: Tue Dec 15 01:25:36 2015

use rounding in scaledown by 2

When scaling down by 2 the formula should round consistently.
(a+b+c+d+2)/4
The C version did but the SSE2 version was doing 2 averages.
avg(avg(a,b),avg(c,d))
This change uses a sum, then rounds.

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:447,libyuv:527

Review URL: https://codereview.chromium.org/1513183004 .

[modify] 
http://crrev.com/ae55e418517651548638b27be31d1b2abaed22bb/README.chromium
[modify] 
http://crrev.com/ae55e418517651548638b27be31d1b2abaed22bb/include/libyuv/scale_row.h
[modify] 
http://crrev.com/ae55e418517651548638b27be31d1b2abaed22bb/include/libyuv/version.h
[modify] 
http://crrev.com/ae55e418517651548638b27be31d1b2abaed22bb/source/planar_functions.cc
[modify] 
http://crrev.com/ae55e418517651548638b27be31d1b2abaed22bb/source/scale.cc
[modify] 
http://crrev.com/ae55e418517651548638b27be31d1b2abaed22bb/source/scale_any.cc
[modify] 
http://crrev.com/ae55e418517651548638b27be31d1b2abaed22bb/source/scale_gcc.cc
[modify] 
http://crrev.com/ae55e418517651548638b27be31d1b2abaed22bb/source/scale_win.cc
[modify] 
http://crrev.com/ae55e418517651548638b27be31d1b2abaed22bb/unit_test/planar_test.cc

Original comment by bugdroid1@chromium.org on 15 Dec 2015 at 1:26
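
The rounding difference described in the commit message above can be reproduced in plain C. This sketch assumes each SIMD average rounds up, i.e. computes (x + y + 1) >> 1, which is how the SSE2 `pavgb` instruction behaves; the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rounding average of two bytes, modelling pavgb: (x + y + 1) >> 1. */
static uint8_t avg_round(uint8_t x, uint8_t y) {
  return (uint8_t)((x + y + 1) >> 1);
}

/* Two-stage form from the commit message: avg(avg(a,b), avg(c,d)).
 * Each stage rounds up, so the error can accumulate. */
static uint8_t box_two_averages(uint8_t a, uint8_t b, uint8_t c, uint8_t d) {
  return avg_round(avg_round(a, b), avg_round(c, d));
}

/* Single-rounding form from the commit message: (a + b + c + d + 2) / 4. */
static uint8_t box_sum_round(uint8_t a, uint8_t b, uint8_t c, uint8_t d) {
  return (uint8_t)((a + b + c + d + 2) >> 2);
}

/* Worked example: pixels 0, 1, 0, 0 have a true mean of 0.25. */
static void rounding_demo(void) {
  assert(box_sum_round(0, 1, 0, 0) == 0);    /* (1 + 2) >> 2 == 0 */
  assert(box_two_averages(0, 1, 0, 0) == 1); /* avg(1, 0) == 1 */
}
```

The two forms disagree on this input, which is the kind of mismatch between C and SIMD paths that summing first and rounding once avoids.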

GoogleCodeExporter commented 8 years ago
The following revision refers to this bug:
  https://chromium.googlesource.com/libyuv/libyuv.git/+/70445ef2efb4365928ae13e6776b229379517c54

commit 70445ef2efb4365928ae13e6776b229379517c54
Author: Frank Barchard <fbarchard@google.com>
Date: Tue Dec 15 18:59:20 2015

avx2 scale down by 2 for gcc

R=dhrosa@google.com, harryjin@google.com
BUG=libyuv:527

Review URL: https://codereview.chromium.org/1520423003 .

[modify] 
http://crrev.com/70445ef2efb4365928ae13e6776b229379517c54/README.chromium
[modify] 
http://crrev.com/70445ef2efb4365928ae13e6776b229379517c54/include/libyuv/scale_row.h
[modify] 
http://crrev.com/70445ef2efb4365928ae13e6776b229379517c54/include/libyuv/version.h
[modify] 
http://crrev.com/70445ef2efb4365928ae13e6776b229379517c54/source/scale_gcc.cc
[modify] 
http://crrev.com/70445ef2efb4365928ae13e6776b229379517c54/unit_test/scale_test.cc

Original comment by bugdroid1@chromium.org on 15 Dec 2015 at 6:59

GoogleCodeExporter commented 8 years ago
r1555 ports the AVX2 scaler to GCC. I420Blend is complete.

Follow-up:
Port to Neon.
Make other scalers round consistently.
Combine scale UV and blendplane into a one-step blendUV.

LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=999 LIBYUV_FLAGS=-1 out/Release/libyuv_unittest --gtest_filter=*Blend* | sed 's/\(.*(\)\([0-9]*\)\( ms)\)/\2 - \1\2\3/g' | sort -rn | grep ms
517 - [       OK ] LibYUVPlanarTest.ARGBBlend_Unaligned (517 ms)
509 - [       OK ] LibYUVPlanarTest.ARGBBlend_Any (509 ms)
472 - [       OK ] LibYUVPlanarTest.ARGBBlend_Opt (472 ms)
453 - [       OK ] LibYUVPlanarTest.ARGBBlend_Invert (453 ms)
215 - [       OK ] LibYUVPlanarTest.I420Blend_Any (215 ms)
196 - [       OK ] LibYUVPlanarTest.I420Blend_Unaligned (196 ms)
186 - [       OK ] LibYUVPlanarTest.I420Blend_Invert (186 ms)
184 - [       OK ] LibYUVPlanarTest.I420Blend_Opt (184 ms)
132 - [       OK ] LibYUVPlanarTest.BlendPlaneRow_Unaligned (132 ms)
131 - [       OK ] LibYUVPlanarTest.BlendPlane_Unaligned (131 ms)
131 - [       OK ] LibYUVPlanarTest.BlendPlane_Invert (131 ms)
123 - [       OK ] LibYUVPlanarTest.BlendPlaneRow_Opt (123 ms)
121 - [       OK ] LibYUVPlanarTest.BlendPlane_Any (121 ms)
119 - [       OK ] LibYUVPlanarTest.BlendPlane_Opt (119 ms)

I420Blend_Opt (184 ms) is about 2.5x faster than ARGBBlend_Opt (472 ms).

Original comment by fbarch...@chromium.org on 15 Dec 2015 at 7:57