nanguantong / libyuv

Automatically exported from code.google.com/p/libyuv
BSD 3-Clause "New" or "Revised" License

Add 4:2:0 YUVA to ARGB & ABGR conversion for SkCanvasVideoRenderer. #473

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
For this function:
https://code.google.com/p/chromium/codesearch#chromium/src/media/blink/skcanvas_video_renderer.cc&l=581

Could just do YV12 -> ARGB/ABGR and copy and attenuate alpha.

Original issue reported on code.google.com by dalecurtis@chromium.org on 24 Jul 2015 at 9:47

GoogleCodeExporter commented 9 years ago
There's an SkCanvasVideoRenderer unittest in media_blink_unittests

Original comment by dalecurtis@chromium.org on 24 Jul 2015 at 10:00

GoogleCodeExporter commented 9 years ago
There are 3 ways to implement this:

1. Caller can do 2 steps: I420ToARGB and ARGBCopyYToAlpha   

ARGBCopyYToAlpha takes a plane of data and copies it to the 4th byte of each ARGB pixel.
See planar_functions.h:
LIBYUV_API
int ARGBCopyYToAlpha(const uint8* src_y, int src_stride_y,
                     uint8* dst_argb, int dst_stride_argb,
                     int width, int height);

Advantages:
a. allows any form of I420 conversion, including I422, J420 and ABGR destination.
b. works with existing libyuv.
c. both functions are highly optimized including AVX2.
Disadvantage:
a. less cache/memory friendly for large images where the ARGB destination doesn't
fit in cache.
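
As a rough illustration of the second step in option 1, here is a scalar sketch of copying a full-resolution plane into the alpha byte of an ARGB image. `CopyPlaneToAlpha_Sketch` is a hypothetical name, not libyuv code. It assumes 4 bytes per pixel with alpha in the 4th byte, as described above. The real ARGBCopyYToAlpha is SIMD-optimized and may differ in details.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for the alpha-copy step; not libyuv's code.
 * Assumes 4 bytes per pixel with alpha stored in the 4th byte (index 3). */
static void CopyPlaneToAlpha_Sketch(const uint8_t* src_a, int src_stride_a,
                                    uint8_t* dst_argb, int dst_stride_argb,
                                    int width, int height) {
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      dst_argb[x * 4 + 3] = src_a[x]; /* overwrite only the alpha byte */
    }
    src_a += src_stride_a;       /* next row of the alpha plane */
    dst_argb += dst_stride_argb; /* next row of the ARGB image */
  }
}
```

In the two-step flow, the caller would run this over the whole image after the color conversion has finished, which is why the destination is touched twice.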

2. Implement A420ToARGB (I420 with alpha), internally doing 2 steps per row.
Advantages:
a. faster - using a row buffer for intermediate ARGB is cache friendly.
b. abstracts implementation, which can be improved in future.
Disadvantage:
a. implements a specific color space.  less flexible.
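
The per-row structure of option 2 could look roughly like the sketch below. All names are hypothetical. `GrayRowToARGB_Stub` is only a placeholder for the real YUV-to-RGB row conversion: it ignores chroma and writes Y as gray, so the example stays self-contained. The point is the shape of the loop: each row is converted into a small buffer, then merged with alpha while it is still hot in cache.

```c
#include <stdint.h>
#include <stdlib.h>

/* Placeholder for the real YUV-to-RGB row conversion (assumption): writes Y
 * as gray with alpha = 255. A real converter would also read the U and V rows. */
static void GrayRowToARGB_Stub(const uint8_t* src_y, uint8_t* dst_argb,
                               int width) {
  for (int x = 0; x < width; ++x) {
    dst_argb[x * 4 + 0] = src_y[x];
    dst_argb[x * 4 + 1] = src_y[x];
    dst_argb[x * 4 + 2] = src_y[x];
    dst_argb[x * 4 + 3] = 255;
  }
}

/* Hypothetical row-buffer pipeline. Returns 0 on success, -1 on bad
 * arguments or allocation failure. */
static int YAlphaToARGB_RowBuffer_Sketch(const uint8_t* src_y, int src_stride_y,
                                         const uint8_t* src_a, int src_stride_a,
                                         uint8_t* dst_argb, int dst_stride_argb,
                                         int width, int height) {
  if (!src_y || !src_a || !dst_argb || width <= 0 || height <= 0) return -1;
  uint8_t* row = (uint8_t*)malloc((size_t)width * 4);
  if (!row) return -1;
  for (int y = 0; y < height; ++y) {
    GrayRowToARGB_Stub(src_y, row, width); /* step 1: convert into row buffer */
    for (int x = 0; x < width; ++x) {      /* step 2: merge alpha into output */
      dst_argb[x * 4 + 0] = row[x * 4 + 0];
      dst_argb[x * 4 + 1] = row[x * 4 + 1];
      dst_argb[x * 4 + 2] = row[x * 4 + 2];
      dst_argb[x * 4 + 3] = src_a[x];
    }
    src_y += src_stride_y;
    src_a += src_stride_a;
    dst_argb += dst_stride_argb;
  }
  free(row);
  return 0;
}
```

The intermediate buffer is one row wide regardless of image height, so its footprint stays small even when the full ARGB destination does not fit in cache.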

3. Implement optimized A420ToARGB.
Internally the I420ToARGB is done with 3 macros to implement the I420 fetch, 
YUV conversion, and ARGB storing.  The ARGB storing fills in 255.  This macro 
could implement a variation that fetches alpha from another pointer.
Advantage: fastest
Disadvantage: most complex, least flexible.
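
To make the idea in option 3 concrete, the sketch below shows two row functions that share one store macro and differ only in where alpha comes from. The macro and function names are hypothetical and are not the actual libyuv macros. The color conversion is again reduced to a gray stand-in, and the B, G, R, A byte order is an assumption consistent with alpha being the 4th byte.

```c
#include <stdint.h>

/* Hypothetical store step: writes one pixel as B, G, R, A bytes. */
#define STORE_ARGB_SKETCH(dst, b, g, r, a) \
  do {                                     \
    (dst)[0] = (uint8_t)(b);               \
    (dst)[1] = (uint8_t)(g);               \
    (dst)[2] = (uint8_t)(r);               \
    (dst)[3] = (uint8_t)(a);               \
  } while (0)

/* Opaque variant: the store step fills in a constant 255. */
static void GrayRowToARGB_Opaque_Sketch(const uint8_t* src_y, uint8_t* dst_argb,
                                        int width) {
  for (int x = 0; x < width; ++x) {
    uint8_t v = src_y[x]; /* stand-in for the fetch and YUV conversion */
    STORE_ARGB_SKETCH(dst_argb + x * 4, v, v, v, 255);
  }
}

/* Alpha variant: same fetch and conversion, alpha read from another pointer. */
static void GrayRowToARGB_Alpha_Sketch(const uint8_t* src_y,
                                       const uint8_t* src_a, uint8_t* dst_argb,
                                       int width) {
  for (int x = 0; x < width; ++x) {
    uint8_t v = src_y[x];
    STORE_ARGB_SKETCH(dst_argb + x * 4, v, v, v, src_a[x]);
  }
}
```

Each output pixel is written once with its final alpha, so no second pass over the destination is needed.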

Original comment by fbarch...@chromium.org on 11 Aug 2015 at 6:03

GoogleCodeExporter commented 9 years ago
I420AlphaToARGB implemented in r1466

I420AlphaToARGB_Any (1373 ms)
I420AlphaToARGB_Unaligned (1625 ms)
I420AlphaToARGB_Invert (1303 ms)
I420AlphaToARGB_Opt (1302 ms)

I420ToARGB_Any (660 ms)
I420ToARGB_Unaligned (637 ms)
I420ToARGB_Invert (615 ms)
I420ToARGB_Opt (542 ms)

Original comment by fbarch...@chromium.org on 18 Aug 2015 at 6:25

GoogleCodeExporter commented 9 years ago
change is integrated into chrome.

a followup improvement would be I420AlphaToABGR and some performance 
improvements.

Original comment by fbarch...@chromium.org on 20 Aug 2015 at 11:32

GoogleCodeExporter commented 9 years ago
This is the CL that switches to libyuv::I420AlphaToARGB
https://codereview.chromium.org/1293293003/

Original comment by fbarch...@chromium.org on 21 Aug 2015 at 12:49

GoogleCodeExporter commented 9 years ago
Starting ABGR version.

Original comment by fbarch...@chromium.org on 21 Aug 2015 at 1:07

GoogleCodeExporter commented 9 years ago
Performance went from 2 steps (for android)
SkCanvasVideoRendererTest.TransparentFrame (1610 ms)
to 1 step:
SkCanvasVideoRendererTest.TransparentFrame (1256 ms)

Original comment by fbarch...@chromium.org on 27 Aug 2015 at 1:25

GoogleCodeExporter commented 9 years ago
ABGR integrated into skcanvas.

consider merging ARGB and ABGR functions into a single function and exposing 
other color spaces - BGRA, J420 etc.

consider removing premultiplication and have renderer do unattenuated alpha 
blend.

consider implementing I420AlphaToARGB internally as a single assembly function,
which fills in alpha as it goes, instead of filling in 255 and then doing a 2nd
step to copy, and/or a third step to attenuate RGB by alpha.
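
For reference, the attenuation (premultiplication) step mentioned above amounts to scaling each color channel by alpha. The sketch below uses hypothetical names and one common rounding formula, `(c * a + 127) / 255`. This is an assumption for illustration, and libyuv's optimized attenuate may round differently.

```c
#include <stdint.h>

/* Scale one color channel by alpha, rounding to nearest.
 * Assumed formula; not necessarily what libyuv uses. */
static uint8_t Attenuate_Sketch(uint8_t c, uint8_t a) {
  return (uint8_t)((c * a + 127) / 255);
}

/* Premultiply one row of 4-byte pixels in place; alpha is the 4th byte. */
static void AttenuateRow_Sketch(uint8_t* argb, int width) {
  for (int x = 0; x < width; ++x) {
    uint8_t a = argb[x * 4 + 3];
    argb[x * 4 + 0] = Attenuate_Sketch(argb[x * 4 + 0], a);
    argb[x * 4 + 1] = Attenuate_Sketch(argb[x * 4 + 1], a);
    argb[x * 4 + 2] = Attenuate_Sketch(argb[x * 4 + 2], a);
    /* the alpha byte itself is left unchanged */
  }
}
```

Folding this into the store step of a single-pass conversion would remove the extra pass over the destination described above.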

Original comment by fbarch...@chromium.org on 27 Aug 2015 at 1:40