runt18 / libyuv

Automatically exported from code.google.com/p/libyuv
BSD 3-Clause "New" or "Revised" License

I420ToARGBMatrix - conversion with color matrix #488

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. refactor YUV to RGB conversions to take color matrix parameter
2. implement J420 and I420 using new function.  Y420

What is the expected output? 
All conversions that support bt.601 to rgb also support jpeg or bt.709 color space.

What do you see instead?
J420ToARGB is the only conversion that supports the jpeg color space.

Please use labels and text to provide additional information.
Reimplement low level YUVTORGB macro via register for row_win.cc
Port to row_gcc.cc
Port to Neon.
Move matrix to common code.
Reimplement C code via matrix and ensure exact same math.
Implement helper function to convert float matrix to SIMD color matrix.
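
The helper in the last item could look roughly like the sketch below. It assumes 6 fractional bits of precision and a made-up struct; `FixedYuvCoeffs`, `ToFixed` and `MakeFixedCoeffs` are hypothetical names, not libyuv API, and the real SIMD layout may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container, not libyuv's YuvConstants layout. */
typedef struct {
  int16_t yg; /* Y gain */
  int16_t ub; /* U contribution to B */
  int16_t ug; /* U contribution to G (subtracted) */
  int16_t vg; /* V contribution to G (subtracted) */
  int16_t vr; /* V contribution to R */
} FixedYuvCoeffs;

enum { kFractionBits = 6 }; /* assumed precision */

/* Scale a non-negative float coefficient by 2^6 and round to nearest. */
static int16_t ToFixed(double coeff) {
  double scaled = coeff * (1 << kFractionBits) + 0.5;
  assert(coeff >= 0.0 && scaled < 32767.0);
  return (int16_t)scaled; /* truncation after +0.5 rounds to nearest */
}

/* Convert the five float matrix entries to fixed point. */
static FixedYuvCoeffs MakeFixedCoeffs(double yg, double ub, double ug,
                                      double vg, double vr) {
  FixedYuvCoeffs c;
  c.yg = ToFixed(yg);
  c.ub = ToFixed(ub);
  c.ug = ToFixed(ug);
  c.vg = ToFixed(vg);
  c.vr = ToFixed(vr);
  return c;
}
```

With the bt.601 coefficients 1.164, 2.018, 0.391, 0.813, 1.596 this gives 74, 129, 25, 52, 102. Note that 129 does not fit in a signed 8-bit lane, so a SIMD path that stores coefficients as int8 would need to clamp or rescale.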

Original issue reported on code.google.com by fbarch...@chromium.org on 27 Aug 2015 at 7:22

GoogleCodeExporter commented 8 years ago

Original comment by fbarch...@chromium.org on 1 Sep 2015 at 10:45

GoogleCodeExporter commented 8 years ago
I420ToARGB implemented with matrix at low level.
Ideally all low levels accept a matrix, but arm may be hard, so the first step was a wrapper.

These are the current reference versions:

static void YUVToRGBReference(int y, int u, int v, int* r, int* g, int* b) {
  *r = RoundToByte((y - 16) * 1.164 - (v - 128) * -1.596);
  *g = RoundToByte((y - 16) * 1.164 - (u - 128) * 0.391 - (v - 128) * 0.813);
  *b = RoundToByte((y - 16) * 1.164 - (u - 128) * -2.018);
}

static void YUVJToRGBReference(int y, int u, int v, int* r, int* g, int* b) {
  *r = RoundToByte(y - (v - 128) * -1.40200);
  *g = RoundToByte(y - (u - 128) * 0.34414 - (v - 128) * 0.71414);
  *b = RoundToByte(y - (u - 128) * -1.77200);
}
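
The two reference functions differ only in their coefficients and the Y offset, so they can be folded into one function that takes the matrix as a parameter. A minimal sketch, assuming a made-up `YuvMatrixSketch` struct and a `RoundToByte` that rounds to nearest and clamps to 0..255 (the real helper is not shown in this thread):

```c
#include <assert.h>

/* Assumed behaviour: round to nearest, clamp to [0, 255]. */
static int RoundToByte(double value) {
  int i = (int)(value < 0 ? value - 0.5 : value + 0.5);
  if (i < 0) i = 0;
  if (i > 255) i = 255;
  return i;
}

/* Hypothetical float color matrix; names are for illustration only. */
typedef struct {
  double y_offset; /* subtracted from Y before scaling */
  double yg;       /* Y gain */
  double ub;       /* U contribution to B */
  double ug;       /* U contribution to G (subtracted) */
  double vg;       /* V contribution to G (subtracted) */
  double vr;       /* V contribution to R */
} YuvMatrixSketch;

/* Same coefficients as YUVToRGBReference and YUVJToRGBReference. */
static const YuvMatrixSketch kBt601Sketch = {16.0, 1.164, 2.018,
                                             0.391, 0.813, 1.596};
static const YuvMatrixSketch kJpegSketch = {0.0, 1.0, 1.772,
                                            0.34414, 0.71414, 1.402};

static void YuvToRgbWithMatrix(const YuvMatrixSketch* m, int y, int u, int v,
                               int* r, int* g, int* b) {
  assert(m && r && g && b);
  double y1 = (y - m->y_offset) * m->yg;
  *r = RoundToByte(y1 + (v - 128) * m->vr);
  *g = RoundToByte(y1 - (u - 128) * m->ug - (v - 128) * m->vg);
  *b = RoundToByte(y1 + (u - 128) * m->ub);
}
```

Passing `&kBt601Sketch` reproduces the bt.601 reference math and `&kJpegSketch` the jpeg one; a bt.709 matrix would be a third constant of the same shape.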

Original comment by fbarch...@chromium.org on 2 Sep 2015 at 10:10

GoogleCodeExporter commented 8 years ago
Todo list
C version of matrix code.
Neon code to use YuvConstants.  Different constants, or make YUV setup shuffle values?
H420ToARGB using matrix

Original comment by fbarch...@chromium.org on 2 Sep 2015 at 11:12

GoogleCodeExporter commented 8 years ago
r1478 J420ToABGR implemented via I420ToABGRMatrixRow_SSSE3
Todo: all YUV to RGB functions take YuvConstants struct.
-row functions take YuvConstants
-convert functions pass constants
-any functions pass constants
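
Roughly, the layering in this todo could be sketched in plain C as below: a row function that takes the constants, and a convert function that passes them down. `YuvConstantsSketch`, `I422ToARGBRowSketch` and `I420ToARGBMatrixSketch` are hypothetical names with an assumed 6-bit fixed-point layout, not the real libyuv struct or signatures. ARGB is assumed to mean bytes B, G, R, A in memory.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 6-bit fixed-point constants; not libyuv's real struct. */
typedef struct {
  int y_offset;           /* subtracted from Y before scaling */
  int yg, ub, ug, vg, vr; /* coefficients scaled by 64 */
} YuvConstantsSketch;

static const YuvConstantsSketch kBt601Fixed = {16, 74, 129, 25, 52, 102};
static const YuvConstantsSketch kJpegFixed = {0, 64, 113, 22, 46, 90};

static uint8_t Clamp255(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Row level: one U/V sample per two Y samples, output bytes B, G, R, A. */
static void I422ToARGBRowSketch(const uint8_t* src_y, const uint8_t* src_u,
                                const uint8_t* src_v, uint8_t* dst_argb,
                                const YuvConstantsSketch* c, int width) {
  assert(c && width >= 0);
  for (int x = 0; x < width; ++x) {
    /* +32 rounds the division by 64; negative sums clamp to 0 anyway. */
    int y1 = (src_y[x] - c->y_offset) * c->yg + 32;
    int u = src_u[x / 2] - 128;
    int v = src_v[x / 2] - 128;
    dst_argb[0] = Clamp255((y1 + u * c->ub) / 64);
    dst_argb[1] = Clamp255((y1 - u * c->ug - v * c->vg) / 64);
    dst_argb[2] = Clamp255((y1 + v * c->vr) / 64);
    dst_argb[3] = 255;
    dst_argb += 4;
  }
}

/* Convert level: walks the rows of an I420 image (chroma row shared by two
   luma rows) and passes the constants down to the row function. */
static int I420ToARGBMatrixSketch(const uint8_t* src_y, int stride_y,
                                  const uint8_t* src_u, int stride_u,
                                  const uint8_t* src_v, int stride_v,
                                  uint8_t* dst_argb, int stride_argb,
                                  const YuvConstantsSketch* c,
                                  int width, int height) {
  if (!src_y || !src_u || !src_v || !dst_argb || !c ||
      width <= 0 || height <= 0) {
    return -1;
  }
  for (int y = 0; y < height; ++y) {
    I422ToARGBRowSketch(src_y + y * stride_y, src_u + (y / 2) * stride_u,
                        src_v + (y / 2) * stride_v,
                        dst_argb + y * stride_argb, c, width);
  }
  return 0;
}
```

In this shape, I420ToARGB and J420ToARGB become thin wrappers that differ only in which constants they pass. The 6-bit Y gain is coarse: with `kBt601Fixed`, Y=235 comes out as 253 rather than 255, so a real implementation would likely want more precision for Y.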

Original comment by fbarch...@chromium.org on 3 Sep 2015 at 6:08

GoogleCodeExporter commented 8 years ago
roll attempt fails on chrome bots

[2/3] TabCaptureApiPixelTest.EndToEndWithoutRemoting (TIMED OUT)
Still waiting for the following processes to finish:
    "..\out\Release\browser_tests.exe" --allow-file-access --enable-gpu --gtest_also_run_disabled_tests --gtest_filter=TabCaptureApiPixelTest.EndToEndThroughWebRTC --single_process --test-launcher-bot-mode --test-launcher-jobs=1 --test-launcher-summary-output="c:\users\chrome~1\appdata\local\temp\isolated_out67p2wm\output.json" --user-data-dir="C:\Users\CHROME~1\AppData\Local\Temp\scoped_dir3396_6258\d3396_14993"
[ RUN      ] TabCaptureApiPixelTest.EndToEndThroughWebRTC
[2108:2844:0904/194659:WARNING:webrtcvoiceengine.cc(467)] Unexpected codec: ISAC/48000/1 (105)
[2108:2844:0904/194659:WARNING:webrtcvoiceengine.cc(467)] Unexpected codec: PCMU/8000/2 (110)
[2108:2844:0904/194659:WARNING:webrtcvoiceengine.cc(467)] Unexpected codec: PCMA/8000/2 (118)
[2108:2844:0904/194659:WARNING:webrtcvoiceengine.cc(467)] Unexpected codec: G722/8000/2 (119)
[960:3212:0904/194700:ERROR:gpu_video_decode_accelerator.cc(280)] HW video decode not available for profile 11
[2108:2844:0904/194700:WARNING:webrtcvoiceengine.cc(1294)] webrtc: (rtp_packet_history.cc:43): Purging packet history in order to re-set status.
[2108:2844:0904/194700:WARNING:webrtcvoiceengine.cc(2833)] SetOutputVolumePan(1, 1, 1) failed, err=8040
[2108:2844:0904/194700:WARNING:webrtcvoiceengine.cc(1294)] webrtc: (rtp_packet_history.cc:43): Purging packet history in order to re-set status.
[2108:2844:0904/194700:WARNING:webrtcvoiceengine.cc(1294)] webrtc: (rtp_packet_history.cc:43): Purging packet history in order to re-set status.
[2108:2844:0904/194700:WARNING:webrtcvoiceengine.cc(1294)] webrtc: (rtp_packet_history.cc:43): Purging packet history in order to re-set status.
[2108:2736:0904/194700:WARNING:webrtcsession.cc(1665)] Candidate has unknown component: Cand[2543930320:2:udp:2122260222:192.168.140.63:55996:local::0::] for content: audio

Backtrace:
    I422ToARGBMatrixRow_AVX2 [0x052EBBDA+202]
    I422ToARGBRow_AVX2 [0x052E93CC+28]
    I420ToARGB [0x052E5FB9+233]
    media::SkCanvasVideoRenderer::ConvertVideoFrameToRGBPixels [0x0501A5B2+514]
    media::VideoImageGenerator::onGetPixels [0x0501B4B1+17]
    SkImageGenerator::getPixels [0x023C70DD+125]
    SkDiscardablePixelRef::onNewLockPixels [0x02407EAC+268]
    SkPixelRef::lockPixelsInsideMutex [0x0233FE81+161]
    SkPixelRef::onRequestLock [0x0233FF4B+11]

Original comment by fbarch...@chromium.org on 8 Sep 2015 at 5:40

GoogleCodeExporter commented 8 years ago
H420 support has landed in chrome.

Original comment by fbarch...@chromium.org on 11 Sep 2015 at 12:48

GoogleCodeExporter commented 8 years ago
J422ToARGB ported to Neon 32 bit.  (but not 64 bit)

32 bit
GYP_DEFINES="OS=ios target_arch=armv7 target_subarch=both" GYP_CROSSCOMPILE=1 GYP_GENERATOR_FLAGS="output_dir=out_ios" ./gyp_libyuv -f ninja --depth=. libyuv_test.gyp
ninja -j7 -C out_ios/Debug-iphoneos libyuv_unittest
ninja -j7 -C out_ios/Release-iphoneos libyuv_unittest

64 bit
GYP_DEFINES="OS=ios target_arch=armv7 target_subarch=both" GYP_CROSSCOMPILE=1 GYP_GENERATOR_FLAGS="output_dir=out_ios" ./gyp_libyuv -f ninja --depth=. libyuv_test.gyp
ninja -j7 -C out_ios/Debug-iphoneos libyuv_unittest
ninja -j7 -C out_ios/Release-iphoneos libyuv_unittest

Original comment by phthor...@gmail.com on 15 Sep 2015 at 10:36

GoogleCodeExporter commented 8 years ago

Original comment by fbarch...@google.com on 17 Sep 2015 at 8:08

GoogleCodeExporter commented 8 years ago
instances of yuv conversion:
row_gcc.cc:#define YUVTORGB(YuvConstants)
row_gcc.cc:    YUVTORGB(YuvConstants)
row_gcc.cc:    YUVTORGB(YuvConstants)
row_gcc.cc:    YUVTORGB(kYuvConstants)
row_gcc.cc:    YUVTORGB(kYuvConstants)
row_gcc.cc:    YUVTORGB(YuvConstants)
row_gcc.cc:    YUVTORGB(kYuvConstants)
row_gcc.cc:    YUVTORGB(kYuvConstants)
row_gcc.cc:    YUVTORGB(kYuvConstants)
row_gcc.cc:    YUVTORGB(kYuvConstants)
row_gcc.cc:    YUVTORGB(kYuvConstants)
row_gcc.cc:    YUVTORGB(kYuvConstants)
row_gcc.cc:#define YUVTORGB_AVX2(YuvConstants) \
row_gcc.cc:    YUVTORGB_AVX2(kYuvConstants)
row_gcc.cc:    YUVTORGB_AVX2(kYuvConstants)
row_gcc.cc:    YUVTORGB_AVX2(kYuvConstants)
row_gcc.cc:    YUVTORGB_AVX2(kYuvConstants)
row_win.cc:#define YUVTORGB(YuvConstants) \
row_win.cc:    YUVTORGB(YuvConstants)
row_win.cc:    YUVTORGB(YuvConstants)
row_win.cc:#define YUVTORGB_AVX2(YuvConstants) __asm { \
row_win.cc:    YUVTORGB_AVX2(ebp)
row_win.cc:    YUVTORGB_AVX2(ebp)
row_win.cc:    YUVTORGB_AVX2(ebp)
row_win.cc:    YUVTORGB_AVX2(kYuvConstants)
row_win.cc:    YUVTORGB_AVX2(kYuvConstants)
row_win.cc:    YUVTORGB_AVX2(kYvuConstants)
row_win.cc:    YUVTORGB_AVX2(kYuvConstants)
row_win.cc:    YUVTORGB_AVX2(kYuvConstants)
row_win.cc:    YUVTORGB_AVX2(ebp)
row_win.cc:#define YUVTORGB(YuvConstants) __asm { \
row_win.cc:    YUVTORGB(ebp)
row_win.cc:    YUVTORGB(ebp)
row_win.cc:    YUVTORGB(kYuvConstants)
row_win.cc:    YUVTORGB(kYuvConstants)
row_win.cc:    YUVTORGB(kYuvConstants)
row_win.cc:    YUVTORGB(ebp)
row_win.cc:    YUVTORGB(kYuvConstants)
row_win.cc:    YUVTORGB(kYuvConstants)
row_win.cc:    YUVTORGB(kYvuConstants)
row_win.cc:    YUVTORGB(kYuvConstants)
row_win.cc:    YUVTORGB(ebp)
row_win.cc:    YUVTORGB(kYuvConstants)

and neon versions
row_neon.cc:#define YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:#define YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG
row_neon64.cc:    YUV422TORGB_SETUP_REG

Original comment by fbarch...@google.com on 21 Sep 2015 at 6:26

GoogleCodeExporter commented 8 years ago
All functions converted, with the following caveats:
I400 does not use matrix
neon64 and mips ignore yuvconstants

Original comment by fbarch...@google.com on 22 Sep 2015 at 10:51

GoogleCodeExporter commented 8 years ago
YUY2 implemented as direct asm with yuvconstants

Old profile
YUY2ToARGB_Opt (563 ms)
72.46%  libyuv_unittest  libyuv_unittest      [.] I422ToARGBMatrixRow_SSSE3
16.50%  libyuv_unittest  libyuv_unittest      [.] YUY2ToUV422Row_SSE2
7.02%  libyuv_unittest  libyuv_unittest      [.] YUY2ToYRow_SSE2

New profile
YUY2ToARGB_Opt (396 ms)
97.17%  libyuv_unittest  libyuv_unittest      [.] YUY2ToARGBRow_SSSE3
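
The old profile spends time in three row functions (unpack Y, unpack UV, then the I422 matrix row); the new one does everything in a single pass over the packed data. A plain C sketch of that single-pass idea, assuming YUY2 bytes are ordered Y0 U Y1 V and ARGB is stored as B, G, R, A. The struct and function names are hypothetical and the fixed-point layout is an assumption; the real version is asm.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 6-bit fixed-point constants; not libyuv's real struct. */
typedef struct {
  int y_offset;           /* subtracted from Y before scaling */
  int yg, ub, ug, vg, vr; /* coefficients scaled by 64 */
} YuvConstantsSketch;

static const YuvConstantsSketch kBt601Fixed = {16, 74, 129, 25, 52, 102};

static uint8_t Clamp255(int v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Convert one pixel; u and v are already centered on zero. */
static void StoreARGB(int y, int u, int v, const YuvConstantsSketch* c,
                      uint8_t* dst) {
  int y1 = (y - c->y_offset) * c->yg + 32; /* +32 rounds the /64 */
  dst[0] = Clamp255((y1 + u * c->ub) / 64);
  dst[1] = Clamp255((y1 - u * c->ug - v * c->vg) / 64);
  dst[2] = Clamp255((y1 + v * c->vr) / 64);
  dst[3] = 255;
}

/* Reads packed YUY2 and writes ARGB directly, with no intermediate
   Y/U/V row buffers. Width must be even in this sketch. */
static void YUY2ToARGBRowSketch(const uint8_t* src_yuy2, uint8_t* dst_argb,
                                const YuvConstantsSketch* c, int width) {
  assert(c && width >= 0 && width % 2 == 0);
  for (int x = 0; x < width; x += 2) {
    int u = src_yuy2[1] - 128; /* shared by both pixels of the pair */
    int v = src_yuy2[3] - 128;
    StoreARGB(src_yuy2[0], u, v, c, dst_argb);
    StoreARGB(src_yuy2[2], u, v, c, dst_argb + 4);
    src_yuy2 += 4;
    dst_argb += 8;
  }
}
```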

Original comment by fbarch...@google.com on 22 Sep 2015 at 10:52

GoogleCodeExporter commented 8 years ago
r1490
AVX2  YUY2ToARGB_Opt (280 ms)
SSSE3 YUY2ToARGB_Opt (397 ms)
C     YUY2ToARGB_Opt (4484 ms)

Original comment by fbarch...@google.com on 23 Sep 2015 at 6:29

GoogleCodeExporter commented 8 years ago
Closing this bug, as basic matrix functionality is done.
Followup needed for aarch64

Original comment by fbarch...@google.com on 25 Sep 2015 at 12:35

GoogleCodeExporter commented 8 years ago

Original comment by fbarch...@chromium.org on 17 Nov 2015 at 10:16