[PATCH] NEON optimised yuv to rgb conversion

GoogleCodeExporter commented 9 years ago

The attached patch adds NEON optimised yuv to rgb conversion along the lines of 
the SSE chroma upsampling. Total speedup is ~35%.

Original issue reported on code.google.com by mrullgard@gmail.com on 17 Jan 2013 at 9:30

Attachments:

0001-NEON-yuv-to-rgb-conversion.patch

GoogleCodeExporter commented 9 years ago

Great, thanks Mans!

Attached is a slightly modified version of the patch
(added some declarations in dsp.h, updated makefile.unix).

Original comment by pascal.m...@gmail.com on 17 Jan 2013 at 10:55

Added labels: Type-Review
Removed labels: Type-Defect

Attachments:

0002-NEON-yuv-to-rgb-conversion.patch

GoogleCodeExporter commented 9 years ago

Thanks for fixing that.

Original comment by mrullgard@gmail.com on 18 Jan 2013 at 1:16

GoogleCodeExporter commented 9 years ago

https://gerrit.chromium.org/gerrit/41610

I'm currently getting different -ppm output with -noasm than without it. Will 
look into it tomorrow.

James, you can assign this to me.

Original comment by johannko...@google.com on 18 Jan 2013 at 2:18

GoogleCodeExporter commented 9 years ago

Original comment by jz...@google.com on 18 Jan 2013 at 3:30

GoogleCodeExporter commented 9 years ago

Yes, there are a few off-by-ones in the colourspace conversion caused by 
rounding differences compared to the lookup tables the C code uses.  Neither is 
more correct than the other.

Original comment by mrullgard@gmail.com on 18 Jan 2013 at 12:13

GoogleCodeExporter commented 9 years ago

Are you referring to yuv2r() and the like?
If so, the off-by-one seems easily fixable.
It's important to have bitwise-identical output on all platforms, so we can 
track (real) bugs easily.
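
One way to check for bitwise-identical output is to run two converters over every 8-bit (Y, U, V) triple and count the disagreements. The sketch below is only an illustration: the `YuvToRgbFunc` signature, `compare_exhaustive()` and the two toy luma-only converters are hypothetical and are not taken from the patch or from libwebp. The toy pair differs only in rounding, which is the kind of difference discussed in this thread.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical signature: one (y, u, v) triple in, packed 0xRRGGBB out. */
typedef unsigned (*YuvToRgbFunc)(int y, int u, int v);

typedef struct {
  long mismatches;   /* number of triples where any channel differs */
  int max_abs_diff;  /* largest per-channel difference seen */
} DiffStats;

/* Absolute difference of one 8-bit channel of two packed pixels. */
static int channel_diff(unsigned a, unsigned b, int shift) {
  return abs((int)((a >> shift) & 0xff) - (int)((b >> shift) & 0xff));
}

/* Run both converters over every 8-bit (y, u, v) triple and compare. */
DiffStats compare_exhaustive(YuvToRgbFunc ref, YuvToRgbFunc opt) {
  DiffStats s = {0, 0};
  for (int y = 0; y < 256; ++y) {
    for (int u = 0; u < 256; ++u) {
      for (int v = 0; v < 256; ++v) {
        const unsigned a = ref(y, u, v);
        const unsigned b = opt(y, u, v);
        if (a == b) continue;
        ++s.mismatches;
        for (int shift = 0; shift <= 16; shift += 8) {
          const int d = channel_diff(a, b, shift);
          if (d > s.max_abs_diff) s.max_abs_diff = d;
        }
      }
    }
  }
  return s;
}

/* Toy luma-only converter: scales (y - 16) by ~1.164 in 16-bit fixed point.
 * 'bias' is 1 << 15 for round-to-nearest or 0 for truncation. */
static unsigned gray(int y, int bias) {
  assert(y >= 0 && y <= 255);
  const int k = 76284 * (y - 16) + bias;  /* 76284 ~= 1.164 * 65536 */
  const int g = (k < 0) ? 0 : (k >> 16 > 255) ? 255 : (k >> 16);
  return ((unsigned)g << 16) | ((unsigned)g << 8) | (unsigned)g;
}

/* Two toy converters that differ only in how they round. */
unsigned gray_rounded(int y, int u, int v) {
  (void)u; (void)v;
  return gray(y, 1 << 15);
}

unsigned gray_truncated(int y, int u, int v) {
  (void)u; (void)v;
  return gray(y, 0);
}
```

Calling `compare_exhaustive(gray_rounded, gray_truncated)` reports a nonzero mismatch count with a maximum per-channel difference of 1, while comparing a converter against itself reports no mismatches. A real check would pass the C and NEON code paths instead of the toy functions.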

Original comment by pascal.m...@gmail.com on 18 Jan 2013 at 12:20

GoogleCodeExporter commented 9 years ago

How are the constants used for the table calculations in yuv.c derived?

Original comment by mrullgard@gmail.com on 18 Jan 2013 at 1:11

GoogleCodeExporter commented 9 years ago

The constants correspond to the BT.601 conversion:

R = 1.164(Y - 16) + 1.596(V - 128)
G = 1.164(Y - 16) - 0.813(V - 128) - 0.391(U - 128)
B = 1.164(Y - 16)                   + 2.018(U - 128)

(http://www.fourcc.org/fccyvrgb.php)

But since we use an offset in the lookup table VP8kClip[] to take care of the 
common 1.164 * (Y - 16) term, the constants 1.596, -0.813, -0.391 and 2.018 are 
pre-divided by 1.164 before being turned into 16-bit fixed-point constants.
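
The pre-division can be sketched as follows. This is a minimal illustration, assuming 16 fractional bits and round-to-nearest; the names `FIX_BITS`, `to_fixed()`, `ChromaCoeffs`, `red_direct()` and `red_factored()` are made up for the example and are not the identifiers used in yuv.c, and the exact integer constants in the real code may differ by a unit depending on how they were rounded. It relies on the factoring R = 1.164 * ((Y - 16) + (1.596 / 1.164) * (V - 128)), so the 1.164 scale and the clamp can be applied once at the end.

```c
#include <assert.h>

#define FIX_BITS 16  /* assumed precision: 16 fractional bits */

/* Round a real coefficient to the nearest fixed-point integer. */
int to_fixed(double c) {
  const double scaled = c * (1 << FIX_BITS);
  return (int)(scaled + (scaled >= 0 ? 0.5 : -0.5));
}

typedef struct { int v_to_r, u_to_g, v_to_g, u_to_b; } ChromaCoeffs;

/* Chroma coefficients divided by the common 1.164 luma factor. */
ChromaCoeffs prescaled_coeffs(void) {
  const double luma = 1.164;
  ChromaCoeffs c;
  c.v_to_r = to_fixed( 1.596 / luma);
  c.u_to_g = to_fixed(-0.391 / luma);
  c.v_to_g = to_fixed(-0.813 / luma);
  c.u_to_b = to_fixed( 2.018 / luma);
  return c;
}

/* Round to nearest and clamp to [0, 255]. */
static int clip255(double x) {
  const int r = (int)(x + (x >= 0 ? 0.5 : -0.5));
  return r < 0 ? 0 : r > 255 ? 255 : r;
}

/* R evaluated directly from the formula quoted above. */
int red_direct(int y, int v) {
  assert(y >= 0 && y <= 255 && v >= 0 && v <= 255);
  return clip255(1.164 * (y - 16) + 1.596 * (v - 128));
}

/* R via the factored form: the pre-divided chroma term is added to the
 * luma index first, and the 1.164 scale and clamp are applied once at the
 * end (the role a clip table would play). */
int red_factored(int y, int v) {
  assert(y >= 0 && y <= 255 && v >= 0 && v <= 255);
  const ChromaCoeffs c = prescaled_coeffs();
  const double chroma = (double)c.v_to_r * (v - 128) / (1 << FIX_BITS);
  return clip255(1.164 * ((y - 16) + chroma));
}
```

The two evaluations agree up to the quantisation of the fixed-point coefficient, so they can differ by at most one on inputs that land next to a rounding boundary.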

Original comment by pascal.m...@gmail.com on 18 Jan 2013 at 1:45

GoogleCodeExporter commented 9 years ago

added some comments in the code here
  =>  https://gerrit.chromium.org/gerrit/41634

Original comment by pascal.m...@gmail.com on 18 Jan 2013 at 2:09

GoogleCodeExporter commented 9 years ago

Here's a version that matches the C code exactly. It's about 4% slower than the 
original patch.

Original comment by mrullgard@gmail.com on 19 Jan 2013 at 3:34

Attachments:

0001-NEON-yuv-to-rgb-conversion.patch

GoogleCodeExporter commented 9 years ago

Great! I uploaded a new patch to https://gerrit.chromium.org/gerrit/#/c/41610/
There were some leftovers in upsampling_neon.c (CY, CVR, ..., coef[], 
cf16, cf32, u16, u128) that I removed.

Maybe the x86 version could be made faster without tables, similarly to the ARM 
one.
This needs investigating later. For now, having bitwise-identical output on all 
platforms is preferable.

Original comment by pascal.m...@gmail.com on 21 Jan 2013 at 3:57

GoogleCodeExporter commented 9 years ago

Those are very much used. Your modified patch will not compile.

Original comment by mrullgard@gmail.com on 21 Jan 2013 at 4:18

GoogleCodeExporter commented 9 years ago

ah! my bad. Fixed. I'll let Johann try it...

Original comment by pascal.m...@gmail.com on 21 Jan 2013 at 4:37

GoogleCodeExporter commented 9 years ago

This was merged, thanks Mans.

1de3e25 Merge "NEON optimised yuv to rgb conversion"
090b708 NEON optimised yuv to rgb conversion

Original comment by jz...@google.com on 7 Feb 2013 at 8:19

Changed state: Fixed
