dilagurung / libyuv

Automatically exported from code.google.com/p/libyuv
BSD 3-Clause "New" or "Revised" License
0 stars 0 forks source link

Document requirements for ARGBBlur() #278

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Can you please add requirements to the ARGBBlur() documentation?

- Minimum possible width and height valuez regarding radius? (f.i. is w=h=1 and 
radius=18 possible? I does not look like it is currently)
- radius must be > 0, right?
- What about off sized images? It there any rounding convention or requirements 
for image and cum. sum. buffers?

Thank you in advance.

Original issue reported on code.google.com by alt...@gmail.com on 17 Oct 2013 at 4:04

GoogleCodeExporter commented 9 years ago
Sorry for the typos: 
What about off sized images? -> What about odd sized images?

Original comment by alt...@gmail.com on 17 Oct 2013 at 4:05

GoogleCodeExporter commented 9 years ago
Sure.  Also note that I plan to do some work on blur - mainly we want a planar 
version (e.g. yuv).
Going from memory, radius is number of pixels around the pixel used.  The box 
would be 1+(radius * 2) for width and height.
The function has never been ported to neon.
All functions should support any size.. odd is fine.
blur relates to box filter in resize, and I noticed box filter C code didnt 
handle box size of more than 256, but blur uses an int, so it'll overflow for 
16 million pixels.
blur box size of about 111 pixels is typical for a very blurred image.

Original comment by fbarch...@google.com on 17 Oct 2013 at 5:43

GoogleCodeExporter commented 9 years ago
I added some comments to the header.
As an alternative to planar blur, I used a box filter scale down and bilinear 
scale up to achieve my goal.
The scale up is not optimized for planar scaling, like the ARGBScale is, but so 
it will be interesting to do the upsampler, and still may make sense to do 
planar blur.
Next step for blur is to do a small blur specialization.  The current code uses 
floats, since the sum of many pixels can be large.  But if the sum is less than 
65536, fixed point could be used for better performance.
I'll close this bug (doc) if the function is well enough documented.

Original comment by fbarch...@google.com on 31 Oct 2013 at 9:29

GoogleCodeExporter commented 9 years ago
I don't see blur related changes in planar_functions.h
Am I looking at the right file?
(Current rev # is 838)

Original comment by alt...@gmail.com on 4 Nov 2013 at 8:37

GoogleCodeExporter commented 9 years ago
Added a bit more info to blur header documentation.
Just look at ARGBBlur(), and the unittest in planar_functions.
If its still unclear, let me know.
I've also been optimizing blur for the use case of small amounts of blur... 
radius 5 - 11x11 or less, is about 20% faster.

Original comment by fbarch...@google.com on 8 Nov 2013 at 3:30

GoogleCodeExporter commented 9 years ago
I think the documentation is OK now
Thank you for the doc update and the speedup for small blur sizes!

Original comment by alt...@gmail.com on 8 Nov 2013 at 12:34

GoogleCodeExporter commented 9 years ago
Thanks.  This is still an area of development.  As such, your feedback is 
appreciated.
In hangouts we use blur in a number of effects, and its still a bottleneck.

There are 3 optimizations in mind.
1. The accumulation buffer is a 32 bit int.  it could be float, and then when 
used, would need 4x fewer int to floats.
2. When the area is small, using 16 bit differences would help.
3. A planar version of blur

I'm leaning toward 2 then 3 then 1.

Original comment by fbarch...@google.com on 9 Nov 2013 at 8:05

GoogleCodeExporter commented 9 years ago
Here is some feedback from our use case.

We use the blur function to "hide" some parts of screenshots.
So we have screen sized images (e.g. 1280x1024 pixels).
We call ARGBComputeCumulativeSum() once for the whole image then call 
ARGBBlur() for each area (rectangles, of any size) we want to blur, with a big 
radius (> 15).
This is indeed a performance bottleneck.

At first we had memory errors when blurring small rectangles in corners of the 
image, because the big radius (greater than width & height) was causing invalid 
read/write before/after the image memory. We now use a smaller radius for small 
rectangles and no blur at all for very tiny rects, like 2x2 which were still 
causing issues.

This case is maybe a good candidate for unit tests: blur a 1x1 rectangle in 
each corner of the image with a big radius.

I hope this can help for future developments.
Keep up the good work

Original comment by alt...@gmail.com on 18 Nov 2013 at 8:29

GoogleCodeExporter commented 9 years ago
ARGBComputeCumulativeSum() should not be called - its an internal function used 
by ARGBBlur().  The higher level, ARGBBlur() will clip the blur region to the 
image.
The cumulative sum buffer does not need to be the full image.  Here's a snippet 
to show usage:

uint8* row_raw = reinterpret_cast<uint8*>(&data()[top * width() + left]);     
// Allocate circular CumulativeSum buffer which is a ARGB int pixels.        
// 1 row for above, 1 for the pixel row, and 2 radius for above/below.        
int circular_height = radius * 2 + 2;                                         
CreateTemp(region_width * sizeof(int), circular_height);  // NOLINT           
int32* cumsum = reinterpret_cast<int32*>(temp_data());                        

libyuv::ARGBBlur(row_raw, Stride(),  // Number of bytes in 'row'.             
                 row_raw, Stride(),                                           
                 cumsum, region_width * 4,  // Number of ints with 4 channels.
                 region_width, region_height,  // Region size in pixels.      
                 radius);  // Radius for amount of blur.                      

The unittests can tests arbitrary image sizes.  ie
set LIBYUV_WIDTH=2
set LIBYUV_HEIGHT=2
out\release\libyuv_unittest --gtest_filter=*Blur*
which works on 1x1 but fails on 2x2.  But no crash.
ARGBBlur_Opt is 13 radius (27x27)
ARGBBlurSmall_Opt is 5 radius (11x11)
which tests the 2 implementations - small blur radius of 5 or less uses fixed 
point, while larger radius uses float.
Changing the tests to 55 and 2 behavior is the same.  Heres a benchmark with 
radius 55 and 2.
set LIBYUV_WIDTH=1280
set LIBYUV_HEIGHT=720
set LIBYUV_REPEAT=1000
out\release\libyuv_unittest --gtest_filter=*Blur* | grep ms
[       OK ] libyuvTest.ARGBBlur_Any (2838 ms)
[       OK ] libyuvTest.ARGBBlur_Unaligned (2816 ms)
[       OK ] libyuvTest.ARGBBlur_Invert (2815 ms)
[       OK ] libyuvTest.ARGBBlur_Opt (2771 ms)
[       OK ] libyuvTest.ARGBBlurSmall_Any (1887 ms)
[       OK ] libyuvTest.ARGBBlurSmall_Unaligned (1847 ms)
[       OK ] libyuvTest.ARGBBlurSmall_Invert (1860 ms)
[       OK ] libyuvTest.ARGBBlurSmall_Opt (1851 ms)
[----------] 8 tests from libyuvTest (18685 ms total)
Trying various heights and widths, it seems like any height is okay, but width 
less than radius has a bug.

Original comment by fbarch...@google.com on 18 Nov 2013 at 4:06

GoogleCodeExporter commented 9 years ago
I removed the explicit call to ARGBComputeCumulativeSum() and indeed the result 
is the same!

Thank you for this information and the memory efficient example.

Original comment by alt...@gmail.com on 18 Nov 2013 at 5:07

GoogleCodeExporter commented 9 years ago
I reproduced an issue on width.  The row level loop is doing

r+1 clipped left edge blur
n   unclipped middle pixels
r   clipped right edge
where n needs to be 1 or more.  Rather than add a check for n>=1, I recompute r 
(radius) based on width, and immediately return if radius is <= 0.
See r858 for that.
In the header I note that ARGBComputeCumulativeSum is internal.  Originally I 
only provided that function, and not the Blur itself.
The unittest is doing radius 55 and 2.

Original comment by fbarch...@google.com on 18 Nov 2013 at 10:39

GoogleCodeExporter commented 9 years ago

Original comment by fbarch...@google.com on 20 Nov 2013 at 12:18