disintegration / imaging

Imaging is a simple image processing package for Go
MIT License
5.23k stars 435 forks source link

Blur performance optimization #73

Closed jasonmoo closed 6 years ago

jasonmoo commented 6 years ago

Hey just digging around for a faster blur and managed to trade a bit of memory for some speed. Offering upstream in case you want the changes.

benchmark           old ns/op     new ns/op     delta
BenchmarkBlur-4     23625266      20647452      -12.60%

benchmark           old allocs     new allocs     delta
BenchmarkBlur-4     22             30             +36.36%

benchmark           old bytes     new bytes     delta
BenchmarkBlur-4     1961043       2097189       +6.94%
coveralls commented 6 years ago

Coverage Status

Coverage remained the same at 100.0% when pulling 42f587c636c5f4d6945d209e9a831e993f065a65 on jasonmoo:master into bbcee2f5c9d5e94ca42c8b50ec847fec64a6c134 on disintegration:master.

coveralls commented 6 years ago

Coverage Status

Coverage remained the same at 100.0% when pulling 42f587c636c5f4d6945d209e9a831e993f065a65 on jasonmoo:master into bbcee2f5c9d5e94ca42c8b50ec847fec64a6c134 on disintegration:master.

disintegration commented 6 years ago

Hello,

I can confirm your benchmark results and I think it's a nice improvement. At the same time I'm not sure I understand the logic behind some of the changes.

For example, you use subslice v of the scanLine in blurVertical but not in blurHorlizontal:

    v := scanLineF[i : i+4]
    wa := v[3] * weight
    r += v[0] * wa
    g += v[1] * wa
    b += v[2] * wa

And another example - setting the calculated values to the scanLine slice first and copying it to the destination image after it (in blurHorizontal):

    scanLine[idx] = clamp(r)
    scanLine[idx+1] = clamp(g)
    scanLine[idx+2] = clamp(b)
    scanLine[idx+3] = clamp(a / wsum)
}
copy(dst.Pix[y*dst.Stride:], scanLine)

I have tested your changes one by one and it seems to me that most of the speedup is caused by creating an additional float64 slice scanLineF instead of converting ints to floats inside the innermost loop. Other changes don’t seem to add any noticeable improvement.

Could you please try to revert all but the scanlineF-related changes and see if it’s the case for you too? I would like to merge the minimum set of changes that leads to the speedup.

disintegration commented 6 years ago

Thank you for the contribution! I'll try to further improve the blur performance based on your work.

jasonmoo commented 6 years ago

Ah yes! Sorry I planned on rolling in your feedback and lost track of how long it had been. Glad you found this useful. Thanks for a great lib.

On Mon, Jun 4, 2018 at 1:59 PM Grigory Dryapak notifications@github.com wrote:

Thank you for the contribution! I'll try to further improve the blur performance based on your work.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/disintegration/imaging/pull/73#issuecomment-394444062, or mute the thread https://github.com/notifications/unsubscribe-auth/AAO0eLu4W_N5kZi2SWnMHnm2EMDpjrlUks5t5XVpgaJpZM4T-g9b .

jasonmoo commented 6 years ago

Incidentally the other changes you mentioned that seemed unrelated came up as hotspots when I profiled the code but feel free to discard them if they appear to be ineffective in the benchmarks.

On Mon, Jun 4, 2018 at 2:01 PM Jason Mooberry jason.mooberry@gmail.com wrote:

Ah yes! Sorry I planned on rolling in your feedback and lost track of how long it had been. Glad you found this useful. Thanks for a great lib.

On Mon, Jun 4, 2018 at 1:59 PM Grigory Dryapak notifications@github.com wrote:

Thank you for the contribution! I'll try to further improve the blur performance based on your work.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/disintegration/imaging/pull/73#issuecomment-394444062, or mute the thread https://github.com/notifications/unsubscribe-auth/AAO0eLu4W_N5kZi2SWnMHnm2EMDpjrlUks5t5XVpgaJpZM4T-g9b .

disintegration commented 6 years ago

I've tested it again and made a minor change (dropped usage of subslice v and removed unused idx variable in blurVertical) and left other of your changes. Thanks again.