anthonynsimon / bild

Image processing algorithms in pure Go
MIT License
3.99k stars 213 forks source link

Switch fuzz to diff tolerance between colours #31

Closed anthonynsimon closed 8 years ago

anthonynsimon commented 8 years ago

@defart would it make sense to switch from fuzz (distance based) to color difference tolerance?

This way setting the tolerance would allow the algorithm to fill pixels by similarity instead of by distance from point. So for example if tolerance is set to 12 (uint8 range 0...255), then all colours that are within a diff of +-12 will be matched.

After some quick research it seems like most photo editing software, including Photoshop, use a version of this setting.

I created a branch with the proposal, check it out here

Photoshop result: ps

bild paint.FloodFill result (after switching to proposed method): bild

defart commented 8 years ago

Hey, maybe the nomenclature wasn't clear. But the distance i'm refering to is 'color distance', not distance from point. The color distance is basically the euclidean distance of colors on the spectrum. So what you're proposing is already implemented. The color distance is calculated in the isColorMatch function diferrence between Red values squared is the red 'distance'. Its calculated for all 3 channels with the alpha channel influencing on all 3, then it is divided by 3 to get the final color distance/difference.

anthonynsimon commented 8 years ago

Oh, totally my bad haha. Sorry about that, I clearly messed up.

I was about to call it off, but I decided to run some benchmarks to see if the reduced casting and math functions has some effect on performance and surprisingly got a -30% change in compute time on the proposal version.

Here's a before and after:

$ go test -run=NONE -bench=. -benchtime=25s -benchmem > after.txt
$ benchcmp before.txt after.txt

benchmark                old ns/op     new ns/op     delta
BenchmarkFloodFill-8     121853788     84696242      -30.49%

benchmark                old allocs     new allocs     delta
BenchmarkFloodFill-8     264409         264419         +0.00%

benchmark                old bytes     new bytes     delta
BenchmarkFloodFill-8     24397236      24393712      -0.01%

There's no considerable change in allocs or bytes, only on time per operation which is beneficial.

Other than that, the change would also be using a 0 to 255 tolerance value in the function signature instead of a percent based one. I would lean towards the former as it's what I've been using for years in Photoshop, but I guess that's subjective :)

defart commented 8 years ago

I can optimize the color matching to drop the math functions, but your proposal version is missing some things. Checking just the distance of each color is not correct, as it's taking each color separately into consideration, also ignoring alpha.

As for the 0 - 255 fuzz, it's may be not granular enough. The maximum distance i use a percent on is 510 (2*255) cause the alpha difference can add those extra 255. Maybe we can ignore that, and cap the distance at 255, so anything above would be 255 and already the max. Then a 0 - 255 fuzz would make sense.

anthonynsimon commented 8 years ago

Good point, just added alpha into the matching function.

But either I'm not using your fuzz value properly or something is wrong:

Parameters:

Source image (green channel has transparency): source

Target image Source Photoshop. Gimp and Imagemagick yield similar results. Also visually makes sense if one looks at the parameters mentioned above. target

Proposal result (on floodfill-tolerance branch) tolerance

Fuzz based (on master) fuzz

Again, I might be using it wrong, please correct me if so. Here's the test code:

// Fuzz based
img, _ := imgio.Open("in/source.png")
pass := paint.FloodFill(img, image.Point{0, 0}, color.RGBA{128, 128, 128, 255}, 12.5)
result := paint.FloodFill(pass, image.Point{255, 0}, color.RGBA{128, 128, 128, 255}, 50)
imgio.Save("out/fuzz", result, imgio.PNG)
// Tolerance based
img, _ := imgio.Open("in/source.png")
pass := paint.FloodFill(img, image.Point{0, 0}, color.RGBA{128, 128, 128, 255}, 32)
result := paint.FloodFill(pass, image.Point{255, 0}, color.RGBA{128, 128, 128, 255}, 128)
imgio.Save("out/tolerance", result, imgio.PNG)

IMO the proposal matches the target result much closer and provides much more granular control over the diff threshold; a tolerance of 128 goes half way across the gradient as expected, and when set to 32 it goes to 1/8 of the gradient, as expected.

defart commented 8 years ago

I'll take a look in the evening, and make some comparisons. At first glance, the fuzz should be around half 6.25 for the 32, since the max distance is 510 at the moment

anthonynsimon commented 8 years ago

Ok cool, I just tried it: values 3.7 for the 12.5% fill and 14.5 for the 50% gets the result on par with target and proposal.

But there are a few open points:

I would say that, the proposal should be merged unless there is a clear benefit by not doing so.

defart commented 8 years ago

I agree, that normalizing the fuzz from 0-255 would be a good idea.

Further, I'll do some optimizations on the color match to increase the speed, but my problem with the proposal branch is that the distance is not measured correctly. It checks only if one channel at a time breaks the distance. The distance needs to be measured in compound with all 3(4) channels.

I suggest trying to compare on a color wheel image, like Color Wheel

If you could hold on until this evening, I'll explain more and write some code and comparisons.

anthonynsimon commented 8 years ago

I see what you mean now. Yes you're right, the color channels should not be treated as separate values, with the color wheel it's clear how this would not work properly anymore. Your euclidean distance version gets much closer to target on the color wheel test.

I guess as an alternative we could try using another color space to check for color similarity based on perception, but that might become quite expensive in terms of performance.

defart commented 8 years ago

Yes, it think for the best perceptual uniformity we'd need to convert to CIELAB colorspace, and work with that. And it would be a significant performance hit. I think the RGBA comparison is still pretty much enough for most purposes and should be kept as the default.

Maybe make the comparison function variable, and introduce the CIELAB conversion in another function, but i think it's more than needed for 99% of use.

anthonynsimon commented 8 years ago

I went ahead and corrected the range to avoid more people using the previous one. Also one time optimized the euclidean distance calculation and got a nice performance boost.

benchmark                old ns/op     new ns/op     delta
BenchmarkFloodFill-8     94928744      69608378      -26.67%

benchmark                old allocs     new allocs     delta
BenchmarkFloodFill-8     264417         264419         +0.00%

benchmark                old bytes     new bytes     delta
BenchmarkFloodFill-8     24389554      24390124      +0.00%
defart commented 8 years ago

Hey, sorry i was busy the last days, didn't manage to find time to do an update

I've taken a look, i just have 3 comments.

  1. The alpha should be included in the distance measurement for it to be correct
  2. The distance should be divided by 3 in the end cause for zero -black the distance will now be 3 * 255
  3. In the matching it's much more performant to square the tolerance and match, than to do a Math.Sqrt on the distance
anthonynsimon commented 8 years ago

Ok, finally had time to check those things out:

  1. Gotcha, totally forgot.
  2. I don't understand what you mean exactly, but I believe it should not be divided by 3 as it's not part of the euclidian distance. By skipping the division the range came back to normal.
  3. Good point.

By the way, I updated the branch to reflect those points and added more tests to check for some more edge cases in case you want to check it out. The algorithm is basically the same as on master just without the "510 max distance" part and without averaging the RGB distances.

defart commented 8 years ago

Hey, sorry for the delayed response. What i tried to say with point #2 is:

If you want to have a tolerance range of 0-255, you need to have the 3 distances averaged.

Consider the case of maximum distance (even without alpha) the distance would be 255 squared per each channel so 255 * 255 * 3, while your maximum tolerance would be 255 * 255 when squared.

Meaning the full range of tolerance is not 0 - 255, as that would only cover one third of the full range.

anthonynsimon commented 8 years ago

No worries, recently been quite busy at work too.

The magnitude of the max RGB vector is ca. ~441 (distance from {0,0,0} to {255,255,255} in 3D space), a tolerance of 0-255 is simply a max acceptable difference, not the max possible difference. In other words, a tolerance of 255 will not flood the entire color wheel with one color (unless the starting point is the center), which is what we want.

Additionally, dividing by 3 would yield wrong values because it assumes equal presence of each channel. For example the Euclidean distance of pure red (255,0,0) would become 85 if we divide by 3 instead of the expected magnitude of 255.

I made a couple of renders to visualise the reach of the tolerance value, if RGB was plotted in a 3D cube:

rgba-distance-2 rgba-distance-1

We could cap it at ~441 instead of 255 or map 100% to the actual max distance in the cube, but the range 0-255 seems to be what a user would expect coming from other tools/software.

defart commented 8 years ago

Hi Anthony, sorry again for the late reply just wanted to discuss this further.

441 is the max distance if you take the RGB dimensions into consideration. Add the 4th Alpha channel to this and you will come to the max distance of 510, as in my original proposal.

255 * 255 * 3 ~= 441 * 441 255 * 255 * 4 ~= 510 * 510

Furthermore, by taking the average we lose slightly on the granularity, but the fact that it's not treating colors separately is not an issue. We should care only about the absolute distance from one color to another not in which direction goes (the distribution of distance across channels).

E.g. you do not care if a certain shade of yellow is distant from red the same as a certain shade of purple, both the yellow and the purple should fit within the tolerance

I still suggest using the correct distances, and scaling them from 0 - 255 for usage.

I'll write some code instead of just talking about it, during the week :)

anthonynsimon commented 8 years ago

No worries :)

Taking the average breaks the linearity of the scale and by not taking the average we still get absolute distances between colors. Specifically in your example, the distance from red to any of yellow or purple yields the same result even if we don't average (distance between red and yellow/purple is 255 in either case).

See image below (red dot is averaging the result, blue dot is not - the expected 255 in both yellow and purple distances):

results

About the max distance 441 vs 510, it depends on how it is calculated. In my previous comment I only counted RGB because on the current function, the Alpha delta is taken from each channel by only picking the one that causes the most difference (because we only care if it is or not within the tolerance, not by how much), instead of doing a 4D euclidean distance (in which case it would then be 510 as you mentioned and no averaging should be made).

Since it seems very irregular how color similarity is actually dealt with in terms of Alpha, why don't we set a few expectations so we know what is the right result? My observation from other tools:

Makes sense because if you have 0% Alpha, no matter the color, it's a clear visual boundary. But what happens if the Alpha delta is a gradient?

grad1 grad2

The above image is what Photoshop does, it won't 'gradually' flood the gradient until the distance is larger than the tolerance. Instead it just cuts off at the point in which the other color kicks in, regardless of the distance. I guess this makes sense when working with Alpha masks, but would it be better to actually 'gradually' flood the Alpha gradient? This is where other tools seems to vary the result.