JuliaImages / ImageDistances.jl

Distances between N-dimensional images
https://github.com/JuliaImages/Images.jl

Optimized Hausdorff and ModifiedHausdorff algorithms (fixes #56). #57

Closed jsenn closed 3 years ago

jsenn commented 3 years ago

Previously, the Hausdorff routines used time and memory that was quadratic in the number of active pixels in the input images. This made them impractical for large or dense images (see #56).

Here, we leverage the distance_transform function from ImageMorphology to compute the Hausdorff distance in linear time.
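
The same preprocessing also covers the modified Hausdorff distance, which averages the nearest-neighbour distances instead of taking their maximum. Here is a minimal sketch of that idea (not the exact code in this PR; the function name is only illustrative):

using ImageMorphology: feature_transform, distance_transform

# Sketch only: assumes both images contain at least one active pixel.
function modified_hausdorff_sketch(A::AbstractMatrix{Bool}, B::AbstractMatrix{Bool})
    # dtA[i] = Euclidean distance from pixel i to the nearest active pixel of A
    dtA = distance_transform(feature_transform(A))
    dtB = distance_transform(feature_transform(B))
    # mean distance from each active pixel of one image to the other image
    dAB = sum(dtB[A]) / count(A)
    dBA = sum(dtA[B]) / count(B)
    return max(dAB, dBA)
end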

Limitations

I'm new to Julia, so I apologize in advance for any mistakes in this PR. I'd appreciate any comments or criticisms you may have!

juliohm commented 3 years ago

Thank you @jsenn for the PR! I'd like to suggest a few changes before I start a review on the GitHub system. I will use your nice implementation shared in #57:

using ImageMorphology: feature_transform, distance_transform

function hausdorff(A::BitArray{2}, B::BitArray{2})
    # dtA[i] = Euclidean distance from pixel i to the nearest active pixel of A
    dtA = distance_transform(feature_transform(A))
    dtB = distance_transform(feature_transform(B))
    # directed distances: farthest active pixel of A from B, and vice versa
    dAB = maximum(A .* dtB)
    dBA = maximum(B .* dtA)
    return max(dAB, dBA)
end

We see that this implementation starts with a preprocessing step feature_transform |> distance_transform. Then the result of this preprocessing step is combined with the original inputs A and B to produce the one-way Hausdorff distances. Based on this observation, I would like to suggest the following:

  1. Replace the img2pset function with hausdorff_transform(img) = img |> feature_transform |> distance_transform
  2. Replace evaluate_pset with the combination step above. Maybe call this function hausdorff_evaluate(d, A, B, dtA, dtB)
  3. Rewrite the rest of the logic so that occurrences of img2pset are replaced by hausdorff_transform and occurrences of evaluate_pset are replaced by hausdorff_evaluate.

The reason for the additional logic is that we can save time by executing the preprocessing step once, before the loops in pairwise and colwise for example. Does that make sense to you? Please let me know if something is not clear.

Also notice that the implementation of this new hausdorff_evaluate will rely on d.inner_op instead of the maximum in your code snippet above.
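
To make the suggestion concrete, here is a rough sketch of that structure. The helper names are just the ones proposed above, and the d.inner_op field follows the existing ImageDistances internals, so treat the details as assumptions rather than final code:

using ImageMorphology: feature_transform, distance_transform

# Preprocessing step: one distance transform per image.
hausdorff_transform(img) = distance_transform(feature_transform(img .> 0))

# Combination step: reduce the precomputed transforms with the distance's inner_op
# (maximum for Hausdorff, mean for ModifiedHausdorff). Assumes nonempty images.
function hausdorff_evaluate(d, A, B, dtA, dtB)
    dAB = d.inner_op(dtB[A .> 0])
    dBA = d.inner_op(dtA[B .> 0])
    return max(dAB, dBA)
end

# In pairwise/colwise the transforms can then be computed once, outside the loops:
#   dts = hausdorff_transform.(imgs)
#   D[i, j] = hausdorff_evaluate(d, imgs[i], imgs[j], dts[i], dts[j])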

Judging from some quick tests in the REPL, this PR will lead to a tremendous improvement in performance 🚀 :)

jsenn commented 3 years ago

@juliohm I made the changes you suggested. Let me know if there's something that I missed!

jsenn commented 3 years ago

@juliohm thanks for the comments. I made the changes, and the tests still pass on my machine.

juliohm commented 3 years ago

That is great @jsenn , thanks for incorporating the suggestions. Thank you @johnnychen94 for the helpful comments.

I think the only thing missing now is the benchmarks. Can you share some examples with a couple of sparse and dense binary images, before and after the change?

jsenn commented 3 years ago

@juliohm As mentioned in the issue (#56), the original implementation runs out of memory on my machine even for a small image (640x480) with a density of 0.5. The original takes as long to diff two 125x125 images as the new one takes to diff two 4000x2000 images, again at 0.5 density.

I used the following test code to look at other densities:

using ImageDistances, Random, BenchmarkTools, Statistics  # Statistics provides mean for BenchmarkTools trials

# Random binary image of size w×h in which roughly `density` of the pixels are active (1).
make_test_img(w, h, density) = map(x -> x > density ? 0 : 1, rand(w, h))

function do_test(w, h, density)
    imgA = make_test_img(w, h, density)
    imgB = make_test_img(w, h, density)
    stats = @benchmark hausdorff($imgA, $imgB)
    mean(stats).time / 1e6  # mean time in milliseconds (BenchmarkTools reports nanoseconds)
end

function do_tests(resolutions, densities)
    # warm up the function
    do_test(100, 100, 0.5)
    results = zeros(length(resolutions), length(densities))
    for res_idx = 1:length(resolutions)
        res = resolutions[res_idx]
        for dens_idx = 1:length(densities)
            dens = densities[dens_idx]
            try
                avg_time = do_test(res[1], res[2], dens)
                results[res_idx, dens_idx] = avg_time
            catch e
                if e isa OutOfMemoryError
                    results[res_idx, dens_idx] = Inf
                else
                    display(e)
                end
            end
        end
    end
    display(results)
end

resolutions = [(100,100), (640,480), (1028,768), (3840, 2160)]
densities = [0.0, 0.0001, 0.001, 0.01, 0.1, 0.5, 1.0]
do_tests(resolutions, densities)

Below are some tables of results. The rows of the table represent image dimensions, and the columns various settings for the density, from 0 (empty image) to 1 (all active pixels).

Here's the table output from the original algorithm (times are given in milliseconds):

             0            0.0001       0.001        0.01         0.1          0.5          1
100x100      0.0669086    0.0529377    0.0519164    0.189756     14.9962      457.341      1.23418
640x480      2.09205      1.46759      2.38225      157.231      45218.6      Inf          28.9543
1028x768     5.36087      3.77945      11.7178      1141.55      Inf          Inf          66.5488
3840x2160    51.5246      44.911       1251.15      Inf          Inf          Inf          887.691

Inf values indicate that the program ran out of memory.

And here's the table from the new algorithm:

             0            0.0001       0.001        0.01         0.1          0.5          1
100x100      0.331425     0.327105     0.498366     0.781181     1.14899      1.55493      0.852574
640x480      10.1094      16.3752      24.7114      49.5895      43.6246      54.8624      34.5993
1028x768     24.6157      42.526       72.8077      96.6227      111.602      140.856      85.7321
3840x2160    439.393      984.964      1746.75      1756.3       1799.32      2081.8       1640.28
Finally, here's a table where each cell is new / old. It shows the improvement/regression of new over old (close to 0 indicates significant improvement, 1 indicates the same performance, >1 indicates a regression):

             0            0.0001       0.001        0.01         0.1          0.5          1
100x100      4.953        6.179        9.599        4.117        0.077        0.003        0.691
640x480      4.832        11.158       10.373       0.315        0.001        #VALUE!      1.195
1028x768     4.592        11.252       6.213        0.085        #VALUE!      #VALUE!      1.288
3840x2160    8.528        21.931       1.396        #VALUE!      #VALUE!      #VALUE!      1.848

Some notes:

If the regressions are a problem, it might be enough to just special-case images with density < 0.01, falling back on the old algorithm. Special-casing empty and full images might also be a good idea, since the distance from any empty image

juliohm commented 3 years ago

@jsenn am I reading this incorrectly? Why is the new algorithm consistently slower than the old one? There are very few entries in the table with values smaller than 1.

I'd appreciate it if you could clarify, because this table doesn't justify the PR; it actually argues against it.

jsenn commented 3 years ago

@juliohm note the columns in the table: most of the entries are for very sparse images. If even 1% of the pixels in the image are active, the new algorithm vastly outperforms the previous one (for example, it is 1,000x faster for 640x480 images with 10% of pixels active). For that matter, the distance transform algorithm is infinitely faster than the previous one for images with moderate to high densities, or for large images, because the previous algorithm runs out of memory. (EDIT: these cases where the original algorithm doesn't return a result show up in the comparison table as #VALUE! entries).

I intentionally chose to test mostly sparse images in order to show the weaknesses of the new implementation, because the strengths were clear to me:

Here are some simple improvements that will take care of some of the performance regressions:

The only thing I'm not sure about is how to handle sparse but nonempty images. One idea would be to add separate hausdorff_sparse and modified_hausdorff_sparse functions that use the old algorithm. Another would be to add some kind of heuristic for choosing between the distance_transform algorithm and the one that compares all the points. The extremely steep performance cutoff at 1% density suggests a way to do this: simply add a preprocessing step that calculates the images' densities and, if they are below 1%, uses the old algorithm; otherwise, use the new one (see the sketch below).
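
If we went the heuristic route, the dispatch could look roughly like this. The 1% threshold, the helper names, and the two backend functions are illustrative assumptions, not part of this PR:

# Illustrative sketch of the density heuristic; names and threshold are hypothetical.
const SPARSE_DENSITY_THRESHOLD = 0.01

density(img) = count(!iszero, img) / length(img)

function hausdorff_dispatch(A, B)
    if density(A) < SPARSE_DENSITY_THRESHOLD && density(B) < SPARSE_DENSITY_THRESHOLD
        # very sparse: comparing the point sets directly is cheaper
        return hausdorff_pointset(A, B)          # hypothetical name for the old algorithm
    else
        # dense enough: the distance-transform algorithm wins
        return hausdorff_transform_based(A, B)   # hypothetical name for the new algorithm
    end
end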

jsenn commented 3 years ago

Also, I think the absolute differences are important. For example, the original algorithm is 10x faster for 640x480 images with 0.1% density; however, that represents a time savings of only about 20 milliseconds. In fact, the greatest time saving you get from using the original algorithm is about 1 second, and that's in a case that's easily optimized in the new implementation (diffing two 4K images that are all 1s).

Where the new algorithm outperforms the old, however, the time savings are often measured in seconds, minutes, and, again, infinities (of course, if you have tons of memory, those infinities may become hours).

johnnychen94 commented 3 years ago

A really great benchmark and explanation. I'm good with the benchmark result given that most "actual" images lie in the density range (0.01-1) where the new algorithm is faster.

julia> using TestImages, ImageBinarization, ImageCore, DataFrames  # ImageCore for RGB, DataFrames for describe/DataFrame

julia> densities = map(TestImages.remotefiles) do file
           try
               img = RGB.(testimage(file))
               out = binarize(img, Otsu())
               return sum(x -> x == 1, out) / length(out)
           catch
               return missing
           end
       end

julia> describe(DataFrame(densities=densities))
1×7 DataFrame
 Row │ variable   mean      min        median    max       nmissing  eltype
     │ Symbol     Float64   Float64    Float64   Float64   Int64     Union
─────┼───────────────────────────────────────────────────────────────────────────────────────
   1 │ densities  0.448454  0.0105944  0.500463  0.768402         1  Union{Missing, Float64}

binarize(img, Otsu()) might be too crude for real-world image processing, but it does give a hint here.

It would be nice if there were a better way to fix the performance regression in this PR. That said, I'm okay with merging this first and fixing it later, since slower is definitely better than inaccessible, i.e., an OutOfMemoryError.

This is just my personal opinion; I'll leave it to @juliohm to decide whether it's mergeable.

juliohm commented 3 years ago

Thank you both for the detailed benchmark results and tests. I will just do a quick check with the set of images I used when I wrote the original version of the code, to make sure there is no major regression beyond what the test images and random images above suggest. I am also in favor of merging; this is just a quick check to see whether existing users will run into any major problems.

juliohm commented 3 years ago

Oh yeah, everything is working nicely with the images locally too. Amazing work @jsenn ! 💯

juliohm commented 3 years ago

Feel free to trigger a new patch release @johnnychen94 if you feel it is a good time.

jsenn commented 3 years ago

@juliohm @johnnychen94 thanks for the careful reviews! @johnnychen94 I didn't know about TestImages; that would have made a more interesting benchmark than the random images (though, as you note, it wouldn't have shown the weaknesses of the new algorithm).