JuliaImages / ImageDistances.jl

Distances between N-dimensional images
https://github.com/JuliaImages/Images.jl

OutOfMemoryError with Modified Hausdorff #9

Closed ammilten closed 5 years ago

ammilten commented 5 years ago

I am trying to compute a distance matrix using the Modified Hausdorff distance (Julia 0.6.4) and am running into a confusing OutOfMemoryError. I am running the code below to compute distances between 300x260 pixel images. The attached file, outOfMemExample.txt (change the extension to .jl), contains the full script to reproduce the error.

using Distances
using ImageDistances

d = ModifiedHausdorff()
distmat = pairwise(d, [images[:,:,i] for i in 1:3])  # images is a 300 x 260 x 3 array of Float64

When I run it, I get this

ERROR: LoadError: OutOfMemoryError()
Stacktrace:
 [1] pairwise(::Distances.Euclidean, ::Array{Int64,2}, ::Array{Int64,2}) at /home/ammilten/.julia/v0.6/Distances/src/generic.jl:120
 [2] evaluate_pset(::ImageDistances.Hausdorff{ImageDistances.MinReduction,ImageDistances.MaxReduction}, ::Array{Int64,2}, ::Array{Int64,2}) at /home/ammilten/.julia/v0.6/ImageDistances/src/hausdorff.jl:50
 [3] pairwise(::ImageDistances.Hausdorff{ImageDistances.MinReduction,ImageDistances.MaxReduction}, ::Array{Array{Float64,2},1}) at /home/ammilten/.julia/v0.6/ImageDistances/src/hausdorff.jl:109
 [4] include_from_node1(::String) at ./loading.jl:576
 [5] include(::String) at ./sysimg.jl:14
 [6] process_options(::Base.JLOptions) at ./client.jl:305
 [7] _start() at ./client.jl:371
while loading /home/ammilten/Programs/stratsim/src/training/outOfMemExample.jl, in expression starting on line 27

I don't understand why I am running out of memory. A few distances are computed in a reasonable time, but it fails on the third one. Interestingly, looping and calling evaluate(d, images[:,:,i], images[:,:,j]) gives the same error on the third distance. It also doesn't matter which images I pass: distmat = pairwise(d, [images[:,:,i] for i in 2:5]) still fails on the third distance. The error seems to originate in the Distances.jl library, so I may also post this issue there.

Is there some sort of memory leak going on, or am I misusing the function?

juliohm commented 5 years ago

Hi @ammilten, it may have to do with the number of black pixels in the image. If you have a lot of black pixels (i.e. active points) in the image, it becomes a huge point cloud. That is why we usually apply the Hausdorff distance after filtering the images with an edge detector. Can you elaborate on the images you have?

Also, consider updating to Julia v1.0 if you can. The language is already faster and the package has been updated.
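To put rough numbers on the point-cloud concern: the stack trace bottoms out in `pairwise(::Distances.Euclidean, ::Array{Int64,2}, ::Array{Int64,2})`, which suggests that a dense point-to-point distance matrix is built between the two clouds. The following is only a back-of-the-envelope sketch (Julia 1.x syntax), assuming one `Float64` (8 bytes) per entry. `pairwise_bytes` and the active-pixel fractions are made up for illustration and are not part of either package.

```julia
# Rough memory footprint of a dense n-by-m matrix of Float64 distances.
# `pairwise_bytes` is a hypothetical helper for illustration only.
pairwise_bytes(n::Integer, m::Integer) = 8 * Int64(n) * Int64(m)

npix = 300 * 260                      # pixels per image (78_000)

for frac in (0.01, 0.05, 0.25, 1.0)   # assumed fractions of active pixels
    n = round(Int, frac * npix)       # points in each cloud
    gib = pairwise_bytes(n, n) / 2^30
    println("active fraction ", frac, ": ", n, " points, about ",
            round(gib, digits=2), " GiB")
end
```

If every pixel were active, the estimate comes to about 45 GiB for a single pair of images, while at 5% active pixels it is on the order of 100 MiB. The cost grows with the square of the point count, which is why thinning the images first matters so much.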

ammilten commented 5 years ago

The images are from your GeoStatsImages library (FlumeBinary), which I performed edge detection on first. The point clouds are probably huge, but I am not sure how to make them smaller. Perhaps I can do a wavelet decomposition or something before passing the images to the edge detector.

But I still don't understand why I would run out of memory when computing the distance in a loop. I can compute any distance individually, but when iterating I reach the memory limit after a few iterations.

juliohm commented 5 years ago

What you can do to debug this further is concentrate on the image that causes the OutOfMemoryError. What if you count the number of black pixels in it, let's say N, and try to create a matrix of size 2 x N? If you can narrow the error down to something more specific, we can try to figure out whether it is really a memory problem or something else :+1:
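A minimal sketch of that check, written for Julia 1.x (`findall` is not available under that name in 0.6). It assumes that "active" means "nonzero", which may not match how the package selects points. `point_cloud` is a hypothetical helper and the image is a synthetic stand-in, not one of the images from this thread.

```julia
# Collect the coordinates of all nonzero pixels as a 2-by-N integer matrix.
# Treating "nonzero" as "active" is an assumption about the edge images.
function point_cloud(img::AbstractMatrix)
    idx = findall(!iszero, img)          # CartesianIndex of each active pixel
    coords = Matrix{Int}(undef, 2, length(idx))
    for (k, I) in enumerate(idx)
        coords[1, k] = I[1]              # row
        coords[2, k] = I[2]              # column
    end
    return coords
end

# Synthetic stand-in for one 300x260 edge image: one full row and one full column.
img = zeros(Float64, 300, 260)
img[10, :] .= 1.0
img[:, 20] .= 1.0

coords = point_cloud(img)
println("N = ", size(coords, 2))
```

If the 2 x N allocation already fails for a real image, the point cloud itself is the problem. If it succeeds easily, the memory pressure more likely comes from the N x M distance matrix built between two clouds.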

ammilten commented 5 years ago

When evaluating this

d = ModifiedHausdorff()
println(evaluate(d, edges[:,:,1], edges[:,:,2]))
println(evaluate(d, edges[:,:,1], edges[:,:,3]))
println(evaluate(d, edges[:,:,2], edges[:,:,3]))
println(evaluate(d, edges[:,:,1], edges[:,:,4]))
println(evaluate(d, edges[:,:,2], edges[:,:,4]))

the first 2 work and I get the error on the third.

2.2203511881663665
3.0485483095083143
ERROR: LoadError: OutOfMemoryError()
Stacktrace:
 [1] pairwise(::Distances.Euclidean, ::Array{Int64,2}, ::Array{Int64,2}) at /home/ammilten/.julia/v0.6/Distances/src/generic.jl:120
 [2] evaluate_pset(::ImageDistances.Hausdorff{ImageDistances.MeanReduction,ImageDistances.MaxReduction}, ::Array{Int64,2}, ::Array{Int64,2}) at /home/ammilten/.julia/v0.6/ImageDistances/src/hausdorff.jl:50
 [3] evaluate(::ImageDistances.Hausdorff{ImageDistances.MeanReduction,ImageDistances.MaxReduction}, ::Array{Float64,2}, ::Array{Float64,2}) at /home/ammilten/.julia/v0.6/ImageDistances/src/hausdorff.jl:58
 [4] include_from_node1(::String) at ./loading.jl:576
 [5] include(::String) at ./sysimg.jl:14
 [6] process_options(::Base.JLOptions) at ./client.jl:305
 [7] _start() at ./client.jl:371
while loading /home/ammilten/Programs/stratsim/src/training/outOfMemExample.jl, in expression starting on line 42

If the third one were the problem, then it should still fail if I comment out the first 2 lines.

d = ModifiedHausdorff()
#println(evaluate(d, edges[:,:,1], edges[:,:,2]))
#println(evaluate(d, edges[:,:,1], edges[:,:,3]))
println(evaluate(d, edges[:,:,2], edges[:,:,3]))
println(evaluate(d, edges[:,:,1], edges[:,:,4]))
println(evaluate(d, edges[:,:,2], edges[:,:,4]))

However, I still get 2 successes and an OutOfMemoryError() on the third:

2.3271609121705414
3.814361829335579
ERROR: LoadError: OutOfMemoryError()
Stacktrace:
 [1] pairwise(::Distances.Euclidean, ::Array{Int64,2}, ::Array{Int64,2}) at /home/ammilten/.julia/v0.6/Distances/src/generic.jl:120
 [2] evaluate_pset(::ImageDistances.Hausdorff{ImageDistances.MeanReduction,ImageDistances.MaxReduction}, ::Array{Int64,2}, ::Array{Int64,2}) at /home/ammilten/.julia/v0.6/ImageDistances/src/hausdorff.jl:50
 [3] evaluate(::ImageDistances.Hausdorff{ImageDistances.MeanReduction,ImageDistances.MaxReduction}, ::Array{Float64,2}, ::Array{Float64,2}) at /home/ammilten/.julia/v0.6/ImageDistances/src/hausdorff.jl:58
 [4] include_from_node1(::String) at ./loading.jl:576
 [5] include(::String) at ./sysimg.jl:14
 [6] process_options(::Base.JLOptions) at ./client.jl:305
 [7] _start() at ./client.jl:371
while loading /home/ammilten/Programs/stratsim/src/training/outOfMemExample.jl, in expression starting on line 44

So, it really doesn't seem to be the images.

juliohm commented 5 years ago

That seems quite random. Can you please confirm that your script only contains these lines? To me it sounds like something is happening before you compute the distances, something that is already consuming a lot of your memory, so that no matter which images you select later, you only have space left for 2 more distance calculations.
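One way to test that hypothesis is to print the free system memory around each call and see whether it keeps shrinking. This is a rough diagnostic sketch for Julia 1.x, not something taken from the original script. `with_memory_report` is a hypothetical helper, and the workload is a stand-in for a call such as `evaluate(d, edges[:,:,i], edges[:,:,j])`.

```julia
# Report free system memory around a call, to see whether it keeps shrinking.
# `with_memory_report` is a hypothetical helper for illustration only.
function with_memory_report(f, label)
    before = Sys.free_memory()
    result = f()
    GC.gc()                               # force a collection before measuring again
    after = Sys.free_memory()
    println(label, ": free before = ", Int(before >> 20), " MiB, after = ",
            Int(after >> 20), " MiB")
    return result
end

# Stand-in workload: sums a dense 2000x2000 Float64 matrix (about 30 MiB).
s = with_memory_report("dense matrix") do
    sum(ones(2000, 2000))
end
```

On Julia 0.6 the collector is invoked as `gc()` rather than `GC.gc()`. Free memory reported by the operating system is noisy, so only a steady downward trend across several calls would point to memory not being released.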

If you can produce a minimum working example by just loading the images and nothing else, that would help debug it further.

ammilten commented 5 years ago

Here is the full script I use and I get the same behavior

using JLD
using Distances
using ImageDistances

edges = load("edges.jld")["edges"]

d=ModifiedHausdorff()
println(evaluate(d, edges[:,:,1], edges[:,:,2]))
println(evaluate(d, edges[:,:,1], edges[:,:,3]))
println(evaluate(d, edges[:,:,2], edges[:,:,3]))
println(evaluate(d, edges[:,:,1], edges[:,:,4]))
println(evaluate(d, edges[:,:,2], edges[:,:,4]))

So weird. I'm not even saving anything if I'm printing to the terminal, so I don't know what could possibly be using up all the memory.

Here's the edges.jld file edges.zip

juliohm commented 5 years ago

If you can share the JLD file, I can try to reproduce the issue on my machine tomorrow.

(In reply by email to Alex Miltenberger's comment of Thu, Dec 20, 2018, 22:08, shown above.)

ammilten commented 5 years ago

Sure, you should be able to get it from the edited comment above.

juliohm commented 5 years ago

I cannot reproduce the issue. Below you can find my system information:

julia> versioninfo()
Julia Version 1.0.0
Commit 5d4eaca0c9 (2018-08-08 20:58 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-6500U CPU @ 2.50GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)

I have 8GB of RAM in this laptop. I suggest that you first update your workflow to Julia v1.0 if you want to debug this further.

ammilten commented 5 years ago

Interesting. I would love to upgrade to Julia v1.0, but unfortunately for this project I am using some libraries that have not been maintained through v1.0. Regardless, I have implemented a workaround using wavelet decomposition to make the images smaller before computing these distances. The function works very fast on the small images, as expected.

I will close the issue for now, but thank you for the help!
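The actual wavelet code behind the workaround is not shown in this thread. As a simplified stand-in for the same idea, the sketch below (Julia 1.x) halves each image dimension by averaging non-overlapping 2x2 blocks, which matches the level-1 Haar approximation band up to a constant factor of 2. `halve` is a hypothetical helper and the random image is only a placeholder.

```julia
# Halve each image dimension by averaging non-overlapping 2x2 blocks.
# Up to a constant factor of 2 this is the level-1 Haar approximation band.
function halve(img::AbstractMatrix)
    r, c = size(img) .÷ 2                 # an odd trailing row/column is dropped
    out = zeros(Float64, r, c)
    for j in 1:c, i in 1:r
        out[i, j] = (img[2i-1, 2j-1] + img[2i, 2j-1] +
                     img[2i-1, 2j]   + img[2i, 2j]) / 4
    end
    return out
end

img = rand(300, 260)                      # placeholder for one 300x260 image
small = halve(halve(img))                 # 300x260 -> 150x130 -> 75x65
println(size(small))
```

Each halving cuts the pixel count by a factor of four, so the upper bound on a dense point-to-point distance matrix shrinks by a factor of sixteen per level.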