JuliaImages / ImageSegmentation.jl

Partitioning images into meaningful regions
Other
47 stars 23 forks source link

Problem with Graph Construction from Segmented Image #94

Open niccolo99mandelli opened 1 year ago

niccolo99mandelli commented 1 year ago

Hi everyone, At the moment I’m working on a master thesis, which topic is connected with biomedical image processing. I am writing to you regarding an issue that I am experiencing with the JuliaImages library. The error occurs during the construction of a graph from a segmented image obtained by computing the watershed algorithm. My Julia code processes high-resolution histological images, whose size can range from 30 MB to 1.8 GB. The error that is generated is an Out Of Memory error and it occurs after about 10 minutes from the beginning of the graph construction process. I have 128 GB of memory + 8 GB of Swap available, is there a possible solution to overcome the problem and generate the graph without the risk of OOM? To try to reduce the chances of OOM, I modified the "region_adjacency_graph" function by removing the parts related to edge weights and consequently also the function for calculating weights. Nevertheless, the OOM scenario still occurs.

Below are the lines of code I use to segment the image and build the graph: img = ImageMagick.load_(svs_image) # load image bw = Gray.(img) .> 0.21 dist = 1 .- distance_transform(feature_transform(bw)) markers = label_components(dist .< -0.00001) segments = watershed(dist, markers) weight_fn(i,j) = euclidean(segment_pixel_count(segments,i), segment_pixel_count(segments,j)) #not used G, vert_map = region_adjacency_graph(segments)

For further information regarding the structure of the images to be segmented (svs_image), I am leaving the link to the package: https://github.com/niccolo99mandelli/JHistint.jl/tree/main/output_example. The image can be found in the "output_example" folder under the name "SlideExample.tif.

Thank you for your attention in advance. Niccolò Mandelli, University of Milano-Bicocca

ashwani-rathee commented 1 year ago

Hey @niccolo99mandelli, I'll take a look in evening where it might be erroring out and why. Thanks for letting us know!!

ashwani-rathee commented 1 year ago

@tlnagy Some issues noticed with using TiffImages:

julia> img = load("debug/SlideExample.tif")
[ Info: Precompiling TiffImages [731e570b-9d59-4bfa-96dc-6df516fadf69]
[ Info: Precompiling ImageMagick [6218d12a-5da1-5696-b52f-db25d2ecc6d1]
Errors encountered while load File{DataFormat{:TIFF}, String}("debug/SlideExample.tif").
All errors:
===========================================
BoundsError: attempt to access 0-element Vector{TiffImages.Tag} at index [1]
===========================================
OutOfMemoryError()
===========================================

Fatal error:
ERROR: BoundsError: attempt to access 0-element Vector{TiffImages.Tag} at index [1]
Stacktrace:
  [1] getindex
    @ .\array.jl:924 [inlined]
  [2] first(a::Vector{TiffImages.Tag})
    @ Base .\abstractarray.jl:404
  [3] getindex(ifd::TiffImages.IFD{UInt32}, key::UInt16)
    @ TiffImages C:\Users\lenono\.julia\packages\TiffImages\I0ilR\src\ifds.jl:69
  [4] getindex
    @ C:\Users\lenono\.julia\packages\TiffImages\I0ilR\src\ifds.jl:68 [inlined]
  [5] read!(target::Matrix{ColorTypes.RGB{FixedPointNumbers.N0f8}}, tf::TiffImages.TiffFile{UInt32, Stream{DataFormat{:TIFF}, IOStream, String}}, ifd::TiffImages.IFD{UInt32})
    @ TiffImages C:\Users\lenono\.julia\packages\TiffImages\I0ilR\src\ifds.jl:207
  [6] macro expansion
    @ C:\Users\lenono\.julia\packages\TiffImages\I0ilR\src\load.jl:86 [inlined]
  [7] macro expansion
    @ C:\Users\lenono\.julia\packages\ProgressMeter\sN2xr\src\ProgressMeter.jl:938 [inlined]
  [8] load(tf::TiffImages.TiffFile{UInt32, Stream{DataFormat{:TIFF}, IOStream, String}}, ifds::Vector{TiffImages.IFD{UInt32}}, N::Int64; verbose::Bool)
    @ TiffImages C:\Users\lenono\.julia\packages\TiffImages\I0ilR\src\load.jl:85
  [9] load(tf::TiffImages.TiffFile{UInt32, Stream{DataFormat{:TIFF}, IOStream, String}}; verbose::Bool, mmap::Bool, lazyio::Bool)
    @ TiffImages C:\Users\lenono\.julia\packages\TiffImages\I0ilR\src\load.jl:38
 [10] load(tf::TiffImages.TiffFile{UInt32, Stream{DataFormat{:TIFF}, IOStream, String}})
    @ TiffImages C:\Users\lenono\.julia\packages\TiffImages\I0ilR\src\load.jl:18
 [11] load(io::IOStream; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ TiffImages C:\Users\lenono\.julia\packages\TiffImages\I0ilR\src\load.jl:17
 [12] load
    @ C:\Users\lenono\.julia\packages\TiffImages\I0ilR\src\load.jl:17 [inlined]
 [13] #14
    @ C:\Users\lenono\.julia\packages\TiffImages\I0ilR\src\load.jl:13 [inlined]
 [14] open(::TiffImages.var"#14#15"{Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}}, ::String, ::Vararg{String}; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Base .\io.jl:384
 [15] open(::Function, ::String, ::String)
    @ Base .\io.jl:381
 [16] _safe_open(::Function, ::String, ::String; kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ TiffImages C:\Users\lenono\.julia\packages\TiffImages\I0ilR\src\utils.jl:200
 [17] _safe_open
    @ C:\Users\lenono\.julia\packages\TiffImages\I0ilR\src\utils.jl:197 [inlined]
 [18] #load#13
    @ C:\Users\lenono\.julia\packages\TiffImages\I0ilR\src\load.jl:12 [inlined]
 [19] load(filepath::String)
    @ TiffImages C:\Users\lenono\.julia\packages\TiffImages\I0ilR\src\load.jl:11
 [20] #invokelatest#2
    @ .\essentials.jl:729 [inlined]
 [21] invokelatest
    @ .\essentials.jl:726 [inlined]
 [22] #_#1
    @ C:\Users\lenono\.julia\packages\LazyModules\d9Be6\src\LazyModules.jl:29 [inlined]
 [23] (::LazyModules.LazyFunction)(args::String)
    @ LazyModules C:\Users\lenono\.julia\packages\LazyModules\d9Be6\src\LazyModules.jl:27
 [24] load(f::File{DataFormat{:TIFF}, String}; canonicalize::Nothing, kwargs::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ ImageIO C:\Users\lenono\.julia\packages\ImageIO\xMHN9\src\ImageIO.jl:111
 [25] load(f::File{DataFormat{:TIFF}, String})
    @ ImageIO C:\Users\lenono\.julia\packages\ImageIO\xMHN9\src\ImageIO.jl:109
 [26] #invokelatest#2
    @ .\essentials.jl:729 [inlined]
 [27] invokelatest
    @ .\essentials.jl:726 [inlined]
 [28] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Formatted; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ FileIO C:\Users\lenono\.julia\packages\FileIO\BE7iZ\src\loadsave.jl:219
 [29] action
    @ C:\Users\lenono\.julia\packages\FileIO\BE7iZ\src\loadsave.jl:196 [inlined]
 [30] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Symbol, ::String; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ FileIO C:\Users\lenono\.julia\packages\FileIO\BE7iZ\src\loadsave.jl:185
 [31] action
    @ C:\Users\lenono\.julia\packages\FileIO\BE7iZ\src\loadsave.jl:185 [inlined]
 [32] load(::String; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ FileIO C:\Users\lenono\.julia\packages\FileIO\BE7iZ\src\loadsave.jl:113
 [33] load(::String)
    @ FileIO C:\Users\lenono\.julia\packages\FileIO\BE7iZ\src\loadsave.jl:109
 [34] top-level scope
    @ REPL[6]:1
Stacktrace:
 [1] handle_error(e::BoundsError, q::Base.PkgId, bt::Vector{Union{Ptr{Nothing}, Base.InterpreterIP}})
   @ FileIO C:\Users\lenono\.julia\packages\FileIO\BE7iZ\src\error_handling.jl:61
 [2] handle_exceptions(exceptions::Vector{Tuple{Any, Union{Base.PkgId, Module}, Vector}}, action::String)
   @ FileIO C:\Users\lenono\.julia\packages\FileIO\BE7iZ\src\error_handling.jl:56
 [3] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Formatted; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})   
   @ FileIO C:\Users\lenono\.julia\packages\FileIO\BE7iZ\src\loadsave.jl:228
 [4] action
   @ C:\Users\lenono\.julia\packages\FileIO\BE7iZ\src\loadsave.jl:196 [inlined]
 [5] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Symbol, ::String; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ FileIO C:\Users\lenono\.julia\packages\FileIO\BE7iZ\src\loadsave.jl:185
 [6] action
   @ C:\Users\lenono\.julia\packages\FileIO\BE7iZ\src\loadsave.jl:185 [inlined]
 [7] load(::String; options::Base.Pairs{Symbol, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ FileIO C:\Users\lenono\.julia\packages\FileIO\BE7iZ\src\loadsave.jl:113
 [8] load(::String)
   @ FileIO C:\Users\lenono\.julia\packages\FileIO\BE7iZ\src\loadsave.jl:109
 [9] top-level scope
   @ REPL[6]:1

I am able to see it on my laptop though Also same for ImageMagick:

julia> using ImageMagick

julia> img = ImageMagick.load("debug\\SlideExample.tif")
ERROR: OutOfMemoryError()
Stacktrace:
  [1] Array
    @ .\boot.jl:463 [inlined]
  [2] Array
    @ .\boot.jl:470 [inlined]
  [3] Array
    @ .\boot.jl:476 [inlined]
  [4] _collect_indices
    @ .\array.jl:730 [inlined]
  [5] collect(A::PermutedDimsArray{ColorTypes.RGB{FixedPointNumbers.N0f8}, 3, (2, 1, 3), (2, 1, 3), Array{ColorTypes.RGB{FixedPointNumbers.N0f8}, 3}})
    @ Base .\array.jl:714
  [6] load_(file::String, permute_horizontal::Bool; ImageType::Type, extraprop::String, extrapropertynames::Nothing, view::Bool)
    @ ImageMagick C:\Users\lenono\.julia\packages\ImageMagick\Fh2BX\src\ImageMagick.jl:157
  [7] load_ (repeats 2 times)
    @ C:\Users\lenono\.julia\packages\ImageMagick\b8swT\src\ImageMagick.jl:138 [inlined]
  [8] #load#30
    @ C:\Users\lenono\.julia\packages\ImageMagick\b8swT\src\ImageMagick.jl:126 [inlined]
  [9] load(::String)
    @ ImageMagick C:\Users\lenono\.julia\packages\ImageMagick\Fh2BX\src\ImageMagick.jl:125
 [10] top-level scope
    @ REPL[10]:1

@niccolo99mandelli issue is related to Image/Image reading more than the algorithm I think or I think it's limitation of my laptop, given tests pass. Size of image is actually quite big like 30k*10k

ashwani-rathee commented 1 year ago

MWE:

using FileIO, ImageSegmentation, ImageMorphology, ImageDistances, ImageCore, TestImages, ImageTransformations

img = testimage("mandrill")
# img = imresize(img, (30000,10000))

# img = load("SlideExample.tif")
bw = Gray.(img) .> 0.21
dist = 1 .- distance_transform(feature_transform(bw))
markers = label_components(dist .< -0.00001)
segments = watershed(dist, markers)
weight_fn(i,j) = euclidean(segment_pixel_count(segments,i), segment_pixel_count(segments,j)) #not used
G, vert_map = region_adjacency_graph(segments, weight_fn)
tlnagy commented 1 year ago

Do you mind posting a minimal working example (with an example tiff) over at tlnagy/TiffImages.jl so I can diagnose the bug and keep track of it?

niccolo99mandelli commented 1 year ago

The image reading error of the .TIF file is due to the image size, as it also occurs on my laptop for large images. However, it works for small images such as the "example.tif" image, which is contained in the same folder of the GitHub repository. To access the image locally, I use the following two lines:

link = joinpath(@DIR, "..", "output_example", "example.tif") svs_image = read(link) ...

Testing the script on a server with 128 GB + 8 GB of Swap, the image loading works for both small and large images. Therefore, I assume that the error is dependent on the amount of available memory. However, the server does not seem to be sufficient to generate the graph with the corresponding "region_adjacency_graph" function. Would it be possible, for example, to divide the segments into multiple parts and build the graph step-by-step ?

timholy commented 1 year ago

A 1.8GB image is not that bad given your hardware. For really big images, load(filename, mmap=true) with TiffImages is highly recommended, but I don't think that's required for this application.

Your OOM is presumably in region_adjacency_graph and not the IO? My guess is you have too many small regions. Have you looked at markers in, say, ImageView? Something like this:

# Visualize each "component" in `markers` with a random color
ncolors = maximum(markers)
randcolors = rand(RGB{N0f8}, ncolors)
using IndirectArrays
imgi = IndirectArray(markers, randcolors)
imshow(imgi)

(untested)

If there are more components than you want, an obvious solution is to change the threshold on dist.

niccolo99mandelli commented 1 year ago

I tried to modify the parameter to change the size of the markers but the OOM error occurs again. Specifically, I set values like 0.3 and 0.9, the total computation time decreases but the OOM still occurs after a few minutes from the start of the graph construction process. I have explored the code related to the region_adjacency_graph function reported below:

function region_adjacency_graph(s::SegmentedImage)
    function neighbor_regions!(n::Set{Int}, visited::AbstractArray, s::SegmentedImage, I::CartesianIndex)
        R = CartesianIndices(axes(s.image_indexmap))
        I1 = _oneunit(CartesianIndex{ndims(visited)})
        Ibegin, Iend = first(R), last(R)
        t = Vector{CartesianIndex{ndims(visited)}}()
        push!(t, I)
        while !isempty(t)
            temp = pop!(t)
            visited[temp] = true
            for J in _colon(max(Ibegin, temp-I1), min(Iend, temp+I1))
                if s.image_indexmap[temp] != s.image_indexmap[J]
                    push!(n,s.image_indexmap[J])
                elseif !visited[J]
                    push!(t,J)
                end
            end
        end
        n
    end

    visited  = fill(false, axes(s.image_indexmap))                           # Array to mark the pixels that are already visited
    G        = SimpleWeightedGraph()                                         # The region_adjacency_graph
    vert_map = Dict{Int,Int}()                                               # Map that stores (label, vertex) pairs
    # add vertices to graph
    Graphs.add_vertices!(G,length(s.segment_labels))
    # setup `vert_map`
    for (i,l) in enumerate(s.segment_labels)
        vert_map[l] = i
    end
    # add edges to graph
    for p in CartesianIndices(axes(s.image_indexmap))
        if !visited[p]
            n = Set{Int}()
            try
                neighbor_regions!(n, visited, s, p)
        GC.gc()
            catch oom
                if isa(oom, OutOfMemoryError)
                    n = Set{Int}()
                    GC.gc()
                    println(">>> OOM")
                    show(p)
                    show(s)
                    exit()
                end
            end
            for i in n
                Graphs.add_edge!(G, vert_map[s.image_indexmap[p]], vert_map[i])
            end  
        end
    end
    G, vert_map
end

What data type is s::SegmentedImage, and what is the utility of image_indexmap? Additionally, I am unclear about the purpose of the neighbor_regions function and whether it is used to calculate connected components within the graph. If so, are there any other possibilities to calculate them, such as using structured graph-based functions like union ... ? NOTE: The edge weights are not included in the functions.

timholy commented 1 year ago

Presumably a key question is how many regions you have. It's been ages since I've thought about this code, but a quick read suggests the algorithm may be O(ncomponents^2*N) where N is the number of pixels. If so, what's the value of ncolors in my demo above? That is absolutely critical information, and you should be using that to decide when you've set the threshold on dist appropriately.

What data type is s::SegmentedImage

https://juliaimages.org/latest/pkgs/segmentation/ and particularly https://juliaimages.org/stable/pkgs/segmentation/#Result-1. (Be sure to study the demo.) It's also defined in the package code, https://github.com/JuliaImages/ImageSegmentation.jl/blob/329458547b98784a10d7b91bf9ea6a25e5774f1a/src/core.jl#L17-L26

I am unclear about the purpose of the neighbor_regions function

It's an inner function, see the Julia docs (short version: can only be called from the outer function)

whether it is used to calculate connected components

It's used to calculate the graph. If you just want connected components then markers already has that information. You only need the RAG if you're looking to merge components together.

Keep in mind that this is deliberately implementing a standard data structure & algorithm, but there are more efficient possibilities given that you're starting from a densely-connected segmentation.

marcoxa commented 1 year ago

Hi @timholy let me jump in as I am the culprit for Niccolò's quandary.

from your comment, it is unclear to me what are the connected components in the markers.

Could you explain a bit in more details what are the fields of SegmentedImage ? Especially the segment_indexmap.

Also isn't the RAG essentially an aggregate that is computed by setting up a separate connected component structure?

I am trying to grok all of this, to see whether we can handle the OOM error.

All the best

Marco

timholy commented 1 year ago

Has anyone actually followed the advice and visually inspected markers as I suggested? I think a lot will be obvious if you do---you'll have a much clearer understanding of what you're dealing with at a level more than symbols in code.

niccolo99mandelli commented 1 year ago

Yes, I had already verified and inspected the markers by varying the threshold set for dist. When I increased the threshold, I noticed an increase in the size of the markers in the segmented image, which resulted in a lower number of labels being recorded. The OOM error persisted even when setting a high threshold. For example, after setting an high threshold (0.8 instead of 0.000001) in the demo, the variable "ncolors" resulted in 46.

I understand the issue with the markers and their composition, but I am still unsure how to set the threshold so that each marker is associated with a cell. The desired output would be to generate the segmented image in which each cell has an associated marker and therefore a different coloration. The only possibility to set the threshold is to manually inspect the image and set it accordingly, or is there a method to set it based on the output given by the grayscale filter?

timholy commented 1 year ago

Usually interactive exploration is necessary to set critical parameters like this one. There may be new AI tools that can help do the training, but historically what most people tend to do (if you're doing this kind of thing a lot and need to set the threshold individually) is set up a GUI that will update the visualization of the segmented image as you change the value. Both Makie.jl and GtkObservables.jl should be able to do this very easily.