tlnagy / OMETIFF.jl

I/O operations for OME-TIFF files in Julia
Other
24 stars 7 forks source link

Fix multi-image BoundsError #94

Closed HsupoLeng closed 2 years ago

HsupoLeng commented 2 years ago

Hi there,

I recently started using OMETIFF for dealing with my microscope recordings in Julia. The package worked great for most of my data, but failed in the case where I collected multiple cycles of time series. Each cycle share the same imaging parameters, and were collected as automatic repetitions over the same FOV. As a neuroscientist, I perform this type of acquisition as I change the experiment condition for each cycle, so I can ask the effect of experiment condition on the neurons in the FOV. This should be a fairly general user case.

Back to the package, I looked into the code and seemed to have found an one-line bug with potentially broad impact.

Error

When there are multiple image nodes in the same ome.tiff, calling load() results in BoundsError. See error message below.

Root cause

When there are multiple image nodes (containers) in the dataset, the function ifdindex! is called for the corresponding number of times. ifd_indices and two other outputs are shared across ifdindex! calls, but in each call, the indices ifd and prev_ifd always restart from 1 and 0. As a result, the outputs have the size equal to only one container; information from later image nodes overwrites earlier ones. At the same time, the last index in ifd_indices (pos_idx) increments to track container/Image ID. When ifd_indices were later used to access an one-container-sized array in function inmemoryarray, BoundsError is triggered.

Fix

Initialize prev_ifd in ifdindex! as the length of obs_filepaths, to reflect the number of existing ifds.

Test

The fixed version passes all existing tests in runtests.jl. I have provided a folder of dummy images that should allow the error to be reproduced here, but have not included them in a test, because of the relatively large number of images.

julia> imgmeta = load(raw"TSeries_camp-000-ometiff_test\TSeries_camp-000_Cycle00001_Ch1_000001.ome.tif")
Error encountered while load File{DataFormat{:OMETIFF}, String}("TSeries_camp-000-ometiff_test\\TSeries_camp-000_Cycle00001_Ch1_000001.ome.tif").

Fatal error:
ERROR: BoundsError: attempt to access 512×512×1×2×11×1 Array{Gray{N0f16},6} with eltype ColorTypes.Gray{FixedPointNumbers.N0f16} at index [1:512, 1:512, 1, 1, 1, 3]
Stacktrace:
  [1] throw_boundserror(A::Array{ColorTypes.Gray{FixedPointNumbers.N0f16}, 6}, I::Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}, Int64, Int64, Int64, Int64})
    @ Base .\abstractarray.jl:651
  [2] checkbounds
    @ .\abstractarray.jl:616 [inlined]
  [3] view(::Array{ColorTypes.Gray{FixedPointNumbers.N0f16}, 6}, ::Function, ::Function, ::Int64, ::Int64, ::Int64, ::Int64)
    @ Base .\subarray.jl:177
  [4] maybeview(::Array{ColorTypes.Gray{FixedPointNumbers.N0f16}, 6}, ::Function, ::Function, ::Vararg{Any, N} where N)
    @ Base .\views.jl:133
  [5] dotview(::Array{ColorTypes.Gray{FixedPointNumbers.N0f16}, 6}, ::Function, ::Function, ::Vararg{Any, N} where N)
    @ Base.Broadcast .\broadcast.jl:1212
  [6] macro expansion
    @ D:\.julia\dev\OMETIFF\src\loader.jl:136 [inlined]
  [7] macro expansion
    @ D:\.julia\packages\ProgressMeter\Vf8un\src\ProgressMeter.jl:940 [inlined]
  [8] inmemoryarray(ifds::OrderedCollections.OrderedDict{NTuple{4, Int64}, Tuple{TiffImages.TiffFile{UInt32, S} where S<:Stream, TiffImages.IFD{UInt32}}}, dims::NamedTuple{(:Y, :X, :Z, :C, :T, :P), NTuple{6, Int64}}; verbose::Bool)
    @ OMETIFF D:\.julia\dev\OMETIFF\src\loader.jl:134
  [9] load(io::Stream{DataFormat{:OMETIFF}, IOStream, String}; dropunused::Bool, verbose::Bool, inmemory::Bool)
    @ OMETIFF D:\.julia\dev\OMETIFF\src\loader.jl:94
 [10] #31
    @ D:\.julia\dev\OMETIFF\src\loader.jl:3 [inlined]
 [11] open(f::OMETIFF.var"#31#32"{Bool, Bool}, args::File{DataFormat{:OMETIFF}, String}; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ Base .\io.jl:330
 [12] open
    @ .\io.jl:328 [inlined]
 [13] #load#30
    @ D:\.julia\dev\OMETIFF\src\loader.jl:2 [inlined]
 [14] load(f::File{DataFormat{:OMETIFF}, String})
    @ OMETIFF D:\.julia\dev\OMETIFF\src\loader.jl:2
 [15] #invokelatest#2
    @ .\essentials.jl:708 [inlined]
 [16] invokelatest
    @ .\essentials.jl:706 [inlined]
 [17] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Formatted; options::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ FileIO D:\.julia\packages\FileIO\QkYgA\src\loadsave.jl:219
 [18] action
    @ D:\.julia\packages\FileIO\QkYgA\src\loadsave.jl:197 [inlined]
 [19] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Symbol, ::String; options::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ FileIO D:\.julia\packages\FileIO\QkYgA\src\loadsave.jl:185
 [20] action
    @ D:\.julia\packages\FileIO\QkYgA\src\loadsave.jl:185 [inlined]
 [21] load(::String; options::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
    @ FileIO D:\.julia\packages\FileIO\QkYgA\src\loadsave.jl:113
 [22] load(::String)
    @ FileIO D:\.julia\packages\FileIO\QkYgA\src\loadsave.jl:110
 [23] top-level scope
    @ REPL[10]:1
Stacktrace:
 [1] handle_error(e::BoundsError, q::Base.PkgId, bt::Vector{Union{Ptr{Nothing}, Base.InterpreterIP}})
   @ FileIO D:\.julia\packages\FileIO\QkYgA\src\error_handling.jl:61
 [2] handle_exceptions(exceptions::Vector{Tuple{Any, Union{Base.PkgId, Module}, Vector{T} where T}}, action::String)
   @ FileIO D:\.julia\packages\FileIO\QkYgA\src\error_handling.jl:56
 [3] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Formatted; options::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ FileIO D:\.julia\packages\FileIO\QkYgA\src\loadsave.jl:228
 [4] action
   @ D:\.julia\packages\FileIO\QkYgA\src\loadsave.jl:197 [inlined]
 [5] action(::Symbol, ::Vector{Union{Base.PkgId, Module}}, ::Symbol, ::String; options::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ FileIO D:\.julia\packages\FileIO\QkYgA\src\loadsave.jl:185
 [6] action
   @ D:\.julia\packages\FileIO\QkYgA\src\loadsave.jl:185 [inlined]
 [7] load(::String; options::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})
   @ FileIO D:\.julia\packages\FileIO\QkYgA\src\loadsave.jl:113
 [8] load(::String)
   @ FileIO D:\.julia\packages\FileIO\QkYgA\src\loadsave.jl:110
 [9] top-level scope
   @ REPL[10]:1

julia> 
codecov[bot] commented 2 years ago

Codecov Report

Merging #94 (3aa57d1) into master (0065ef6) will not change coverage. The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master      #94   +/-   ##
=======================================
  Coverage   95.07%   95.07%           
=======================================
  Files           6        6           
  Lines         264      264           
=======================================
  Hits          251      251           
  Misses         13       13           
Impacted Files Coverage Δ
src/parsing.jl 97.50% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 0065ef6...3aa57d1. Read the comment docs.

tlnagy commented 2 years ago

Hi @HsupoLeng :wave:, thanks for writing up this detailed PR and coming up with a potential fix. Just so I understand the layout of this image, you have 3 XYCT images that you took every 20mins that consist of 512x512 images with 2 channels and 10 timepoints?

I think you've diagnosed the problem correctly. Your image was resetting the IFD ids between each "cycle" so OMETIFF was blindly overwriting the IFD_index. I believe a more general solution than the one you proposed is to set prev_ifd to the maximum previously observed IFD index instead of relying on the number of files.

OMEXML output ```julia julia> println(OMETIFF.dump_omexml("TSeries_camp-000_Cycle00001_Ch1_000001.ome.tif")) 2009-01-01T00:10:57 urn:uuid:b9ded0aa-06b6-4935-8adf-ea0602e1dcb6 urn:uuid:8e4d0da0-acfd-4f55-888a-16a197bb0c65 urn:uuid:bf96b4ce-b8ac-4b23-96fb-7971ac837e7f urn:uuid:d6b5248a-b256-4976-be10-bf99252e1660 urn:uuid:755c8b0c-8468-4d51-87d4-9f269bd9af17 urn:uuid:9a2985c0-821f-44e6-9949-535c531f9643 urn:uuid:e933c9e3-afea-4812-b7a5-58c1133629e2 urn:uuid:f4d1d816-19d4-4f44-8aab-4dc2058df62f urn:uuid:fbd3aca0-3ca7-4f1f-b0b2-36fd2d4eb19e urn:uuid:b4a12f10-fe4a-449d-bfe7-c9f5bf75c613 urn:uuid:04a22fbf-977c-4d89-a84d-03d6484a0eb5 urn:uuid:8f03927c-3a27-4b12-bba1-726bbf73759a urn:uuid:37349bb5-dec3-4c74-897a-523127d3d468 urn:uuid:801fb638-80ac-4c62-b7ad-ed42c7011a37 urn:uuid:bcd3a94b-8141-45ea-9b87-4682090f016a urn:uuid:99be8c16-2a32-45a5-beef-8392c6d48fd9 urn:uuid:ceca837b-8632-41bb-94de-46f6cc28303f urn:uuid:3f95c1a7-f8eb-4de5-a27c-64b5065b65f1 urn:uuid:b3c6849a-d252-4c97-bcbd-251cdd80fdd7 urn:uuid:2cdff66c-d7ab-42c8-9557-bb4587a999c4 urn:uuid:8c9f6108-d597-4101-b6a8-0668b0ff533e urn:uuid:f9564026-3bfc-492f-a4b0-13bc0aff4ffe 2009-01-01T00:11:17 urn:uuid:25be513b-a92f-45da-a4cc-a3f7bf7ba225 urn:uuid:99f8af09-3b6c-4686-ac5b-bf3384fd18c4 urn:uuid:e080ea4e-8825-49b1-8de6-859d2aed8881 urn:uuid:c5778e5f-0e75-498b-b264-55bd6b80c149 urn:uuid:98b05b38-6ce3-486d-ab53-f2c18ec87b83 urn:uuid:9a139b46-302a-4a73-bfde-8a496fa560e3 urn:uuid:022d26cc-36b8-4d71-9846-e749675f7e00 urn:uuid:7cb39a66-b279-4c59-94d9-e59e8b7200f0 urn:uuid:e8aaac2d-5b43-4225-b030-33b94855d714 urn:uuid:93906ff8-7ae9-4020-813c-4aa87664d11d urn:uuid:793ab96e-b657-44ed-9aad-9233d324dc20 urn:uuid:a1319be1-3d05-4ff9-b6a0-05de321b37ba urn:uuid:4f3b30af-0b60-4ac4-8cbf-e404b6fa1cb2 urn:uuid:eac8b8a3-51b7-4b1d-9a2d-8f7fb967a427 urn:uuid:a56d4d47-3199-414d-a268-8ba973b443ee urn:uuid:5ff69a2f-3c2b-447c-b9d2-603085251e1f urn:uuid:cd7bd213-a24d-4ec9-af5b-79c74a5e2ef7 urn:uuid:b1fa8644-314c-4024-b6e4-a681dffcae3e urn:uuid:d6a3801b-fb2e-43c6-a20e-53bed2bd74fd urn:uuid:d434eae6-8b62-4efa-98c7-437cb4e2c4a8 urn:uuid:3f2901eb-5571-4c0e-966f-8cda2220ee57 urn:uuid:ceb2db31-ddc1-486c-a604-145c4b651375 2009-01-01T00:11:37 urn:uuid:eaa9a2aa-b641-40cd-a674-3e9271a6b61d urn:uuid:f2f3d1e2-b63f-4b6e-bcbc-09deef300db4 urn:uuid:25fbc4e1-e67d-4f15-a510-c76effb5a784 urn:uuid:74b8b7f4-4e33-4ccc-8150-3d85eb01c7d3 urn:uuid:91679922-8d5b-4603-b168-d9ba836f65b7 urn:uuid:8146ef45-aba0-4dc6-bd93-91912426afde urn:uuid:fb14d80e-ca0b-48c6-9b2f-aa3d948a8420 urn:uuid:677d73b5-f893-45a8-aa2d-9db2875e8da5 urn:uuid:2bd85c64-f616-4741-8c32-4bdb81709375 urn:uuid:29498cdb-d088-4cf8-b516-a5e0e4217058 urn:uuid:5a4068af-4349-4d18-b445-622a1282aed6 urn:uuid:cbeaf311-0f9e-474a-97fb-8ab2bf758d8b urn:uuid:54102a18-7193-45d9-a158-bc9d635ae274 urn:uuid:924c7648-5975-4a15-8097-92c6f99ec9ac urn:uuid:a967e762-d710-4f47-9fad-75593bf8f632 urn:uuid:26827358-aea5-4da1-bf9b-dce973720c29 urn:uuid:fa797ff7-cc2b-4d86-9116-6c653ec54ac6 urn:uuid:7a0ed47a-c6ec-4ee8-a528-53d1e3c1f3d3 urn:uuid:0880af1f-0030-4fbd-bc72-af9b77152864 urn:uuid:70f18458-ba92-4e7d-8673-c4425cb46b00 urn:uuid:a5515e1f-a4e2-4127-ac75-d8e88cd392d7 urn:uuid:cbf47f11-fa82-4d66-be83-6444659632b6 ```
tlnagy commented 2 years ago

Do you mind opening a PR against https://github.com/tlnagy/exampletiffs? It would be nice to add some tests with those images so we can avoid any regressions in the future.

HsupoLeng commented 2 years ago

Hi @tlnagy , you are mostly right about the image's layout. These are 3 XYCT images that I took every 20 seconds that consists of 512*512 images with 2 channels and 11 timepoints.

Your solution still perfectly resolved my issue. I agree that your more general fix works better: It's more appropriate to use ifd_index than obs_filepaths, considering the case where the same IFD is split across files or the case where tiffdata["IFD"] is non-zero. About the dense numbering assumption, I am not familiar enough with ome.tiff to add more information.

I will open a PR against https://github.com/tlnagy/exampletiffs soon - I would like to first prepare a smaller set of images that shows the same case. I will also add a few tests for them.

tlnagy commented 2 years ago

:+1: Glad to hear that it works for you. Yeah, in your case the original solution happens to work because each IFD is in a new file so length(obs_filepaths) == length(ifd_index) but that isn't required to be always true.

I documented the dense numbering assumption so that if I happen to run into a weird edge case in the future where this assumption causes problem, I can diagnose the problem more quickly.

HsupoLeng commented 2 years ago

I opened a pull request to exampletiffs with a small set of example images and added test against the example in my latest commit. I tested load and image dimensions following most other tests, but please feel free to update if there are cases that I didn't consider.

tlnagy commented 2 years ago

Great! Thanks for following up and adding the tests. I merged https://github.com/tlnagy/exampletiffs/pull/2 and restarted the tests. Hopefully they'll all be green now.

I'll squash and merge once it passes.

tlnagy commented 2 years ago

Once the tests pass again on master, I'll tag and release v0.4.1, which will feature your fix.