tlnagy / OMETIFF.jl

I/O operations for OME-TIFF files in Julia

Border slice lost when loading OOM multi-file image #104

Closed tlnagy closed 1 year ago

tlnagy commented 1 year ago

julia> img = FileIO.load("dish2_ctrl_v_bix_1_MMStack.ome.tif", inmemory = false)

julia> size(img)
(1024, 1024, 3, 361, 2)

julia> for i in 1:3, j in 1:361, k in 1:2
           try
               img[:, :, i, j, k].data
           catch
               println(i, " ", j, " ", k)
           end
       end
1 338 2

It appears that the slice is never added to the IFD list, so access fails:

julia> img[:, :, 1, 338, 2]
ERROR: KeyError: key (1, 1, 338, 2) not found
Stacktrace:
  [1] getindex
    @ ~/.julia/packages/OrderedCollections/PRayh/src/ordered_dict.jl:380 [inlined]
  [2] getindex(::OMETIFF.DiskOMETaggedImage{ColorTypes.Gray{FixedPointNumbers.N0f16}, 4, 6, UInt32, Matrix{ColorTypes.Gray{FixedPointNumbers.N0f16}}}, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64, ::Int64)
    @ OMETIFF ~/.julia/dev/OMETIFF/src/mmap.jl:54
  [3] _unsafe_getindex_rs
    @ ./reshapedarray.jl:250 [inlined]
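
A cheaper way to find such gaps than reading every plane is to compare the keys the lookup table should hold against the keys it actually holds. The sketch below is a minimal, self-contained illustration on a toy dictionary; `missing_keys` is a hypothetical helper, not part of OMETIFF.jl, and the (C, Z, T, position) key order is an assumption based on the key shown in the KeyError.

```julia
# Hypothetical helper: list the (C, Z, T, position) keys that a lookup table
# is expected to hold but does not. Works on any AbstractDict keyed by tuples.
function missing_keys(ifds::AbstractDict, nc::Integer, nz::Integer, nt::Integer, np::Integer)
    expected = Iterators.product(1:nc, 1:nz, 1:nt, 1:np)
    return [k for k in expected if !haskey(ifds, k)]
end

# Toy table with one entry deliberately left out.
toy = Dict((c, 1, t, 1) => "plane" for c in 1:3, t in 1:4 if (c, t) != (1, 3))

missing_keys(toy, 3, 1, 4, 1)  # [(1, 1, 3, 1)]
```

For the image above, the equivalent call would presumably be `missing_keys(img.data.data.parent.ifds, 3, 1, 361, 2)`, assuming the internal field path used in the next comment and a single Z plane.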

tlnagy commented 1 year ago

This slice corresponds to either the first slice of the new file or the last slice of the previous file:

julia> filepath = ""
       for ifd in img.data.data.parent.ifds
           fp = ifd[2][1].filepath
           if filepath != fp
               filepath = fp
               println(ifd[1])
           end
       end
(1, 1, 1, 1)
(2, 1, 338, 2)

Dumping the OME-XML, we can see that (2, 1, 338, 2) corresponds to the first slice of the new file (i.e. FirstC=1, FirstT=337, IFD=0; the OME-XML indices are zero-based):

      <TiffData FirstC="2" FirstT="336" FirstZ="0" IFD="2021" PlaneCount="1">
        <UUID FileName="dish2_ctrl_v_bix_1_MMStack.ome.tif">urn:uuid:7ca94c93-4599-4b79-a63b-814a6f24db67</UUID>
      </TiffData>
      <TiffData FirstC="0" FirstT="337" FirstZ="0" IFD="2025" PlaneCount="1">
        <UUID FileName="dish2_ctrl_v_bix_1_MMStack.ome.tif">urn:uuid:7ca94c93-4599-4b79-a63b-814a6f24db67</UUID>
      </TiffData>
      <TiffData FirstC="1" FirstT="337" FirstZ="0" IFD="0" PlaneCount="1">
        <UUID FileName="dish2_ctrl_v_bix_1_MMStack_1.ome.tif">urn:uuid:b26d9310-e6cc-4167-89b2-120e83fb646e</UUID>
      </TiffData>
      <TiffData FirstC="2" FirstT="337" FirstZ="0" IFD="1" PlaneCount="1">
        <UUID FileName="dish2_ctrl_v_bix_1_MMStack_1.ome.tif">urn:uuid:b26d9310-e6cc-4167-89b2-120e83fb646e</UUID>
      </TiffData>
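
The correspondence between a TiffData entry and a dictionary key can be sketched as follows. This is only an illustration: `tiffdata_key` is a hypothetical helper, not an OMETIFF.jl function, and it assumes that keys are ordered (C, Z, T, position) and that the XML above belongs to position 2.

```julia
# Hypothetical helper: convert zero-based OME-XML TiffData indices into the
# one-based (C, Z, T, position) key order assumed from the keys printed above.
tiffdata_key(first_c::Integer, first_z::Integer, first_t::Integer, position::Integer) =
    (first_c + 1, first_z + 1, first_t + 1, position)

# First plane stored in the second file: FirstC="1" FirstT="337" FirstZ="0"
tiffdata_key(1, 0, 337, 2)  # (2, 1, 338, 2)

# Plane listed just before it: FirstC="0" FirstT="337" FirstZ="0"
tiffdata_key(0, 0, 337, 2)  # (1, 1, 338, 2), the key from the KeyError
```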

tlnagy commented 1 year ago

So this actually seems to happen because the slices are split unevenly across the two files. The first position discovers the second file, but the first position's last IFD in the first file isn't that file's last IFD, so the offset for the second file's IFDs is wrong.

tlnagy commented 1 year ago

After digging into this a bit more, this is a classic example of "clobbering." When switching to a new file, I record the largest IFD seen so far as the "offset." Importantly, this isn't guaranteed to be the largest IFD in the previous file! That's because we only look at the IFDs per position[^1], so if the first position doesn't include the final IFD in the first file, the "offset" for the second file will be wrong. Thus, if we later find a larger IFD in the first file, we need to shift all subsequent IFDs by the difference between the current, wrong offset and the new, larger, correct one.
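
To make the failure mode concrete, here is a toy model of it. It is not OMETIFF.jl's code: the plane list, the file names, and both functions are made up for illustration. The second function also uses a two-pass alternative (count the IFDs per file first) rather than the shift described above, simply because it is shorter to show.

```julia
# Toy model: each plane is (position, file, ifd), with zero-based IFD numbers
# local to their file, as in the OME-XML. Planes are visited position by position.
planes = [
    (pos = 1, file = "a.ome.tif", ifd = 0),
    (pos = 1, file = "a.ome.tif", ifd = 1),
    (pos = 1, file = "b.ome.tif", ifd = 0),  # position 1 crosses into file b
    (pos = 2, file = "a.ome.tif", ifd = 2),  # but file a holds one more IFD
    (pos = 2, file = "b.ome.tif", ifd = 1),
]

# Naive single pass: on first sight of a file, its offset is "largest global
# index seen so far + 1". That is too small whenever the previous file still
# has IFDs that have not been visited yet.
function naive_index(planes)
    offsets = Dict{String,Int}()
    index = Dict{Int,Any}()      # global IFD index => plane
    largest = -1
    for p in planes
        off = get!(offsets, p.file, largest + 1)
        g = off + p.ifd
        index[g] = p             # silently overwrites on collision
        largest = max(largest, g)
    end
    return index
end

# Two-pass alternative: derive each file's offset from the largest IFD that
# any position references in the preceding files.
function two_pass_index(planes)
    files = unique(p.file for p in planes)
    counts = Dict(f => maximum(p.ifd for p in planes if p.file == f) + 1 for f in files)
    offsets = Dict{String,Int}()
    total = 0
    for f in files
        offsets[f] = total
        total += counts[f]
    end
    return Dict(offsets[p.file] + p.ifd => p for p in planes)
end

length(naive_index(planes))     # 4: two planes landed on global index 2
length(two_pass_index(planes))  # 5: every plane keeps its own index
```

In the naive version, the first plane of `b.ome.tif` and the last plane of `a.ome.tif` both map to global index 2, so one of them is lost. This has the same shape as the border slice going missing in this issue.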

[^1]: Whether this is a good idea at all is worth looking into.