Closed elyall closed 1 year ago
Never mind, it appears those frames do not actually exist: when they do get returned, they are only zero arrays. In that case the metadata is in fact wrong, as it promises frames that don't exist.
The issue is that the file only contains 16 pages/IFDs while the metadata references 24. Tifffile should either return an array with zeroed frames 16-24 or revert to a generic series with 16 frames. I'll try to fix this in the next version.
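The mismatch between pages present and frames promised can be checked without any TIFF library by walking the file's IFD chain. The following is a minimal sketch, not tifffile's implementation: `count_ifds` is a hypothetical helper name, it handles classic TIFF only (not BigTIFF), and it counts only top-level IFDs, ignoring SubIFDs.

```python
import io
import struct


def count_ifds(stream):
    """Count top-level IFDs (pages) in a classic TIFF by following the IFD chain."""
    stream.seek(0)
    header = stream.read(8)
    if len(header) < 8:
        raise ValueError("file too short to be a TIFF")

    # The first two bytes give the byte order: "II" little-endian, "MM" big-endian.
    if header[:2] == b"II":
        endian = "<"
    elif header[:2] == b"MM":
        endian = ">"
    else:
        raise ValueError("not a TIFF file")

    # Magic number 42 marks classic TIFF; the next 4 bytes are the first IFD offset.
    magic, offset = struct.unpack(endian + "HI", header[2:8])
    if magic != 42:
        raise ValueError("only classic TIFF (magic 42) is handled in this sketch")

    count = 0
    seen = set()
    while offset != 0:
        if offset in seen:
            raise ValueError("IFD chain loops back on itself")
        seen.add(offset)
        stream.seek(offset)
        # Each IFD is: 2-byte entry count, 12 bytes per entry, 4-byte next-IFD offset.
        (n_entries,) = struct.unpack(endian + "H", stream.read(2))
        stream.seek(n_entries * 12, io.SEEK_CUR)
        (offset,) = struct.unpack(endian + "I", stream.read(4))
        count += 1
    return count
```

Called as `count_ifds(open(path, "rb"))`, the result can be compared with the frame count the OME metadata claims. For the file discussed here one would expect 16 rather than 24, assuming it is a classic TIFF.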
I would have saved myself plenty of time by just examining the whole stack. Conda had installed a 2020 version of tifffile, but when I updated to the latest version the warning `OME series expected 24 frames, got 16` helped me realize the issue.
I could see an argument for returning empty frames, but in this case it caused me confusion that the two methods for accessing the data didn't match up. It's up to you whether you consider it a bug or not.
Should be fixed in v2023.7.4.
First, thanks for your amazing work.
I have a 24 page tiff (download link) that I am trying to lazy load pages from. Both `imageio`'s `tifffile_v3` plugin and `dask_image` (which both use `tifffile`) fail to load pages with indices >15, both returning an `IndexError: list index out of range`. It appears `tif.pages` for these files only contains the first 16 pages.
Is this meant to happen? Maybe it's an issue with the file's metadata?
More context should it be helpful:
This issue only occurs when you specify the index of the page to load; using any of the three packages to read the whole file returns the whole stack.
The file is from a Cytena CellcyteX. The stack contains longitudinal images of an individual field that I concatenate with slices from other fields to produce a full image. I parallelize pre-processing using ray such that an individual task loads a specific slice from multiple files, concatenates them, and processes the complete frame.
Until now I've been processing single page tifs and my pattern has been to wrap `imageio.imread` in a `dask.array.from_delayed` call, resulting in each file getting mapped to a single task. For multipage tifs I understand this pattern will result in each file being opened and closed many times, but the benefit I see is that all pages will not have to be loaded at the same time on one machine, nor will slices then have to be pickled to be passed between tasks which are often on separate machines. But I'm also open to feedback/suggestions.
Edit: updated tifffile to latest version and re-tested
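For multipage files, the one-task-per-page pattern described above might look roughly like the sketch below. It calls `tifffile.imread` directly rather than going through `imageio`. `lazy_tiff_stack` is a hypothetical helper name, and the sketch assumes every page has the same shape and dtype. It takes the page count from `len(tif.pages)`, i.e. the pages actually present in the file, instead of trusting the frame count in the metadata.

```python
import dask
import dask.array as da
import tifffile


def lazy_tiff_stack(path):
    """Build a lazy (pages, ...) dask array with one task per TIFF page."""
    # Open the file once to read its structure only; no pixel data is loaded here.
    with tifffile.TiffFile(path) as tif:
        n_pages = len(tif.pages)       # pages physically present in the file
        shape = tif.pages[0].shape     # assumption: all pages share this shape
        dtype = tif.pages[0].dtype     # assumption: all pages share this dtype

    # Each delayed call reopens the file and reads a single page by index.
    read_page = dask.delayed(tifffile.imread)
    pages = [
        da.from_delayed(read_page(path, key=i), shape=shape, dtype=dtype)
        for i in range(n_pages)
    ]
    return da.stack(pages, axis=0)
```

Indexing the result, e.g. `lazy_tiff_stack(path)[3].compute()`, should then read only that one page. The file is reopened for every page, which is the trade-off already noted above.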