Open sbesson opened 4 years ago
This issue has been mentioned on Image.sc Forum. There might be relevant details there:
https://forum.image.sc/t/cropping-large-pyramidal-tiffs-with-bfconvert-crashes/44857/2
This is a situation where I think the best solution is at the model level, as detailed below.
You don't want to change any of the raw tile data, particularly when lossy compression is used. In order to correctly preserve the data, you could treat the crop region as a bounding box, and select all tiles which intersect the region for retention. All non-intersecting tiles can be discarded. This will handle overlapping tiles at all resolution levels. In order to correctly set the crop region for rendering and the image size, you will need a separate bit of metadata in Pixels (or attached to Pixels via a reference) which describes the bounding box, for example an x+y offset and an x+y size. It would also generalise to z and t, and potentially c as well.
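To make the tile-retention idea concrete, here is a minimal sketch in Python. It assumes a regular tile grid aligned to the image origin and a 2D crop; `Box`, `intersecting_tiles` and `offset_in_retained_data` are hypothetical names for illustration, not part of Bio-Formats or the OME model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned region in pixel coordinates (hypothetical helper type)."""
    x: int
    y: int
    width: int
    height: int


def intersecting_tiles(crop, image_width, image_height, tile_width, tile_height):
    """Return (column, row) indices of every tile that overlaps the crop box."""
    # Clip the crop to the image; the upper bounds are exclusive.
    x0 = max(crop.x, 0)
    y0 = max(crop.y, 0)
    x1 = min(crop.x + crop.width, image_width)
    y1 = min(crop.y + crop.height, image_height)
    if x0 >= x1 or y0 >= y1:
        return []  # the crop does not touch the image at all
    first_col, last_col = x0 // tile_width, (x1 - 1) // tile_width
    first_row, last_row = y0 // tile_height, (y1 - 1) // tile_height
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


def offset_in_retained_data(crop, tile_width, tile_height):
    """Offset of the crop origin inside the retained tiles.

    Assumes crop.x and crop.y are non-negative. This is the x+y offset that
    would have to be stored as metadata next to the crop size.
    """
    return (crop.x % tile_width, crop.y % tile_height)


crop = Box(x=300, y=100, width=500, height=300)
tiles = intersecting_tiles(crop, 2048, 1024, 256, 256)
offset = offset_in_retained_data(crop, 256, 256)
```

The retained tiles are copied untouched, so no recompression happens; only the stored offset and size tell a viewer which part of the retained data to render.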
With this metadata, you would then have two sizes: the render/viewport size and the underlying data size. Tools will need to select the region to work with. bfconvert would be able to use this when cropping. It will also permit repeated cropping and each time will downscale the region and drop any tiles no longer intersecting.
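Repeated cropping could then be a matter of composing bounding boxes rather than touching pixel data. The sketch below is one possible reading of that idea, assuming each new crop is given relative to the current viewport and is clipped to it; `Box` and `compose_crop` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned region in full-resolution pixel coordinates."""
    x: int
    y: int
    width: int
    height: int


def compose_crop(viewport, crop):
    """Apply `crop` (relative to `viewport`) and return the new viewport,
    expressed in the coordinates of the underlying data."""
    # Translate the relative crop into absolute coordinates.
    x0 = viewport.x + crop.x
    y0 = viewport.y + crop.y
    x1 = x0 + crop.width
    y1 = y0 + crop.height
    # Clip to the current viewport so that a crop can never grow the region.
    x0 = max(x0, viewport.x)
    y0 = max(y0, viewport.y)
    x1 = min(x1, viewport.x + viewport.width)
    y1 = min(y1, viewport.y + viewport.height)
    if x0 >= x1 or y0 >= y1:
        raise ValueError("crop region lies outside the current viewport")
    return Box(x0, y0, x1 - x0, y1 - y0)


data = Box(0, 0, 4096, 4096)                         # underlying data size
first = compose_crop(data, Box(1000, 500, 2000, 2000))
second = compose_crop(first, Box(100, 100, 500, 500))
```

After each step, any tile that no longer intersects the new viewport can be dropped, while the viewport itself stays in the coordinate system of the original data.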
One complication to consider is that you will need to also store the original size (best) / scale factor (lossy) of each resolution level, so that the bounding region can be correctly computed at each level. We currently infer this from e.g. TIFF SUBIFDs and this will no longer be sufficient: the subresolution sizes will depend upon the tile size at each level (which can vary), so it will be impossible to accurately infer anything. We previously discussed adding full modelling of subresolutions into the data model; you'll need it for this to work properly. Might be worth considering having multiple Pixels elements or SubResolutionPixels within Pixels which will inherit all of the settings but have a different resolution and tile size. Might also be worth modelling the tile size/strip size as well.
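As a rough illustration of why the original per-level size matters, the sketch below maps a full-resolution bounding box onto a sub-resolution using the stored original sizes. `Level` and `box_at_level` are hypothetical names, and rounding the origin down and the far edge up is an assumption chosen so that the mapped box never loses pixels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    """Original (pre-crop) size of one resolution level; hypothetical type."""
    original_width: int
    original_height: int


def box_at_level(box, full, level):
    """Map an (x, y, width, height) box from full resolution onto `level`."""
    x, y, width, height = box
    # Integer arithmetic avoids floating-point rounding surprises.
    x0 = (x * level.original_width) // full.original_width
    y0 = (y * level.original_height) // full.original_height
    # -(-a // b) is ceiling division for positive b.
    x1 = -(-((x + width) * level.original_width) // full.original_width)
    y1 = -(-((y + height) * level.original_height) // full.original_height)
    # Never extend past the level itself.
    x1 = min(x1, level.original_width)
    y1 = min(y1, level.original_height)
    return (x0, y0, x1 - x0, y1 - y0)


levels = [Level(4096, 4096), Level(2048, 2048), Level(1024, 1024)]
crop = (1001, 2000, 513, 513)
per_level = [box_at_level(crop, levels[0], level) for level in levels]
```

With 512-pixel tiles at both of the first two levels, the retained tiles for this crop are 1024 pixels wide at level 0 (columns 1 to 2) and also 1024 pixels wide at level 1 (columns 0 to 1). Inferring the scale from the retained data sizes would therefore give 1 instead of 2, which is why the original size has to be stored.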
Note: this would also be useful for bounding deconvolution artefacts; though you might want a different name to describe that. Perhaps a general annotation which can be used for multiple purposes would fit these needs?
One other consideration. You might want to think about whether the bounding region should be in pixels at the full resolution, or in physical units.
Another consideration. You might also want to think about having multiple bounding regions. For example, this would be useful for modelling WSI overview scans with regions linking to detailed scans.
See https://forum.image.sc/t/cropping-large-pyramidal-tiffs-with-bfconvert-crashes/44857
The issue can be reproduced using a pyramidal fake image and trying to convert and crop it
With Bio-Formats 6.5.1, this will result in a FormatException when checking for the tile size as soon as a resolution of size smaller than the crop size is reached. One of the underlying issues is that the cropped region is never adjusted at each resolution scale. This is not a simple scenario: it involves computing the downsampling scale as well as handling rounding errors. The current workaround is to first crop the largest resolution "only", then convert it into a pyramidal image by recomputing the resolution levels.
A variation of this issue had been reported in https://forum.image.sc/t/bfconvert-throwing-loci-formats-formatexception-for-a-czi-file/37478 where the number of requested resolutions is larger than the number of resolutions in the image but the downsampling is inconsistent.
As a general rule, it feels like for multi-resolution images, we should only support one of the two cases:
Two additional issues specifically related to cropping are the modification of any positional metadata or regions of interest, as well as the absence of any record of the crop region in the metadata.
For the upcoming release, we should probably define a set of recommended or disallowed scenarios, and at a minimum fail fast with an informative message for some combinations of options and image metadata.
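A fail-fast check could look roughly like the sketch below. The specific rules are illustrative assumptions, not an agreed policy: the function names, the choice of which combinations to reject, and the message wording are all hypothetical, and the code is Python pseudologic rather than Bio-Formats API.

```python
def consistent_downsampling(level_sizes):
    """True if every level is the previous one divided by the same integer
    scale, allowing for floor or ceil rounding of the sizes."""
    if len(level_sizes) < 2:
        return True
    scale = round(level_sizes[0][0] / level_sizes[1][0])
    if scale < 2:
        return False
    for (w, h), (next_w, next_h) in zip(level_sizes, level_sizes[1:]):
        if next_w not in (w // scale, -(-w // scale)):
            return False
        if next_h not in (h // scale, -(-h // scale)):
            return False
    return True


def check_conversion(level_sizes, crop=None, requested_resolutions=None):
    """Raise ValueError early instead of failing midway through a conversion.

    level_sizes: [(width, height), ...], largest resolution first.
    crop: optional (x, y, width, height) at full resolution.
    """
    if not level_sizes:
        raise ValueError("image reports no resolution levels")
    full_w, full_h = level_sizes[0]
    if crop is not None:
        x, y, w, h = crop
        if x < 0 or y < 0 or w <= 0 or h <= 0 or x + w > full_w or y + h > full_h:
            raise ValueError(
                f"crop {crop} does not fit inside the {full_w}x{full_h} image")
        # Assumed rule: the crop is not rescaled per level, so reject it
        # whenever a sub-resolution is smaller than the crop.
        for index, (lw, lh) in enumerate(level_sizes[1:], start=1):
            if w > lw or h > lh:
                raise ValueError(
                    f"crop size {w}x{h} exceeds resolution {index} ({lw}x{lh}); "
                    "crop the largest resolution only, then rebuild the pyramid")
    if requested_resolutions is not None and len(level_sizes) > 1:
        if requested_resolutions > len(level_sizes):
            raise ValueError(
                f"{requested_resolutions} resolutions requested but the image "
                f"only has {len(level_sizes)}")
        if not consistent_downsampling(level_sizes):
            raise ValueError("existing resolutions use inconsistent downsampling")
```

Running such a check before any pixel data is read would turn the late FormatException into an immediate message that names the offending option.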