ome / bioformats

Bio-Formats is a Java library for reading and writing data in life sciences image file formats. It is developed by the Open Microscopy Environment. Bio-Formats is released under the GNU General Public License (GPL); commercial licenses are available from Glencoe Software.
https://www.openmicroscopy.org/bio-formats
GNU General Public License v2.0

PyramidOMETiffWriter `close()` duration scales as (number of planes)^2 #4204

Open NicoKiaru opened 4 days ago

NicoKiaru commented 4 days ago

Hello,

Original post: https://forum.image.sc/t/pyramidometiffwriter-really-bad-performance-when-closing-the-writer/95294

There is a severe performance issue when saving an OME-TIFF file with PyramidOMETiffWriter: the duration of the close() method scales as the square of the number of planes.

I saved a 512x512 pixel image on my computer, doubling the number of timepoints (nT) at each run. For each run I measured the total export time and split it into the time spent before calling close() (data writing) and the duration of close() itself:

Export time (s) = Data writing (s) + `close()` duration (s)

| nPixX | nPixY | nT | Export time (s) | `close()` duration (s) | Data writing (s) | `close()` share of export time (%) |
|---|---|---|---|---|---|---|
| 512 | 512 | 1 | 0.5 | 0.2 | 0.4 | 32 |
| 512 | 512 | 2 | 0.7 | 0.2 | 0.5 | 27 |
| 512 | 512 | 4 | 0.7 | 0.2 | 0.5 | 32 |
| 512 | 512 | 8 | 1.0 | 0.3 | 0.7 | 30 |
| 512 | 512 | 16 | 1.4 | 0.4 | 1.0 | 29 |
| 512 | 512 | 32 | 2.1 | 0.7 | 1.4 | 31 |
| 512 | 512 | 64 | 3.8 | 1.7 | 2.2 | 43 |
| 512 | 512 | 128 | 8.9 | 5.2 | 3.7 | 59 |
| 512 | 512 | 256 | 20.2 | 14.4 | 5.8 | 71 |
| 512 | 512 | 512 | 64.0 | 51.6 | 12.5 | 80 |
| 512 | 512 | 1024 | 290.4 | 262.6 | 27.8 | 90 |
| 16384 | 16384 | 1 | 86.2 | 0.08 | 86.1 | 0.1 |

The data writing time scales roughly linearly with the number of timepoints, which is what we expect. However, once the number of timepoints is large, the close time is multiplied by roughly 4 each time the number of timepoints is doubled, i.e. it grows quadratically.
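As a rough check of that scaling claim, the sketch below estimates a local exponent for each doubling of nT as log2(t(2n) / t(n)), using only the timings from the table above. The class and method names are made up for this illustration and are not part of Bio-Formats.

```java
/** Illustrative only: estimates scaling exponents from the timings reported above. */
class ScalingExponentSketch {

  /** Local exponent k such that t is proportional to n^k across one doubling of n. */
  static double localExponent(double secondsBefore, double secondsAfter) {
    return Math.log(secondsAfter / secondsBefore) / Math.log(2.0);
  }

  public static void main(String[] args) {
    // Last four rows of the 512x512 measurements (values copied from the table).
    int[] nT = {128, 256, 512, 1024};
    double[] closeSeconds = {5.2, 14.4, 51.6, 262.6};
    double[] writeSeconds = {3.7, 5.8, 12.5, 27.8};

    for (int i = 1; i < nT.length; i++) {
      System.out.printf(
          "nT %d -> %d: close exponent %.2f, write exponent %.2f%n",
          nT[i - 1], nT[i],
          localExponent(closeSeconds[i - 1], closeSeconds[i]),
          localExponent(writeSeconds[i - 1], writeSeconds[i]));
    }
  }
}
```

An exponent near 1 indicates linear growth and an exponent near 2 indicates quadratic growth; fixed overheads at small nT pull the estimate down, which is why only the largest rows are used here.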

As a result, 'close' takes 90% of the total export time when there are 1024 timepoints.

I've narrowed the issue down to the call to TiffParser#getIFDOffsets, which is made for every plane. Each call walks the whole file from the start, so the total work grows with the square of the number of planes.
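To illustrate why this pattern is quadratic, here is a toy model of a chain of IFDs in which each IFD only knows where the next one is. It is not Bio-Formats code: `buildChain`, `walkChain` and the two counting methods are hypothetical names for this sketch, and scanning once then reusing the offsets is only one possible mitigation, assumed here for comparison.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of an IFD chain; does not use any Bio-Formats classes. */
class IfdWalkSketch {

  /** Builds a chain where IFD i points to IFD i + 1 and 0 marks the end. */
  static long[] buildChain(int nPlanes) {
    long[] next = new long[nPlanes + 1];
    for (int i = 1; i < nPlanes; i++) {
      next[i] = i + 1;
    }
    return next; // next[nPlanes] stays 0, terminating the chain
  }

  /** Follows the chain from the first IFD, counting every IFD visited. */
  static List<Long> walkChain(long[] next, long[] visits) {
    List<Long> offsets = new ArrayList<>();
    long offset = next.length > 1 ? 1 : 0; // an empty chain has no first IFD
    while (offset != 0) {
      offsets.add(offset);
      visits[0]++;
      offset = next[(int) offset];
    }
    return offsets;
  }

  /** Pattern described in the issue: a full walk from the start for every plane. */
  static long visitsWhenRescanning(int nPlanes) {
    long[] next = buildChain(nPlanes);
    long[] visits = {0};
    for (int plane = 0; plane < nPlanes; plane++) {
      walkChain(next, visits).get(plane);
    }
    return visits[0];
  }

  /** Assumed alternative: walk once, then reuse the collected offsets. */
  static long visitsWhenScanningOnce(int nPlanes) {
    long[] next = buildChain(nPlanes);
    long[] visits = {0};
    List<Long> offsets = walkChain(next, visits);
    for (int plane = 0; plane < nPlanes; plane++) {
      offsets.get(plane);
    }
    return visits[0];
  }

  public static void main(String[] args) {
    for (int n : new int[] {256, 512, 1024}) {
      System.out.println(n + " planes: rescanning visits " + visitsWhenRescanning(n)
          + " IFDs, scanning once visits " + visitsWhenScanningOnce(n));
    }
  }
}
```

In this model, rescanning visits n × n IFDs for n planes, while scanning once visits n. Whether offsets can safely be reused inside the real writer depends on how the file changes between calls, which this sketch does not address.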

https://github.com/ome/bioformats/blob/7ca9980651a5e776cd79ae4bb8f558eaa27bbb27/components/formats-bsd/src/loci/formats/out/PyramidOMETiffWriter.java#L125

https://github.com/ome/bioformats/blob/7ca9980651a5e776cd79ae4bb8f558eaa27bbb27/components/formats-bsd/src/loci/formats/out/PyramidOMETiffWriter.java#L145

https://github.com/ome/bioformats/blob/7ca9980651a5e776cd79ae4bb8f558eaa27bbb27/components/formats-bsd/src/loci/formats/tiff/TiffSaver.java#L804

https://github.com/ome/bioformats/blob/7ca9980651a5e776cd79ae4bb8f558eaa27bbb27/components/formats-bsd/src/loci/formats/tiff/TiffParser.java#L362-L363

Is there a way to fix this?


This issue could be related to these other issues:

https://github.com/ome/bioformats/issues/3983

https://github.com/ome/bioformats/issues/3480

imagesc-bot commented 4 days ago

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/pyramidometiffwriter-really-bad-performance-when-closing-the-writer/95294/10