knime-ip / knip

KNIME Image Processing Extension
https://www.knime.com/community/image-processing

Global Thresholder generates large filestore #501

Closed stelfrich closed 5 years ago

stelfrich commented 6 years ago

When a workflow with a Global Thresholder is saved with its data, the created filestore/ folder can be up to twice the size of the input images.

Hypothesis: this might be due to a suboptimal serialization of BitType images.

imagejan commented 6 years ago

See also https://github.com/imglib/imglib2/pull/217 where this was fixed. Thanks, @gab1one! Now it's just a question of when KNIP will start using the relevant ImgLib2 release as a dependency, I guess.

gab1one commented 6 years ago

We (@stelfrich, @awalter17) discovered another cause for this: we are serializing color tables, which are created programmatically per plane, creating an overhead of 3 times 65536 shorts per plane. This is only slightly noticeable on images with few planes, but is dramatic on volumetric or time-series images with many small-ish planes. In those cases the overhead can be many times larger than the actual intensity values. We need to investigate how we can get rid of this.
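To put a rough number on that overhead, here is a minimal back-of-the-envelope sketch. It assumes raw, uncompressed sizes (2 bytes per short, no serialization headers), and the class and method names are hypothetical, not part of KNIP or ImgLib2.

```java
// Hypothetical helper, not part of KNIP or ImgLib2.
// Assumes raw, uncompressed sizes and ignores any serialization headers.
final class ColorTableOverhead {
    static final int CHANNELS = 3;        // "3 times ..." from the comment above
    static final int ENTRIES = 65536;     // entries per channel
    static final int BYTES_PER_SHORT = 2;

    /** Raw size of one per-plane color table in bytes. */
    static long colorTableBytesPerPlane() {
        return (long) CHANNELS * ENTRIES * BYTES_PER_SHORT;
    }

    /** Raw size of the pixel data of one plane in bytes, rounded up to whole bytes. */
    static long pixelBytesPerPlane(int width, int height, int bitsPerPixel) {
        long bits = (long) width * height * bitsPerPixel;
        return (bits + 7) / 8;
    }

    /** How many times larger the color table is than the plane's pixel data. */
    static double overheadRatio(int width, int height, int bitsPerPixel) {
        return (double) colorTableBytesPerPlane()
                / pixelBytesPerPlane(width, height, bitsPerPixel);
    }
}
```

Under these assumptions one table is 393216 bytes. For a 256 x 256 plane of 16-bit pixels (131072 bytes) that is 3 times the pixel data, and for a 256 x 256 plane stored at 1 bit per pixel (8192 bytes) it is 48 times the pixel data.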

A first idea is to use a lookup table implementation that is computed on access instead of ColorTable16, which should greatly reduce the memory footprint of such images.
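A minimal sketch of what "computed on access" could look like. It assumes the programmatically created tables are plain linear gray ramps, which is an assumption and not something confirmed in this thread. The `Lut16` interface is a hypothetical stand-in, not the ImgLib2 `ColorTable` API.

```java
// Sketch only: Lut16 and both implementations are hypothetical stand-ins.
final class ComputedLutSketch {

    /** Minimal lookup-table interface for this sketch (not the ImgLib2 API). */
    interface Lut16 {
        int getComponentCount();
        int getLength();
        int get(int component, int bin);
    }

    /** Eager variant: stores 3 x 65536 shorts, so every instance carries 393216 bytes of data. */
    static final class StoredLut16 implements Lut16 {
        private final short[][] values = new short[3][65536];

        StoredLut16() {
            for (int c = 0; c < 3; c++) {
                for (int i = 0; i < 65536; i++) {
                    values[c][i] = (short) i; // linear gray ramp (assumed)
                }
            }
        }

        @Override public int getComponentCount() { return 3; }
        @Override public int getLength() { return 65536; }
        @Override public int get(int component, int bin) {
            return values[component][bin] & 0xffff; // read the short as unsigned
        }
    }

    /** Lazy variant: no backing arrays, each value is derived from the bin index on access. */
    static final class ComputedGrayLut16 implements Lut16 {
        @Override public int getComponentCount() { return 3; }
        @Override public int getLength() { return 65536; }
        @Override public int get(int component, int bin) {
            if (component < 0 || component >= 3 || bin < 0 || bin >= 65536) {
                throw new IndexOutOfBoundsException("component=" + component + ", bin=" + bin);
            }
            return bin; // same ramp as StoredLut16, nothing to store or serialize
        }
    }
}
```

Both variants return the same values for every component and bin, but the computed one has no state, so there would be nothing to write per plane beyond the fact that a ramp is used.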

gab1one commented 6 years ago

After a deeper dive into the image metadata serialization, we discovered that we appear to be serializing the same ColorTable multiple times. I will work on a new metadata externalizer which will not have this defect.
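One way to avoid writing the same table repeatedly is to write each distinct table instance once and store only an index per plane. The following is a sketch under that assumption: the byte layout, the class name, and the use of plain `short[]` arrays in place of real ColorTable objects are all hypothetical, and this is not the externalizer that was actually implemented.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; short[] stands in for a color table, the layout is made up.
final class DedupTableWriter {

    /** Naive layout: plane count, then the full table for every plane. */
    static byte[] writeNaive(List<short[]> tablePerPlane) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(tablePerPlane.size());
            for (short[] table : tablePerPlane) {
                writeTable(out, table);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Deduplicated layout: each distinct instance once, then one table index per plane. */
    static byte[] writeDeduplicated(List<short[]> tablePerPlane) {
        // Identity-based: only catches the *same object* referenced by several planes.
        Map<short[], Integer> ids = new IdentityHashMap<>();
        List<short[]> unique = new ArrayList<>();
        int[] planeToId = new int[tablePerPlane.size()];
        for (int p = 0; p < planeToId.length; p++) {
            short[] table = tablePerPlane.get(p);
            Integer id = ids.get(table);
            if (id == null) {
                id = unique.size();
                ids.put(table, id);
                unique.add(table);
            }
            planeToId[p] = id;
        }

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(unique.size());
            for (short[] table : unique) {
                writeTable(out, table);
            }
            out.writeInt(planeToId.length);
            for (int id : planeToId) {
                out.writeInt(id);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    private static void writeTable(DataOutputStream out, short[] table) throws IOException {
        out.writeInt(table.length);
        for (short value : table) {
            out.writeShort(value);
        }
    }
}
```

Note that identity-based deduplication does not help when every plane gets its own freshly created table with equal contents; catching that case would need a content-based comparison instead.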

gab1one commented 5 years ago

@imagejan

> Now it's just a question of when KNIP will start using the relevant ImgLib2 release as a dependency, I guess.

This is the case with the latest release, 1.7.0. The next release, 1.7.1, will also include #503, which further reduces the size of images with many planes. Additionally, I am working on improving SCIFIO and SCIFIO-BF-Compat with "virtualized" ColorTables that have a greatly reduced memory footprint.