BodenmillerGroup / readimc

Python package for reading imaging mass cytometry (IMC) files
https://bodenmillergroup.github.io/readimc
MIT License

large panorama image reading #16

Open leorrose opened 1 year ago

leorrose commented 1 year ago

I have an MCD file with 3 panoramas. While trying to read the panoramas, I get an error:

---------------------------------------------------------------------------
DecompressionBombError                    Traceback (most recent call last)
File ~/.conda/envs/tnbc/lib/python3.9/site-packages/readimc/mcd_file.py:211, in MCDFile.read_panorama(self, panorama)
    210 try:
--> 211     return self._read_image(
    212         data_start_offset, data_end_offset - data_start_offset
    213     )
    214 except Exception as e:

File ~/.conda/envs/tnbc/lib/python3.9/site-packages/readimc/mcd_file.py:339, in MCDFile._read_image(self, data_offset, data_size)
    338 data = self._fh.read(data_size)
--> 339 return imread(data)

File ~/.conda/envs/tnbc/lib/python3.9/site-packages/imageio/v2.py:227, in imread(uri, format, **kwargs)
    226 with imopen(uri, "ri", **imopen_args) as file:
--> 227     result = file.read(index=0, **kwargs)
    229 return result

File ~/.conda/envs/tnbc/lib/python3.9/site-packages/imageio/core/legacy_plugin_wrapper.py:147, in LegacyPlugin.read(self, index, **kwargs)
    145     return img
--> 147 reader = self.legacy_get_reader(**kwargs)
    148 return reader.get_data(index)

File ~/.conda/envs/tnbc/lib/python3.9/site-packages/imageio/core/legacy_plugin_wrapper.py:116, in LegacyPlugin.legacy_get_reader(self, **kwargs)
    115 self._request.get_file().seek(0)
...
    217         f"MCD file '{self.path.name}' corrupted: "
    218         f"cannot read image for panorama {panorama.id}"
    219     ) from e

OSError: MCD file 'Leap004.mcd' corrupted: cannot read image for panorama 2

This is not expected behavior: if I open the file in MCD Viewer, I am able to read everything and nothing is corrupted.

Any suggestions as to what could cause this? (Unfortunately, I cannot share the data itself.)

leorrose commented 1 year ago

I think I found the problem. imageio, which readimc uses to read the image data, uses PIL as its backend. PIL limits the image size for security reasons (decompression bomb protection), which causes an error when reading very large panoramas.

The solution is to change the max pixel setting in PIL:

from PIL import Image
Image.MAX_IMAGE_PIXELS = 1000000000
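
Setting `Image.MAX_IMAGE_PIXELS` at module level changes the limit for the whole process. If the higher limit should only apply while reading panoramas, one possible approach is a small context manager that restores the previous value afterwards. This is a minimal sketch: `pixel_limit` is a hypothetical helper name, not part of PIL or readimc, and it only assumes that the object passed in has a `MAX_IMAGE_PIXELS` attribute (as `PIL.Image` does).

```python
from contextlib import contextmanager


@contextmanager
def pixel_limit(image_module, max_pixels):
    """Temporarily set image_module.MAX_IMAGE_PIXELS, then restore it.

    `image_module` is expected to be `PIL.Image`; any object with a
    MAX_IMAGE_PIXELS attribute works, which keeps the helper easy to test.
    """
    previous = image_module.MAX_IMAGE_PIXELS
    image_module.MAX_IMAGE_PIXELS = max_pixels
    try:
        yield
    finally:
        # Restore the original limit even if reading raised an exception.
        image_module.MAX_IMAGE_PIXELS = previous


# Hypothetical usage (assumes `mcd` is an open readimc MCDFile and
# `panorama` is one of its panoramas):
#
#     from PIL import Image
#     with pixel_limit(Image, 1_000_000_000):
#         img = mcd.read_panorama(panorama)
```

The limit is a pixel count, not a byte size, so the value should be chosen with the expected panorama dimensions in mind.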

It might also be wise to make the errors raised by readimc more descriptive. In this case, for example, printing the underlying error would show that the problem is the image size and not a corrupted MCD file.
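
Until then, the underlying error can be recovered on the caller's side. The traceback above shows that readimc raises its `OSError` with `from e`, so the original exception should be available via `__cause__`. A minimal sketch of walking that chain follows; `cause_chain` and `describe` are hypothetical helper names, and the commented usage assumes an open `MCDFile` called `mcd`.

```python
def cause_chain(exc):
    """Return the exception followed by its explicit causes (raise ... from ...)."""
    chain = []
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))  # guard against accidental cycles
        chain.append(exc)
        exc = exc.__cause__
    return chain


def describe(exc):
    """Format the whole chain on one line, outermost error first."""
    return " <- ".join(f"{type(e).__name__}: {e}" for e in cause_chain(exc))


# Hypothetical usage:
#
#     try:
#         img = mcd.read_panorama(panorama)
#     except OSError as e:
#         print(describe(e))
```

With the error from this issue, the last entry in the chain would be expected to name `DecompressionBombError` rather than a corruption problem.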

jwindhager commented 1 year ago

Correct, this is due to PIL's maximum image size. Unfortunately, it is not straightforward to catch this specific error in the readimc code, which is why we haven't done that. Also, imageio may switch to other backends in the future (which is out of our control). But if you have a suggestion for where or how to improve things, that would be great! PRs welcome :-)

jwindhager commented 1 year ago

Keeping this issue open as a reminder to document this

sandip-shah commented 1 year ago

One option is to catch the exception and keep increasing MAX_IMAGE_PIXELS geometrically, logging a warning each time. The limit set by PIL (about 0.24 GB) is too low for images these days. One can easily go as high as, say, 5 GB before giving up (of course, after checking that there is that much free disk space).
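
A rough sketch of that retry idea, under several assumptions: `read_with_growing_limit` and `is_decompression_bomb` are hypothetical names, the cap is expressed in pixels (PIL's setting is a pixel count, so a byte budget such as 5 GB would need converting), and the bomb error is detected by class name in the `__cause__` chain because readimc wraps it in an `OSError`. The free-space check mentioned above is not included.

```python
import logging

logger = logging.getLogger(__name__)


def is_decompression_bomb(exc):
    """True if exc, or one of its explicit causes, is a DecompressionBombError."""
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        if type(exc).__name__ == "DecompressionBombError":
            return True
        exc = exc.__cause__
    return False


def read_with_growing_limit(read, image_module, cap=5_000_000_000, factor=2):
    """Call read(), multiplying MAX_IMAGE_PIXELS by `factor` on each bomb error.

    `read` is a zero-argument callable, `image_module` is expected to be
    PIL.Image. Gives up (re-raises) once the limit has reached `cap`.
    """
    previous = image_module.MAX_IMAGE_PIXELS
    limit = previous
    try:
        while True:
            try:
                return read()
            except Exception as exc:
                # Unrelated errors, a disabled check (None), or a limit that
                # already reached the cap are not retried.
                if not is_decompression_bomb(exc) or limit is None or limit >= cap:
                    raise
                limit = min(limit * factor, cap)
                logger.warning("Raising MAX_IMAGE_PIXELS to %d and retrying", limit)
                image_module.MAX_IMAGE_PIXELS = limit
    finally:
        # Leave the process-wide setting as it was found.
        image_module.MAX_IMAGE_PIXELS = previous


# Hypothetical usage:
#
#     from PIL import Image
#     img = read_with_growing_limit(lambda: mcd.read_panorama(panorama), Image)
```

Each retry re-reads the image data, so for very large panoramas it may be cheaper to set a single generous limit up front.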