ome / bioformats

Bio-Formats is a Java library for reading and writing data in life sciences image file formats. It is developed by the Open Microscopy Environment. Bio-Formats is released under the GNU General Public License (GPL); commercial licenses are available from Glencoe Software.
https://www.openmicroscopy.org/bio-formats
GNU General Public License v2.0

VSI->TIF converted images with JPEG compression throws errors in GIMP #3372

Open ebremer opened 5 years ago

ebremer commented 5 years ago

Trying to view a VSI->TIF converted image with JPEG compression in GIMP throws the following error over and over:

Improper JPEG sampling factors 2,2 Apparently should be 1,1.

mtbc commented 5 years ago

Confirmed: Using Bio-Formats 6.0.1 and one of our own sample files:

$ bfconvert -version
Version: 6.0.1
Build date: 14 March 2019
VCS revision: f47f5e3a611a76d52548b9e22b696498fcf0e1b1
$ bfconvert -compression JPEG Image_01_V76\ K2\ 11_lung.vsi Image_01_V76\ K2\ 11_lung.tiff
...
$ gm -version
GraphicsMagick 1.4 snapshot-20181020 Q16 http://www.GraphicsMagick.org/
...
$ gm display Image_01_V76\ K2\ 11_lung.tiff 
gm display: Improper JPEG sampling factors 2,2
Apparently should be 1,1.. (JPEGPreDecode).
$

ebremer commented 4 years ago

I was playing around more with my weird image color problem in https://github.com/ome/bioformats/issues/3462, and it appears to be a CMYK/RGB color model issue. If I read the weird-colored image as CMYK rather than RGB (I'm not sure which it is, as the bytes were retrieved from a VSI file using Bio-Formats) and convert it to RGB, it looks fine. ImageIO throws errors when trying to read CMYK images; apparently it can only deal with RGB. On digging through the code, I found that ome.codecs.JPEGCodec uses ImageIO on writes. I'm still fishing around in the code, but I thought I would share the CMYK/RGB revelation as it partly solves my problem.

dgault commented 4 years ago

Thanks for the continued investigation @ebremer. We have begun digging into it on our side too, and it certainly sounds like the CMYK/RGB revelation may well be important here.

markemus commented 3 years ago

I'm having this problem as well. I tested it on a set of [.ndpi, ventana .tif, aperio .tif, .czi]. Of those, all four successfully converted with LZW compression. With JPEG, the ventana and aperio .tifs both succeeded, while the .czi and .ndpi failed.

dgault commented 3 years ago

Hi @markemus, did they fail during the conversion or was it only once you tried to open the converted images after that you noticed? Do you have the specific error messages or stack trace?

markemus commented 3 years ago

I'm calling the command line from Python 3.6.

The error occurs after the bioformats conversion completes, apparently successfully. I'm calling tiffcp on the output file after the bioformats conversion has completed in order to create a tif with a contiguous planar configuration (bioformats uses "separate" planar configuration).

Stack trace for the czi:

JPEGPreDecode: Improper JPEG sampling factors 2,2
Apparently should be 1,1..
/home/markemus/data/dev/mixed_types/2020_11_22__17_01__0165_temp.tif: Error, can't read tile at 0 0.
ERROR:/home/markemus/dev/converttiff/cvt_folder.py:Command '['tiffcp', '-8', '-p', 'contig', '/home/markemus/data/dev/mixed_types/2020_11_22__17_01__0165_temp.tif,0', '/home/markemus/data/dev/mixed_types/2020_11_22__17_01__0165_mimic.tif']' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/home/markemus/dev/converttiff/cvt_folder.py", line 62, in convert
    subprocess.check_call(["tiffcp", "-8", "-p", "contig", temppath+",0", outpath])
  File "/usr/lib/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['tiffcp', '-8', '-p', 'contig', '/home/markemus/data/dev/mixed_types/2020_11_22__17_01__0165_temp.tif,0', '/home/markemus/data/dev/mixed_types/2020_11_22__17_01__0165_mimic.tif']' returned non-zero exit status 1.

The command for bioformats was:

subprocess.check_call(["bftools/bfconvert", "-series", str(main_idx), "-pyramid-resolutions", str(pyramid_resolutions), "-pyramid-scale", str(pyramid_scale), "-bigtiff", "-compression", compression, path, temppath])

(Side note: the "-8" flag enables bigtiff for tiffcp, but is undocumented)

ebremer commented 3 years ago

This issue came back to haunt me. I'm trying to save TIFF images with Bio-Formats 6.6.1. When I try to read the images with GIMP 2.10.24 rev 3, it throws this error and the resulting loaded image is all black.

dgault commented 3 years ago

Thanks @ebremer, I will need to do some deeper investigation into this. I do believe it is due to using ImageIO with CMYK data, so we may need to either find an alternative or detect this case and convert to RGB beforehand.

ebremer commented 3 years ago

My data in this case isn't CMYK, but RGB.

dgault commented 3 years ago

Ok, that sounds like we have a more widespread problem with the JPEG compression.

imagesc-bot commented 3 years ago

This issue has been mentioned on Image.sc Forum. There might be relevant details there:

https://forum.image.sc/t/bfconvert-specifying-chunky-or-planar/56059/4

ebremer commented 3 years ago

The software I am using that throws this error (GIMP and others) all traces back to libtiff, specifically to these lines of code: https://gitlab.com/libtiff/libtiff/-/blob/v4.3.0/libtiff/tif_jpeg.c#L1247-1274
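
As far as I can tell, the libtiff check linked above compares the sampling factors declared in each tile's JPEG stream with what the TIFF tags imply (1,1 unless the image is tagged as subsampled YCbCr). Below is a minimal Python sketch, not libtiff's implementation, that reads those factors from a raw JPEG stream so a tile can be checked by hand. The name jpeg_sampling_factors is hypothetical, and extracting the tile bytes from the TIFF is assumed to happen elsewhere.

```python
import struct

# Start-of-frame markers (SOF0..SOF15, minus DHT=0xC4, JPG=0xC8, DAC=0xCC).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}
# Markers that carry no length field: TEM, SOI, EOI, RST0..RST7.
STANDALONE = {0x01, 0xD8, 0xD9} | set(range(0xD0, 0xD8))


def jpeg_sampling_factors(jpeg):
    """Return [(component_id, h_factor, v_factor), ...] from the frame header.

    `jpeg` is a bytes object holding one JPEG stream (hypothetical input:
    for example, the raw bytes of a single TIFF tile).
    """
    pos = 0
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("expected a JPEG marker at offset %d" % pos)
        marker = jpeg[pos + 1]
        if marker == 0xFF:  # fill byte before a marker
            pos += 1
            continue
        if marker in STANDALONE:
            pos += 2
            continue
        (length,) = struct.unpack_from(">H", jpeg, pos + 2)
        if marker in SOF_MARKERS:
            # Layout after the length: precision (1), height (2), width (2),
            # component count (1), then 3 bytes per component.
            ncomp = jpeg[pos + 9]
            first = pos + 10
            return [(jpeg[p], jpeg[p + 1] >> 4, jpeg[p + 1] & 0x0F)
                    for p in range(first, first + 3 * ncomp, 3)]
        if marker == 0xDA:  # start of scan: entropy-coded data follows
            break
        pos += 2 + length  # skip marker bytes plus the segment
    raise ValueError("no frame header found")
```

A stream whose first component reports 2,2 inside a TIFF that is tagged as RGB would be consistent with the "Improper JPEG sampling factors 2,2" message.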

ebremer commented 3 years ago

Okay, I got this to work. I needed my images to be chunky and not planar. Once the data was in the right format, though, the above error would be thrown by libtiff-based software. I dug a little further and noticed some discrepancies between Bio-Formats TIFFs and libtiff-generated TIFFs. I had to change two TIFF tags in the Bio-Formats code:

1) SAMPLE_FORMAT was being stored as a single 1, but needed to be three 1's: int[] sample_format = {1,1,1}. The spec https://www.awaresystems.be/imaging/tiff/tifftags/sampleformat.html says that it's a "1" for each sample. With a chunky format, Bio-Formats only seemed to save a single 1 instead of three.

2) PHOTOMETRIC_INTERPRETATION was being set to RGB (2). libtiff was setting it to 6 for the YCbCr color space, which I'm guessing is what happens to the RGB data after JPEG compression. On changing this to 6 in the Bio-Formats code, Photoshop, GIMP, and a libtiff tiling engine were all able to read the Bio-Formats TIFF output.
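
To spot these two discrepancies in an existing file without patching anything, here is a rough Python sketch that reads a handful of tags from the first IFD of a TIFF or BigTIFF and flags the combinations described above. It uses only the standard library; the function names read_first_ifd_tags and jpeg_tag_warnings are hypothetical, and the warnings only mark tag combinations worth a closer look, not proof that a file is broken.

```python
import struct

# Tag ids and field types from the TIFF 6.0 / BigTIFF specifications.
TAGS = {258: "BitsPerSample", 259: "Compression",
        262: "PhotometricInterpretation", 277: "SamplesPerPixel",
        284: "PlanarConfiguration", 339: "SampleFormat"}
TYPES = {3: ("H", 2), 4: ("I", 4), 16: ("Q", 8)}  # SHORT, LONG, LONG8


def read_first_ifd_tags(data):
    """Return {tag_name: tuple_of_values} for the first IFD in `data`."""
    order = {b"II": "<", b"MM": ">"}[bytes(data[:2])]
    (magic,) = struct.unpack_from(order + "H", data, 2)
    if magic == 42:    # classic TIFF: 4-byte offsets, 12-byte entries
        (ifd,) = struct.unpack_from(order + "I", data, 4)
        count_fmt, entry_fmt, off_fmt = "H", "HHI", "I"
    elif magic == 43:  # BigTIFF: 8-byte offsets, 20-byte entries
        (ifd,) = struct.unpack_from(order + "Q", data, 8)
        count_fmt, entry_fmt, off_fmt = "Q", "HHQ", "Q"
    else:
        raise ValueError("not a TIFF file")
    inline = struct.calcsize(order + off_fmt)  # bytes in the value field
    entry_size = struct.calcsize(order + entry_fmt) + inline
    (n,) = struct.unpack_from(order + count_fmt, data, ifd)
    pos = ifd + struct.calcsize(order + count_fmt)
    tags = {}
    for i in range(n):
        base = pos + i * entry_size
        tag, ftype, count = struct.unpack_from(order + entry_fmt, data, base)
        if tag not in TAGS or ftype not in TYPES:
            continue
        code, size = TYPES[ftype]
        value_pos = base + entry_size - inline
        if count * size > inline:  # too big to store inline: field is an offset
            (value_pos,) = struct.unpack_from(order + off_fmt, data, value_pos)
        tags[TAGS[tag]] = struct.unpack_from(order + str(count) + code,
                                             data, value_pos)
    return tags


def jpeg_tag_warnings(tags):
    """Flag the two tag combinations discussed in this thread."""
    warnings = []
    spp = tags.get("SamplesPerPixel", (1,))[0]
    photometric = tags.get("PhotometricInterpretation", (None,))[0]
    # Compression 7 = JPEG, PhotometricInterpretation 2 = RGB, 6 = YCbCr.
    if tags.get("Compression", (1,))[0] == 7 and spp == 3 and photometric == 2:
        warnings.append("JPEG data tagged as RGB (2); libtiff writes YCbCr (6)")
    fmt = tags.get("SampleFormat")
    if fmt is not None and len(fmt) != spp:
        warnings.append("SampleFormat has %d value(s) for %d samples"
                        % (len(fmt), spp))
    return warnings
```

The reader takes any bytes-like object, so for large whole-slide files it may be more practical to pass an mmap of the file than to read the whole file into memory.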