libvips / pyvips

python binding for libvips using cffi
MIT License

Reading in image a second time breaks on qptiff file #453

Open idc9 opened 9 months ago

idc9 commented 9 months ago

Reading in an image from a qptiff file breaks when I run the same code twice.

If I run the code below once, it works. If I run it again (I'm using a Jupyter notebook), it throws an error. If I run it a third time, it throws a different error.

I'm not able to post the image publicly but I've emailed it to @jcupitt.

import pyvips
import matplotlib.pyplot as plt
fpath = 'myfile.qptiff'

page_idx = 17
location = (0, 0)
size = (375, 483)  # this is the entire image

# read patch via crop
vips_image = pyvips.Image.new_from_file(vips_filename=fpath, page=page_idx)
patch_crop = vips_image.crop(location[0], location[1], size[0], size[1])
patch_crop = patch_crop.numpy()
plt.imshow(patch_crop)

Error message after second call

---------------------------------------------------------------------------
Error                                     Traceback (most recent call last)
Cell In[4], line 11
      9 vips_image = pyvips.Image.new_from_file(vips_filename=fpath, page=page_idx)
     10 patch_crop = vips_image.crop(location[0], location[1], size[0], size[1])
---> 11 patch_crop = patch_crop.numpy()# [:, :, 0:3]
     12 plt.imshow(patch_crop)

File ~/anaconda3/envs/cpath/lib/python3.8/site-packages/pyvips/vimage.py:1273, in Image.numpy(self, dtype)
   1252 def numpy(self, dtype=None):
   1253     """Convenience function to allow numpy conversion to be at the end
   1254     of a method chain.
   1255 
   (...)
   1271           strings to numpy dtype strings.
   1272     """
-> 1273     return self.__array__(dtype=dtype)

File ~/anaconda3/envs/cpath/lib/python3.8/site-packages/pyvips/vimage.py:1234, in Image.__array__(self, dtype)
   1208 """Conversion to a NumPy array.
   1209 
   1210 Args:
   (...)
   1229 See Also `Image.new_from_array` for the inverse operation. #TODO
   1230 """
   1231 import numpy as np
   1233 arr = (
-> 1234     np.frombuffer(self.write_to_memory(),
   1235                   dtype=FORMAT_TO_TYPESTR[self.format])
   1236     .reshape(self.height, self.width, self.bands)
   1237 )
   1239 if self.bands == 1:
   1240     # flatten single-band images
   1241     arr = arr.squeeze(axis=-1)

File ~/anaconda3/envs/cpath/lib/python3.8/site-packages/pyvips/vimage.py:944, in Image.write_to_memory(self)
    942 pointer = vips_lib.vips_image_write_to_memory(self.pointer, psize)
    943 if pointer == ffi.NULL:
--> 944     raise Error('unable to write to memory')
    945 pointer = ffi.gc(pointer, glib_lib.g_free)
    947 return ffi.buffer(pointer, psize[0])

Error: unable to write to memory
  tiff2vips: out of order read -- at line 483, but line 0 requested

Error message after third call

---------------------------------------------------------------------------
Error                                     Traceback (most recent call last)
Cell In[5], line 11
      9 vips_image = pyvips.Image.new_from_file(vips_filename=fpath, page=page_idx)
     10 patch_crop = vips_image.crop(location[0], location[1], size[0], size[1])
---> 11 patch_crop = patch_crop.numpy()
     12 plt.imshow(patch_crop)

File ~/anaconda3/envs/cpath/lib/python3.8/site-packages/pyvips/vimage.py:1273, in Image.numpy(self, dtype)
   1252 def numpy(self, dtype=None):
   1253     """Convenience function to allow numpy conversion to be at the end
   1254     of a method chain.
   1255 
   (...)
   1271           strings to numpy dtype strings.
   1272     """
-> 1273     return self.__array__(dtype=dtype)

File ~/anaconda3/envs/cpath/lib/python3.8/site-packages/pyvips/vimage.py:1234, in Image.__array__(self, dtype)
   1208 """Conversion to a NumPy array.
   1209 
   1210 Args:
   (...)
   1229 See Also `Image.new_from_array` for the inverse operation. #TODO
   1230 """
   1231 import numpy as np
   1233 arr = (
-> 1234     np.frombuffer(self.write_to_memory(),
   1235                   dtype=FORMAT_TO_TYPESTR[self.format])
   1236     .reshape(self.height, self.width, self.bands)
   1237 )
   1239 if self.bands == 1:
   1240     # flatten single-band images
   1241     arr = arr.squeeze(axis=-1)

File ~/anaconda3/envs/cpath/lib/python3.8/site-packages/pyvips/vimage.py:944, in Image.write_to_memory(self)
    942 pointer = vips_lib.vips_image_write_to_memory(self.pointer, psize)
    943 if pointer == ffi.NULL:
--> 944     raise Error('unable to write to memory')
    945 pointer = ffi.gc(pointer, glib_lib.g_free)
    947 return ffi.buffer(pointer, psize[0])

Error: unable to write to memory

System details

import platform; print(platform.platform())
import sys; print('python', sys.version)
import pyvips; print('pyvips', pyvips.__version__)
import openslide; print('openslide', openslide.__version__)
macOS-10.16-x86_64-i386-64bit
python 3.8.18 (default, Sep 11 2023, 08:17:33) 
[Clang 14.0.6 ]
pyvips 2.2.1
openslide 1.3.0

I installed pyvips using conda install.

jcupitt commented 9 months ago

Hi again @idc9,

You'll get that error if you make an out of order read from a sequential image. Perhaps you passed access="sequential" by mistake?

It's usually best to post small, complete, runnable programs rather than code fragments, or I won't be able to test them unless I spend some time writing a code harness.

I tried:

#!/usr/bin/env python3

import sys
import pyvips

level = int(sys.argv[2])

image = pyvips.Image.new_from_file(sys.argv[1], page=level)
for _ in range(100):
    patch = image.crop(0, 0, image.width, image.height).numpy()

Then ran it with:

$ vips copy ~/pics/nina.jpg x.tif[pyramid,compression=jpeg,tile]
$ ./crop2.py x.tif 6
$

And it seems to work.

jcupitt commented 9 months ago

I had a look at your test file, it's a Perkin-Elmer QPI, so it won't work with a standard TIFF loader. You'll need a specialized load library for this file format.

Perhaps you could save as OME-TIFF? That should load OK.

idc9 commented 9 months ago

Thanks again for the very fast response!

Access

> It's usually best to post small, complete, runnable programs rather than code fragments

That is a complete runnable program (edited to add the imports!), not part of a larger script. I had not previously passed access='sequential'.

In fact the exact same behavior occurs when I try any of the access options (None, sequential, random). Note I restart the notebook before calling the different options.

vips_image = pyvips.Image.new_from_file(vips_filename=fpath, page=page_idx)
vips_image = pyvips.Image.new_from_file(vips_filename=fpath, page=page_idx, access='random')
vips_image = pyvips.Image.new_from_file(vips_filename=fpath, page=page_idx, access='sequential')
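For anyone trying to reproduce this outside a notebook, here is a minimal sketch of a standalone harness that loads the same page several times in one process and reports the first failure. The helper names (`parse_args`, `load_once`, `main`) and the `--access` / `--repeats` options are made up for illustration; only `Image.new_from_file`, `crop` and `numpy` come from the code already posted in this thread. pyvips is imported lazily, so the argument handling can be checked even on a machine without libvips.

```python
"""Load one page of an image file repeatedly and report the first failure."""
import argparse


def parse_args(argv=None):
    # Hypothetical command-line interface for this sketch.
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("path", help="image file to load")
    parser.add_argument("page", type=int, help="page index to load")
    parser.add_argument("--access", choices=["random", "sequential"],
                        default=None, help="libvips access hint (default: unset)")
    parser.add_argument("--repeats", type=int, default=3,
                        help="how many times to load the page")
    return parser.parse_args(argv)


def load_once(path, page, access=None):
    # Imported here so the rest of the module works without pyvips installed.
    import pyvips

    kwargs = {"page": page}
    if access is not None:
        kwargs["access"] = access
    image = pyvips.Image.new_from_file(path, **kwargs)
    # Crop the whole page and force evaluation by converting to NumPy.
    return image.crop(0, 0, image.width, image.height).numpy()


def main(argv=None):
    args = parse_args(argv)
    for attempt in range(1, args.repeats + 1):
        try:
            patch = load_once(args.path, args.page, args.access)
        except Exception as exc:  # pyvips raises pyvips.Error on load failures
            print(f"attempt {attempt}: failed: {exc}")
            return 1
        print(f"attempt {attempt}: ok, shape={patch.shape}")
    return 0
```

From a notebook cell or a Python prompt, something like `main(["myfile.qptiff", "17", "--access", "random"])` would run three loads in a row. Because every load happens in the same process, this should mimic re-running a notebook cell without restarting the kernel.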

qptiff

pyvips seems to work otherwise with this image format. So far the only issue I've encountered is this double loading issue. This seems like a bug in pyvips to me, since it works on the first load but not on the second.

> Perhaps you could save as OME-TIFF?

That would be a problem for our application; we would like to use pyvips on the original image.

jcupitt commented 9 months ago

> pyvips seems to work otherwise with this image format

It's a plane-separated pyramid, unfortunately, so it won't work. It's to do with the way the reduced resolution levels are coded -- OME-TIFF (which libvips does support) puts the reduced res levels into subifds, but this QPI has them all one after the other, with the metadata about the file structure coded into the XML in the imagedescription.

Have a look at the tiffinfo output (and get a large mug of tea ready first).
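To make the layout difference described above concrete, here is a rough, stdlib-only sketch that walks the chain of top-level IFDs in a classic TIFF and notes which of them carry a SubIFDs tag (tag 330). A file whose reduced resolution levels live in subifds would show few top-level IFDs with `has_subifds` set, while a flat file lists every plane and level as its own top-level IFD. The function name `list_ifds` is hypothetical, and the sketch deliberately refuses BigTIFF, so it may not open every QPI file as-is.

```python
import struct

SUBIFDS_TAG = 330  # TIFF tag number for "SubIFDs"


def list_ifds(fileobj):
    """Return one dict per top-level IFD of a classic TIFF opened in binary mode."""
    header = fileobj.read(8)
    if len(header) < 8 or header[:2] not in (b"II", b"MM"):
        raise ValueError("not a TIFF file")
    endian = "<" if header[:2] == b"II" else ">"
    magic, offset = struct.unpack(endian + "HI", header[2:8])
    if magic == 43:
        raise ValueError("BigTIFF is not handled by this sketch")
    if magic != 42:
        raise ValueError("not a TIFF file")

    ifds = []
    seen = set()  # guards against a corrupt file with a looping IFD chain
    while offset != 0 and offset not in seen:
        seen.add(offset)
        fileobj.seek(offset)
        (count,) = struct.unpack(endian + "H", fileobj.read(2))
        entries = fileobj.read(12 * count)  # each IFD entry is 12 bytes
        # The tag number is the first 2 bytes of each entry.
        tags = [struct.unpack_from(endian + "H", entries, 12 * i)[0]
                for i in range(count)]
        ifds.append({
            "offset": offset,
            "entries": count,
            "has_subifds": SUBIFDS_TAG in tags,
        })
        # The 4 bytes after the entries point at the next IFD, or are 0 at the end.
        (offset,) = struct.unpack(endian + "I", fileobj.read(4))
    return ifds
```

A typical call would be `with open("myfile.qptiff", "rb") as f: print(list_ifds(f))`. This only inspects structure and never decodes pixel data, so it is no substitute for the tiffinfo output.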

idc9 commented 9 months ago

Does the fact that reading in the image twice throws an error not suggest there is an issue with pyvips?

jcupitt commented 9 months ago

That's just a minor symptom -- the cause is this TIFF file being way outside the supported range for any general purpose TIFF reader.

Options off the top of my head:

  1. You could probably read it with a low-level TIFF library, like tifffile, but getting a nice RGB image will take some work. Maybe a week's effort? It depends how much of the QPI format you want to support.
  2. I expect PE will sell you an SDK (I've not checked).
  3. The gold standard solution would be to add support for the format to openslide. That would be (guess?) several weeks of effort and you'd need a good C programmer.
  4. ... and of course, another option is to switch to a format that's not locked to a vendor, or to a scanner manufacturer that doesn't attack its customers by forcing them to use a non-interoperable format. Most of the people I know are moving to WG26 DICOM now, perhaps PE support something like that? I've not looked.
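As a starting point for option 1, here is a small sketch of how the pages of such a file could be inventoried before any attempt to assemble an RGB image. It assumes each page exposes `shape` and `description` attributes, as tifffile's page objects do, and that the XML in the imagedescription contains an `<ImageType>` element. That element name is an assumption about the QPI metadata and should be checked against a real file. The helper names `image_type` and `summarize_pages` are hypothetical.

```python
import xml.etree.ElementTree as ET


def image_type(description):
    """Return the text of an <ImageType> element, or None if absent or not XML."""
    if not description:
        return None
    try:
        root = ET.fromstring(description)
    except (ET.ParseError, ValueError):
        return None
    node = root.find(".//ImageType")  # assumed element name, see the note above
    return node.text if node is not None else None


def summarize_pages(pages):
    """Yield (index, shape, image_type) for each page-like object."""
    for index, page in enumerate(pages):
        description = getattr(page, "description", "")
        yield index, tuple(page.shape), image_type(description)


# Possible use with tifffile (not run here):
#
#     import tifffile
#     with tifffile.TiffFile("myfile.qptiff") as tif:
#         for row in summarize_pages(tif.pages):
#             print(row)
```

This only lists what is in the file. Grouping the planes into channels and pyramid levels, and turning them into a displayable RGB image, is the part that would take the real effort.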