libvips / pyvips

python binding for libvips using cffi
MIT License
649 stars 50 forks source link

Pyvips fetch returns bad image on .svs files #452

Open idc9 opened 9 months ago

idc9 commented 9 months ago

Reading in an image from a .svs file via pyvips fetch returns a completeley messed up image. Reading in with the crop method does return the correct image. See example below.

Example

The test image can be downloaded from here.

from openslide import open_slide
import pyvips
import numpy as np
import matplotlib.pyplot as plt

fpath = 'CMU-1.svs'

os_wsi = open_slide(fpath)

# each of these fail
# location = (0, 0)
# location = (2300, 1500)
# location = (2300, 15000)
# location = (23000, 15000)
location = (4000, 20000)

size = (256, 256)
level = 0

# read openslide patch
patch_os = os_wsi.read_region(location=location, size=size, level=level).convert('RGB')
patch_os = np.array(patch_os)

# read patch via crop
vips_image = pyvips.Image.new_from_file(vips_filename=fpath, level=level)
patch_crop = vips_image.crop(location[0], location[1], size[0], size[1])
patch_crop = patch_crop.numpy()[:, :, 0:3]

# read patch via fetch
bands = 3
vips_image = pyvips.Image.new_from_file(vips_filename=fpath, level=level)
region = pyvips.Region.new(vips_image)
patch_fetch = region.fetch(location[0], location[1], size[0], size[1])
patch_fetch = np.ndarray(buffer=patch_fetch,
                         dtype=np.uint8,
                         shape=(size[1], size[0], bands))

# do they mach?
print('fetch vs openslide', np.allclose(patch_fetch, patch_os), (patch_fetch - patch_os).mean())
print('crop vs openslide',np.allclose(patch_crop, patch_os), (patch_crop - patch_os).mean())
fetch vs openslide False 121.11172485351562
crop vs openslide True 0.0

Lets see what these look like

# Let's see what they look like
plt.figure(figsize=(15, 5))

plt.subplot(1, 3, 1)
plt.imshow(patch_os)
plt.title('openslide')
plt.axis('off')

plt.subplot(1, 3, 2)
plt.imshow(patch_crop)
plt.title('crop')
plt.axis('off')

plt.subplot(1, 3, 3)
plt.imshow(patch_fetch)
plt.title('fetch')
plt.axis('off')

Screenshot 2024-02-08 at 6 16 34 PM

System details

import platform; print(platform.platform())
import sys; print('python', sys.version)
import pyvips; print('pyvips', pyvips.__version__)
import openslide; print('openslide', openslide.__version__)
macOS-10.16-x86_64-i386-64bit
python 3.8.18 (default, Sep 11 2023, 08:17:33) 
[Clang 14.0.6 ]
pyvips 2.2.1
openslide 1.3.0

I installed pyvips using conda install.

jcupitt commented 9 months ago

Hi @idc9,

openslide reads as RGBA by default, though the A is unused. You'll need to read as 4-byte pixels, then (probably?) drop the A in numpy.

Alternatively, if you give new_from_file the rgb flag, you can get RGB directly.

Something like:

# read patch via fetch
bands = 3
vips_image = pyvips.Image.new_from_file(vips_filename=fpath, level=level, rgb=True)
region = pyvips.Region.new(vips_image)
patch_fetch = region.fetch(location[0], location[1], size[0], size[1])
patch_fetch = np.ndarray(buffer=patch_fetch,
                         dtype=np.uint8,
                         shape=(size[1], size[0], bands))
jcupitt commented 9 months ago

You can make flips and rotates at the same time, which can be useful, eg.:

#!/usr/bin/python3

# makes 12,512 patches in 0.5s from CMU-1-Small-region.svs

import sys
import pyvips

def fetch(region, patch_size, x, y):
    return region.fetch(patch_size * x, patch_size * y, patch_size, patch_size)

image[0] = pyvips.Image.new_from_file(sys.argv[1], rgb=True)
image[1] = image0.rot90()
image[2] = image0.rot180()
image[3] = image0.rot270()

image[4] = image[0].fliphor()
image[5] = image[4].rot90()
image[6] = image[4].rot180()
image[7] = image[4].rot270()

reg = [pyvips.Region.new(x) for x in image]

patch_size = 64
n_across = image0.width // patch_size
n_down = image0.height // patch_size
x_max = n_across - 1
y_max = n_down - 1

n_patches = 0
for y in range(0, n_down):
    print("row {} ...".format(y))
    for x in range(0, n_across):
        patch0 = fetch(reg[0], patch_size, x, y)
        patch1 = fetch(reg[1], patch_size, y_max - y, x)
        patch2 = fetch(reg[2], patch_size, x_max - x, y_max - y)
        patch3 = fetch(reg[3], patch_size, y, x_max - x)

        patch4 = fetch(reg[4], patch_size, x_max - x, y)
        patch5 = fetch(reg[5], patch_size, y_max - y, x_max - x)
        patch6 = fetch(reg[6], patch_size, x, y_max - y)
        patch7 = fetch(reg[7], patch_size, y, x)

        n_patches += 8

print("{} patches generated".format(n_patches))

You have to step carefully to exploit caching.

Usually, fetch is only quick for small patches (eg. 32x32 pixels). If you need larger tiles (512x512?), then crop will often be faster. You'll need to do some quick tests on your hardware.

idc9 commented 9 months ago

Thanks for the quick response!

4 bands works!

Ah I see so to read in the RGBA image you need to change bands = 3 to bands = 4.

bands = 4 # 4 not 3 since RGBA!
vips_image = pyvips.Image.new_from_file(vips_filename=fpath, level=level)
region = pyvips.Region.new(vips_image)
patch_fetch = region.fetch(location[0], location[1], size[0], size[1])
patch_fetch = np.ndarray(buffer=patch_fetch,
                         dtype=np.uint8,
                         shape=(size[1], size[0], bands))
patch_fetch = patch_fetch[:,: , 0:3]

this worked! I.e. it matches openslide

Issue with rgb argument

I would like use the rgb argument however I get an error

vips_image = pyvips.Image.new_from_file(vips_filename=fpath, level=level, rgb=True)
---------------------------------------------------------------------------
Error                                     Traceback (most recent call last)
Cell In[4], line 1
----> 1 vips_image = pyvips.Image.new_from_file(vips_filename=fpath, level=level, rgb=True)

File ~/anaconda3/envs/cpath/lib/python3.8/site-packages/pyvips/vimage.py:352, in Image.new_from_file(vips_filename, **kwargs)
    349     raise Error('unable to load from file {0}'.format(vips_filename))
    350 name = _to_string(pointer)
--> 352 return pyvips.Operation.call(name, filename,
    353                              string_options=options, **kwargs)

File ~/anaconda3/envs/cpath/lib/python3.8/site-packages/pyvips/voperation.py:288, in Operation.call(operation_name, *args, **kwargs)
    285 for name in kwargs:
    286     if (name not in intro.optional_input and
    287             name not in intro.optional_output):
--> 288         raise Error('{0} does not support optional argument {1}'
    289                     .format(operation_name, name))
    291     value = kwargs[name]
    292     details = intro.details[name]

Error: VipsForeignLoadOpenslideFile does not support optional argument rgb
jcupitt commented 9 months ago

I would like use the rgb argument however I get an error

You probably have an old libvips. The rgb option was only added a few years ago.