Open jonasteuwen opened 5 months ago
Hi @jonasteuwen,
pyvips lets you fetch any libvips metadata with get. For example:
image = pyvips.Image.new_from_file("CMU-1.svs")
profile = image.get("icc-profile-data")
You can see all the metadata that libvips can read for a file with vipsheader, for example:
$ vipsheader -a CMU-1.svs | grep icc
openslide.icc-size: 141992
icc-profile-data: 141992 bytes of binary data
The icc_transform operation in pyvips can pick up the metadata profile, so you could write:
image = pyvips.Image.new_from_file("CMU-1.svs")
srgb = image.icc_transform("srgb")
And it'll combine the slide profile with a standard srgb profile to generate a corrected sRGB image.
openslide makes RGBA images by default, though the A is almost always just 255. If you pass the rgb option to new_from_file it'll read plain RGB instead, which can give a very useful speedup.
image = pyvips.Image.new_from_file("CMU-1.svs", rgb=True)
Ah you want to just fetch and process a small region, is that right? You could write:
image = pyvips.Image.new_from_file("CMU-1.svs", rgb=True).icc_transform("srgb")
for y in range(0, image.height, 256):
    for x in range(0, image.width, 256):
        tile = image.crop(x, y, min(256, image.width - x), min(256, image.height - y))
        rgb_pixel_array = tile.numpy()
        do_something_with_the_tile_data(rgb_pixel_array)
libvips is threaded and demand-driven, so it'll be efficient.
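The edge handling in the loop above (the min(256, ...) terms) can be factored into a small generator. This is only a sketch in plain Python: iter_tiles is a hypothetical helper name, not part of pyvips, and the 256-pixel default is just the tile size used above.

```python
from typing import Iterator, Tuple


def iter_tiles(width: int, height: int, tile_size: int = 256) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (x, y, w, h) boxes covering a width x height image, row by row.

    Tiles on the right and bottom edges are clipped so they never
    extend past the image.
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            yield x, y, min(tile_size, width - x), min(tile_size, height - y)
```

Each box can then be passed straight to image.crop(x, y, w, h), which keeps the clipping logic in one place if the tile size changes.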
Hi @jcupitt,
Thank you for your prompt reply. In my code, I have two backends: pyvips directly, which will work as you describe (thanks for the example, that's much more efficient!), and a fork of openslide-python where, instead of outputting a PIL Image, I pass the data to a pyvips image. See here:
https://github.com/NKI-AI/dlup/blob/feature/libvips/dlup/backends/openslide_backend.py https://github.com/NKI-AI/dlup/blob/feature/libvips/dlup/experimental_backends/pyvips_backend.py.
When using the openslide C library, you can get the ICC profile as a BytesIO stream as shown above, and I want to use it to create an icc_transform that I can apply to your rgb_pixel_array.
I would imagine something like this:
owsi = openslide_lowlevel.open(str(filename))
profile = openslide_lowlevel.read_icc_profile(owsi)
color_profile = io.BytesIO(profile)
for y in range(0, image.height, 256):
    for x in range(0, image.width, 256):
        tile = owsi.read_region((x, y), level, (min(256, image.width - x), min(256, image.height - y))).icc_transform("srgb", input_profile=color_profile)
        rgb_pixel_array = tile.numpy()
        do_something_with_the_tile_data(rgb_pixel_array)
Note that I modified the .read_region() of the openslide library to output a pyvips.Image.
OpenSlide attaches it to the PIL image when reading the region: https://github.com/openslide/openslide-python/blob/22978715366db4ef1a3ebaab49c514131617fe66/openslide/__init__.py#L255
Can we do the same using this profile BytesIO?
You can attach the profile from openslide_lowlevel as metadata to the pyvips image. Something like (untested):
owsi = openslide_lowlevel.open(str(filename))
profile = openslide_lowlevel.read_icc_profile(owsi)
color_profile = io.BytesIO(profile).read()
tile = owsi.read_region((x, y), level, (min(256, image.width - x), min(256, image.height - y)))
# attach profile to image as metadata
tile.set_type(pyvips.GValue.blob_type, "icc-profile-data", color_profile)
tile = tile.icc_transform("srgb")
Though performance might not be that great -- image = pyvips.Image.new_from_file("CMU-1.svs", rgb=True).icc_transform("srgb") will probably be a lot quicker (but I've not benchmarked it).
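Before attaching bytes from another library as icc-profile-data, it can be worth checking that they really look like an ICC profile. Here is a minimal sketch using only the standard library. describe_icc_profile is a hypothetical helper, not part of pyvips or openslide, and it reads just a few fields of the fixed 128-byte ICC header (size at bytes 0-3, device class at 12-15, colour space at 16-19, the 'acsp' signature at 36-39).

```python
import struct


def describe_icc_profile(profile: bytes) -> dict:
    """Parse a few fields from the fixed 128-byte ICC profile header."""
    if len(profile) < 128:
        raise ValueError("too short to contain an ICC header")
    # The first four bytes hold the profile size as a big-endian uint32.
    (declared_size,) = struct.unpack(">I", profile[0:4])
    # Every ICC profile carries the signature 'acsp' at offset 36.
    if profile[36:40] != b"acsp":
        raise ValueError("missing 'acsp' signature; not an ICC profile")
    return {
        "declared_size": declared_size,
        "actual_size": len(profile),
        "device_class": profile[12:16].decode("ascii", "replace").strip(),
        "colour_space": profile[16:20].decode("ascii", "replace").strip(),
    }
```

If declared_size and actual_size disagree, the blob was probably truncated or padded somewhere between the reader and pyvips.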
Why do you need two backends?
@jcupitt Thank you! I will give it a try!
Different backends: I found some minor differences between how pyvips and openslide read the images (one being whether the output is RGB or RGBA, and maybe some interpolation). I don't know why, but the SSIM is > 0.999 while np.allclose(a, b) is not true. While I use pyvips for new projects, I wanted to make sure that our older projects based on openslide keep producing the same outputs for the same data when they update the library.
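One way to see where two backends disagree is to look at the size and extent of the per-pixel differences rather than a single np.allclose result, since for integer pixel values the default tolerances amount to near-exact equality. The sketch below assumes both tiles are uint8 NumPy arrays of shape (H, W, C) with the same height and width. compare_tiles is a hypothetical helper, not something from dlup, pyvips, or openslide.

```python
import numpy as np


def compare_tiles(a: np.ndarray, b: np.ndarray) -> dict:
    """Summarise per-pixel differences between two uint8 tiles of shape (H, W, C)."""
    # Keep only the first three channels so RGBA and RGB tiles can be compared.
    a, b = a[..., :3], b[..., :3]
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # Widen to a signed type first: uint8 subtraction would wrap around.
    diff = np.abs(a.astype(np.int16) - b.astype(np.int16))
    return {
        "max_abs_diff": int(diff.max()),
        # Fraction of pixels where at least one channel differs.
        "fraction_differing": float((diff > 0).any(axis=-1).mean()),
    }
```

A max_abs_diff of 1 or 2 spread over many pixels would point towards rounding or interpolation, while large differences on a few pixels would suggest something like tile boundaries or alpha handling.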
Problem
Currently pyvips only supports reading ICC profiles from a file as far as I can see. OpenSlide gives an io.BytesIO output. I have modified openslide-python to output pyvips.Image.
Code example
So right now you can do this:
With PIL you can now do this:
This does not seem to be possible with pyvips and I need to dump color_profile to disk?