libvips / pyvips

python binding for libvips using cffi

pyvips.Image.new_from_array from np.mmap vs hdf5 #492

Open rgluskin opened 1 month ago

rgluskin commented 1 month ago

Hi. I'm performing manipulations on WSIs and saving intermediate results in HDF5. The end results are being saved to a new WSI using pyvips.Image.new_from_array.

However, when running on very large slides I get OOM exceptions.

I did a quick experiment and saw that saving the intermediate results with np.memmap doesn't cause this issue. From the pyvips code it seems the difference stems from the __array_interface__ vs __array__ attributes (a memmap has both, but an h5py dataset only has the latter). However, the fields that are actually accessed, such as dtype, are present on the HDF5 dataset anyway.
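
For what it's worth, the difference is easy to confirm by checking the attribute directly (a minimal sketch; the file names are illustrative):

import h5py
import numpy as np

# np.memmap is an ndarray subclass, so it exposes __array_interface__
arr = np.memmap("x.dat", dtype="uint8", mode="w+", shape=(16, 16, 3))
print(hasattr(arr, "__array_interface__"))  # True

# an h5py dataset only has __array__, so pyvips falls back to
# obj.__array__(), which reads the whole dataset into RAM
with h5py.File("x.h5", "w") as f:
    ds = f.create_dataset("d", (16, 16, 3), dtype="uint8")
    print(hasattr(ds, "__array_interface__"))  # False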

What would be your recommendation? Should I rewrite my whole code to use np.memmap? Or is there a more elegant approach?

Thanks in advance.

jcupitt commented 1 month ago

Hi @rgluskin,

Could you share some sample code that shows the problem? How are you saving in HDF5? Is this with write_to_file('xxx.mat'), or some numpy feature?

As far as I know, HDF5 is not a good format for large images; it needs huge amounts of memory. I would use (maybe) uncompressed pyramidal tiled TIFF, or perhaps jpeg-compressed, depending on the image size.

Are you sure you need to save intermediates? If you stick to pyvips, you can do most processing with no need for intermediates. What operation is forcing you to save?
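
For example, a pure-pyvips pipeline like this streams the slide through libvips without large intermediates (a minimal sketch; "input.tif" and the blur are stand-ins for your actual data and processing):

import pyvips

# sequential access lets libvips stream the image in small chunks
# rather than decoding the whole slide into memory
image = pyvips.Image.new_from_file("input.tif", access="sequential")
image = image.gaussblur(2)  # placeholder for your real pipeline
image.write_to_file("out.tif", tile=True, pyramid=True, compression="jpeg")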

rgluskin commented 1 month ago

I'm running various processing steps, including classical vision and deep learning pipelines. HDF5 is used to store the results on disk, not in memory. This is the gist of the code; I think it's enough to reproduce the issue:

with h5py.File(hdf5_fname, "w") as heatmap_file:
    heatmap_file.create_dataset(
        "value_map",
        (self.height, self.width, n_channels),
        dtype=dtype,
        chunks=self.HDF5_CHUNK_SIZE + (n_channels,),
        compression="gzip",
        fillvalue=0.0,
    )
self.heatmap_file = h5py.File(hdf5_fname, "r+")
self.value_map = self.heatmap_file["value_map"]
...
...

full_scale_image = self.value_map
...
...
# TIFF resolution is pixels/cm, OpenSlide resolution is um/pixel. VIPS interprets the parameter as pixels/mm
vips_res = 1e3 / resolution_umpp
pyvips.Image.new_from_array(full_scale_image, interpretation="rgb").write_to_file(
    file_path, pyramid=True, tile=True, bigtiff=True, xres=vips_res, yres=vips_res, compression="lzw"
)

Disk size is not an issue; only having to reallocate the whole image into RAM is (when return cls.new_from_array(obj.__array__()) is executed in vimage.py).

jcupitt commented 1 month ago

> HDF5 is used to store the results on disk, not in memory.

I think loading HDF5 from disc will need a lot of memory, won't it? And it'll use amazing amounts of disc space. Your WSI scans will probably have been through jpeg already, so there's little chance of extra quality loss, I'd think.

You need to make a complete test program I can run that shows the problem. Otherwise I'll waste 30 minutes making a test myself, do something slightly different from your code, and not hit the same issue. Or that's what usually happens to me :(

rgluskin commented 1 month ago

Sorry, here's the full code to reproduce:

import h5py
import pyvips

hdf5_fname = "temp.hdf5"
file_path = "temp.tiff"
resolution_umpp = 0.25
height = 100000
width = 192000
n_channels = 4
dtype = 'uint8'
HDF5_CHUNK_SIZE = (128, 128)

with h5py.File(hdf5_fname, "w") as heatmap_file:
    heatmap_file.create_dataset(
        "value_map",
        (height, width, n_channels),
        dtype=dtype,
        chunks=HDF5_CHUNK_SIZE + (n_channels,),
        compression="gzip",
        fillvalue=0.0,
    )
heatmap_file = h5py.File(hdf5_fname, "r+")
value_map = heatmap_file["value_map"]

full_scale_image = value_map

# TIFF resolution is pixels/cm, OpenSlide resolution is um/pixel. VIPS interprets the parameter as pixels/mm
vips_res = 1e3 / resolution_umpp
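# e.g. with resolution_umpp = 0.25 um/pixel: 1e3 / 0.25 = 4000 pixels/mm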
pyvips.Image.new_from_array(full_scale_image, interpretation="rgb").write_to_file(
    file_path, pyramid=True, tile=True, bigtiff=True, xres=vips_res, yres=vips_res, compression="lzw"
)

rgluskin commented 1 month ago

And this snippet works:

import h5py
import numpy as np
import pyvips

hdf5_fname = "temp.hdf5"
file_path = "temp.tiff"
resolution_umpp = 0.25
height = 100000
width = 192000
n_channels = 4
dtype = 'uint8'

value_map = np.memmap(hdf5_fname, dtype=dtype, mode='w+', shape=(height, width, n_channels))
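# note: this maps 100000 * 192000 * 4 bytes, about 77 GB, which lines up
# with the 78gb of VIRT reported in the next comment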

full_scale_image = value_map

# TIFF resolution is pixels/cm, OpenSlide resolution is um/pixel. VIPS interprets the parameter as pixels/mm
vips_res = 1e3 / resolution_umpp
pyvips.Image.new_from_array(full_scale_image, interpretation="rgb").write_to_file(
    file_path, pyramid=True, tile=True, bigtiff=True, xres=vips_res, yres=vips_res, compression="lzw"
)

My default course of action would be to refactor all such HDF5 usages to use np.memmap instead, unless you see a more elegant solution (e.g. somehow recognizing that shape and dtype are present in the HDF5 dataset and reading them from there).
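
Here's a minimal sketch of that refactor, copying the existing HDF5 dataset into an np.memmap strip by strip so that only one strip is in RAM at a time (the file names and the 4096-row strip height are illustrative):

import h5py
import numpy as np

with h5py.File("temp.hdf5", "r") as f:
    ds = f["value_map"]
    out = np.memmap("temp.dat", dtype=ds.dtype, mode="w+", shape=ds.shape)
    # slicing an h5py dataset reads only the requested strip from disk
    for y in range(0, ds.shape[0], 4096):
        out[y:y + 4096] = ds[y:y + 4096]
    out.flush()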

jcupitt commented 1 month ago

Great! Thanks for that. I tried:

$ VIPS_PROGRESS=1 ./rgluskin.py 
mapping ...
making vips image ...
writing ...
rgluskin.py temp-2: 192000 x 100000 pixels, 32 threads, 192000 x 1 tiles, 640 lines in buffer
rgluskin.py temp-2: 24% complete
...

And watched it run in top. It allocated 78gb of VIRT at the start, then as the save executed, RES slowly crept up until, when the save ticker reached 100%, it was equal to VIRT.

I think this is probably unavoidable with HDF5 files opened via numpy. Does it have to be HDF5? TIFF (for example) should work well, e.g. with your test file in TIFF format I see:

$ VIPS_PROGRESS=1 /usr/bin/time -f %M:%e  vips copy temp.tiff x.tif[tile,pyramid,compression=jpeg]
vips temp-6: 192000 x 100000 pixels, 32 threads, 128 x 128 tiles, 640 lines in buffer
vips temp-6: done in 151s          
memory: high-water mark 1.02 GB
1368380:151.39

1.3gb of peak memory use. Q85 jpeg should be no worse than LZW (assuming your WSI scanner outputs something like SVS), and much faster.
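
(For reference, the pyvips equivalent of that CLI test is roughly this sketch, using the same file names:)

import pyvips

# stream the flat TIFF through libvips and write a jpeg-compressed pyramid
image = pyvips.Image.new_from_file("temp.tiff", access="sequential")
image.write_to_file("x.tif", tile=True, pyramid=True, compression="jpeg", Q=85)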

rgluskin commented 1 month ago

No, direct numpy works fine without HDF5. It has some other flexibility limitations, such as having to store a single tensor per file, but I can refactor my code around that.
