libvips / pyvips

python binding for libvips using cffi

High RAM usage with fetch when reading multiple WSI tiles #447

Open · GuillaumeBalezo opened this issue 7 months ago

GuillaumeBalezo commented 7 months ago

Hello, I'm working on a Torch dataset that reads tiles directly with pyvips from multiple WSI histopathological images, in any order: the first tile could come from slide 1, the next from slide 2, and so on. To access the tiles within a slide I'm using 'fetch' from pyvips.Region instead of 'crop' on a pyvips.Image, since fetch is faster. However, I'm seeing an increase in RAM usage with 'fetch' that I don't get with 'crop'.

Here is a short script that reproduces my problem (with pyvips 2.2.1 and libvips 8.15):

from pathlib import Path

import pyvips
from tqdm import tqdm

# dataframe is a pandas DataFrame with columns:
# - slide_name: slide id used to look up the corresponding pyvips Image in
#     images_dict
# - x: x coordinate of the tile in the pyvips.Image
# - y: y coordinate of the tile in the pyvips.Image
# The tiles were all selected in the tissue and are ordered by slide and
# horizontally from top left to bottom right

TILE_SIZE = 512
images_dict = {}
for slide_path in he_paths:  # he_paths: list of paths to the WSI files
    image = pyvips.Image.new_from_file(slide_path, subifd=-1, access="sequential")
    slide_name = Path(slide_path).stem
    images_dict[slide_name] = image

for idx, row in tqdm(dataframe.iterrows(), total=len(dataframe)):
    slide_name = row["slide_name"]
    image = images_dict[slide_name]
    region = pyvips.Region.new(image)
    x, y = row["x"], row["y"]
    buffer = region.fetch(x, y, TILE_SIZE, TILE_SIZE)
    del region, buffer
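
Aside: unlike crop(...).numpy(), fetch returns a raw byte buffer rather than an image. A minimal sketch of viewing that buffer as a numpy array, assuming 8-bit pixel data, is:

import numpy as np

# interpret the raw bytes from region.fetch as height x width x bands
tile = np.ndarray(buffer=buffer, dtype=np.uint8,
                  shape=[TILE_SIZE, TILE_SIZE, image.bands])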

Here is my RAM usage when using fetch:

[screenshot: RAM usage plot recorded while running the fetch loop]

The RAM increase seems to happen when I start loading tiles from a new slide (easy to spot because in my example the tiles are ordered by slide). When I use the crop function instead, the problem doesn't occur:

for idx, row in tqdm(dataframe.iterrows(), total=len(dataframe)):
    slide_name = row["slide_name"]
    image = images_dict[slide_name]
    x, y = row["x"], row["y"]
    tile = image.crop(x, y, TILE_SIZE, TILE_SIZE).numpy()
    del tile

Currently I can't use my torch dataset with many slides or with multiple workers. Could you help me understand what might be causing this issue? Also, is there a way to use fetch without the increase in RAM usage?

Thanks!

jcupitt commented 7 months ago

Hi @GuillaumeBalezo,

fetch leaves the libvips pipeline open after it runs, so the next fetch operation doesn't need to rebuild everything. This helps latency for small regions, but it will have higher memory use.

crop runs a libvips pipeline, including startup, shutdown and minimize, so as much memory as possible is released.
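
As an aside, one way to cut per-tile overhead with fetch is to create a single Region per slide up front and reuse it for every tile from that slide, rather than building a new Region per tile as in the loop above; the test program below does this. A rough sketch against the variables in the original post (a usage sketch only, not a fix for the memory growth):

from tqdm import tqdm
import pyvips

# one reusable Region per slide, instead of one per tile
regions_dict = {name: pyvips.Region.new(img) for name, img in images_dict.items()}

for idx, row in tqdm(dataframe.iterrows(), total=len(dataframe)):
    region = regions_dict[row["slide_name"]]
    buffer = region.fetch(row["x"], row["y"], TILE_SIZE, TILE_SIZE)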

I'm a bit surprised you're seeing very high memory use; I wouldn't have thought it would make that much difference. I'll see if I can make a test program that shows this effect.

What WSI format are you using?

jcupitt commented 7 months ago

I tried this:

#!/usr/bin/env python3

import sys
import random
import pyvips

# open every slide and make one reusable Region per image
images = [pyvips.Image.new_from_file(filename, rgb=True)
          for filename in sys.argv[1:]]
regions = [pyvips.Region.new(image)
           for image in images]

# fetch version -- uncomment this block (and comment out the crop loop
# below) to time fetch
#print("with fetch:")
#for i in range(100):
#    print(f"loop {i} ...")
#
#    for image, region in zip(images, regions):
#        x = random.randint(0, image.width - 512)
#        y = random.randint(0, image.height - 512)
#        data = region.fetch(x, y, 512, 512)

print("with crop:")
for i in range(100):
    print(f"loop {i} ...")

    for image in images:
        x = random.randint(0, image.width - 512)
        y = random.randint(0, image.height - 512)
        data = image.crop(x, y, 512, 512).numpy()

On this laptop with 23 large SVS images I see:

$ ls *.svs | wc
     23      31     721
$ /usr/bin/time -f %M:%e ~/try/wsi-fetch.py *.svs
with fetch:
loop 0 ...
loop 1 ...
...
loop 99 ...
4310996:31.39
$ /usr/bin/time -f %M:%e ~/try/wsi-fetch.py *.svs
with crop:
loop 0 ...
loop 1 ...
...
loop 99 ...
2507136:14.69

So I think crop is likely to be faster for you. fetch can be quicker for very small tiles (e.g. 32 x 32), and when you are generating derivatives in pyvips.
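
For the torch use case in the original post, a crop-based dataset could look something like this rough sketch (hypothetical class and parameter names; the dataframe columns follow the original post, and loader options, transforms and error handling are left out):

from pathlib import Path

import pyvips
import torch
from torch.utils.data import Dataset


class WSITileDataset(Dataset):
    """One tile per dataframe row, read with pyvips crop."""

    def __init__(self, dataframe, slide_paths, tile_size=512):
        self.dataframe = dataframe
        self.tile_size = tile_size
        # open each slide once; random access lets tiles arrive in any order
        self.images = {
            Path(p).stem: pyvips.Image.new_from_file(p, access="random")
            for p in slide_paths
        }

    def __len__(self):
        return len(self.dataframe)

    def __getitem__(self, idx):
        row = self.dataframe.iloc[idx]
        image = self.images[row["slide_name"]]
        x, y = int(row["x"]), int(row["y"])
        tile = image.crop(x, y, self.tile_size, self.tile_size).numpy()
        return torch.from_numpy(tile)

Depending on the DataLoader start method, the slides may need to be opened inside each worker rather than in the parent process.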