Open amtsak opened 1 year ago
Hi @amtsak,
That's interesting, I've never looked at pickling (obviously). I'll see if I can fix the error.
I fixed the obvious issue, but it's still not working. With git master pyvips I now see:
$ python3
Python 3.10.7 (main, Nov 24 2022, 19:45:47) [GCC 12.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyvips
>>> import pickle
>>> x = pyvips.Image.new_from_file("CMU-1.svs")
>>> pickle.dump(x, open("x.pickle", "wb"))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: cannot pickle '_cffi_backend._CDataBase' object
>>>
I guess ffi does not support pickle in a straightforward way.
I had a quick read and I don't think pyvips could ever support pickle. An image in pyvips is (behind the scenes) a pointer owned by a huge C library, and represents, not data, but a tree of pointers to functions. There's not really anything that can be portably serialised.
If the images are small, you could maybe convert them to numpy arrays. They are simple data objects and pickle easily.
Thanks for looking into it @jcupitt! Makes sense that this can't be pickled. Thanks for the quick response either way.
I had a thought: perhaps you could just pass filenames into the multiprocessing function and do new_from_file
there? It's very fast on slide images, so there's no real benefit to passing the image.
Thanks @jcupitt, that's one of the first things I tried, but even though it's fast, it does add some 0.03-0.06 secs of overhead (on my setup at least for a 3GB pyramid slide image) which can add up if the volume of region reads is high enough, so I'd rather avoid it. But it could be an option.
Yes, you'd need to generate quite a few tiles in the multiprocessing method to overcome the cost of the new_from_file call.
Hi,
I am also trying to use multiprocessing to extract and save patches more quickly. I am posting the issue here since it looks relevant to this conversation.
Note: I am already using the fetch method, which has improved speed somewhat, but it still takes a lot of time for some bigger files. I have read some other open/closed issues but could not find a solution.
Here is the code:
from multiprocessing import Pool
import os
import pyvips as vips
from PIL import Image
import time
import matplotlib.pyplot as plt
import numpy as np

location = "/home/...."
file = "ABC.ndpi"
path = "/home/....."
patch_folder = "/home/......../patches/"
num_workers = 4

def fetch(region, patch_size, x, y):
    return region.fetch(patch_size * x, patch_size * y, patch_size, patch_size)

def create_patches(location, file, path,
                   patch_folder, workers=1, patch_size=224, mask_overlap=70.0):
    file_pth = os.path.join(location, file)
    img_400x = vips.Image.new_from_file(file_pth, level=0, autocrop=True).flatten()
    w, h = img_400x.width, img_400x.height
    n_across = int(w / patch_size)
    n_down = int(h / patch_size)
    mask_path = os.path.join(path, "#binary#mask.png")

    # function supposed to be parallelised: saves a patch based on the mask
    def extract_and_save_patch(x_cord, y_cord, file_path=file_pth, file_name=file,
                               mask_path=mask_path, patch_folder=patch_folder,
                               patch_size=patch_size, mask_overlap=mask_overlap):
        img_400x = vips.Image.new_from_file(file_path, page=0,
                                            autocrop=True).flatten()
        w, h = img_400x.width, img_400x.height
        mask_w, mask_h = Image.open(mask_path).size
        resized_mask = vips.Image.new_from_file(mask_path).flatten() \
            .colourspace("b-w").resize(w / mask_w, vscale=h / mask_h)
        img_400x = vips.Region.new(img_400x)
        resized_mask = vips.Region.new(resized_mask)
        patch_mask = fetch(resized_mask, patch_size, x_cord, y_cord)
        patch_mask = np.ndarray(buffer=patch_mask, dtype=np.uint8,
                                shape=[patch_size, patch_size])
        if np.mean(patch_mask / 255) * 100 > mask_overlap:
            patch = fetch(img_400x, patch_size, x_cord, y_cord)
            patch = np.ndarray(buffer=patch, dtype=np.uint8,
                               shape=[patch_size, patch_size, 3])
            fname = file_name.split(".")[0]
            x_start, y_start = x_cord * patch_size, y_cord * patch_size
            base_name = f"{fname}_{x_start}_{y_start}.png"
            patch_pil = Image.fromarray(patch)
            patch_pil.save(os.path.join(patch_folder, base_name))

    list_cord = [(x, y) for x in range(0, n_across) for y in range(0, n_down)]
    with Pool(processes=workers) as p:
        p.starmap(extract_and_save_patch, list_cord)

create_patches(location, file, path, patch_folder, workers=num_workers)
I am passing paths instead of vips objects but still getting the same error. Here is the trace.
Traceback (most recent call last):
  File "multiprocess.py", line 135, in <module>
    create_patches(location, file, path, patch_folder, workers=num_workers)
  File "multiprocess.py", line 127, in create_patches
    p.starmap(extract_and_save_patch, list_cord)
  File "/home/neel/miniconda3/envs/process/lib/python3.7/multiprocessing/pool.py", line 276, in starmap
    return self._map_async(func, iterable, starmapstar, chunksize).get()
  File "/home/neel/miniconda3/envs/process/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
  File "/home/neel/miniconda3/envs/process/lib/python3.7/multiprocessing/pool.py", line 431, in _handle_tasks
    put(task)
  File "/home/neel/miniconda3/envs/process/lib/python3.7/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/home/neel/miniconda3/envs/process/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'create_patches.<locals>.extract_and_save_patch'
Do you have any suggestions on this, or on how I can parallelize it through pyvips?
Neel
There's an example using multiprocessing here:
https://github.com/libvips/pyvips/blob/master/examples/thumb-dir.py
Does that help?
I'll have a look at your code.
Ah, your code fails because you have a local function. You can only use global functions with pickle.
Try making extract_and_save_patch into a top-level definition.
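The difference is easy to demonstrate with plain pickle: a module-level function is pickled by reference to its qualified name, while a function defined inside another function has no importable name, so pickling it fails exactly like the traceback above:

```python
import pickle

def top_level(x):
    return x + 1

def make_local():
    # a function defined inside another function: not importable by name
    def local(x):
        return x + 1
    return local

# a module-level function pickles fine: only its qualified name is stored
assert pickle.loads(pickle.dumps(top_level)) is top_level

# a local function cannot be pickled
try:
    pickle.dumps(make_local())
except AttributeError as e:
    print(e)  # Can't pickle local object 'make_local.<locals>.local'
```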
Thanks @jcupitt for the quick reply. :)
I tried to define the function at the top level, but passing many arguments was creating some serialization error.
I solved the problem by writing "global extract_and_save_patch" before the function definition inside create_patches.
It reduced the patching time significantly using 16 CPUs, but I was wondering if I can reduce the overhead of loading the file and mask every time (and resizing it every time). I wish there were a way to pass a pyvips object directly to the function.
I'd swap this code:
    mask_w, mask_h = Image.open(mask_path).size
    resized_mask = vips.Image.new_from_file(mask_path).flatten() \
        .colourspace("b-w").resize(w / mask_w, vscale=h / mask_h)
for vips.thumbnail(); it should be much faster. What format are your mask images in?
What are you writing? Just a simple RGB array? You can use rawsave to write these from pyvips.
Crop would probably be faster than fetch if your patches are 250x250.
Masks are grayscale images saved in PNG format. The masks are smaller than the original image to make them easy to view. They mark the tissue regions that will be used for creating patches.
I am extracting RGB patches of 224x224 to run an ML model over them. I will switch to crop and test it to compare.
I tried to test the program on a selected WSI and mask with 16 cpu workers.
Using fetch, it takes 26.4 seconds for extracting and saving 1830 tiles of 224x224. Using crop, it takes 21.6 seconds for extracting and saving 1830 tiles of 224x224.
It looks like fetch works better for ndpi pyramid images in my case.
Thanks a lot.
You could probably make that a lot faster. Post a sample mask image and the size of the ndpi you are benchmarking with and I'll have a go at tuning it.
Thanks John. :)
The ndpi WSI is around 90 MB and I want to use 40x here. Here is an example png mask.
I made a version of your original code for testing:
#!/usr/bin/python3

from multiprocessing import Pool
import os
import pyvips as vips
from PIL import Image
import time
import matplotlib.pyplot as plt
import numpy as np
import sys

def fetch(region, patch_size, x, y):
    return region.fetch(patch_size * x, patch_size * y, patch_size, patch_size)

# function to be parallelised: saves a patch based on the mask
def extract_and_save_patch(x_cord, y_cord,
                           slide_filename, mask_filename,
                           patch_size, mask_overlap):
    slide = vips.Image.new_from_file(slide_filename, autocrop=True).flatten()
    w, h = slide.width, slide.height
    mask_w, mask_h = Image.open(mask_filename).size
    resized_mask = vips.Image.new_from_file(mask_filename).flatten() \
        .colourspace("b-w").resize(w / mask_w, vscale=h / mask_h)
    img_400x = vips.Region.new(slide)
    resized_mask = vips.Region.new(resized_mask)
    patch_mask = fetch(resized_mask, patch_size, x_cord, y_cord)
    patch_mask = np.ndarray(buffer=patch_mask, dtype=np.uint8,
                            shape=[patch_size, patch_size])
    if np.mean(patch_mask / 255) * 100 > mask_overlap:
        patch = fetch(img_400x, patch_size, x_cord, y_cord)
        patch = np.ndarray(buffer=patch, dtype=np.uint8,
                           shape=[patch_size, patch_size, 3])
        x_start, y_start = x_cord * patch_size, y_cord * patch_size
        patch_pil = Image.fromarray(patch)
        patch_pil.save(f"patches/{x_start}_{y_start}.png")

slide_filename = sys.argv[1]
patch_size = 224

img_400x = vips.Image.new_from_file(slide_filename, autocrop=True)
w, h = img_400x.width, img_400x.height
n_across = int(w / patch_size)
n_down = int(h / patch_size)

params = [(x, y, slide_filename, "binary-mask.png", patch_size, 70)
          for x in range(0, n_across) for y in range(0, n_down)]
with Pool(processes=4) as p:
    p.starmap(extract_and_save_patch, params)
I changed it slightly, but the runtime should be the same. On this PC I see:
$ time ./mask.py ~/pics/openslide/CMU-1.ndpi
real 0m43.588s
user 2m35.253s
sys 0m14.278s
$ ls patches/ | wc
6754 6754 73361
So about 157 patches per second. I tried tweaking it a bit:
#!/usr/bin/python3

from multiprocessing import Pool
import pyvips as vips
import sys

def extract_and_save_patch(x, y,
                           slide_filename, mask_filename,
                           patch_size, mask_overlap):
    slide = vips.Image.new_from_file(slide_filename, autocrop=True, rgb=True)
    mask = vips.Image.new_from_file(mask_filename)
    mask = mask.resize(slide.width / mask.width,
                       vscale=slide.height / mask.height, kernel="nearest")
    patch_mask = mask.crop(x * patch_size, y * patch_size,
                           patch_size, patch_size)
    if patch_mask.avg() / 2.55 > mask_overlap:
        patch = slide.crop(x * patch_size, y * patch_size,
                           patch_size, patch_size)
        patch.write_to_file(f"patches/{x}_{y}.png")

slide_filename = sys.argv[1]
patch_size = 224

slide = vips.Image.new_from_file(slide_filename, autocrop=True)
n_across = int(slide.width / patch_size)
n_down = int(slide.height / patch_size)

params = [(x, y, slide_filename, "binary-mask-mono.png", patch_size, 70)
          for x in range(0, n_across) for y in range(0, n_down)]
with Pool(processes=4) as p:
    p.starmap(extract_and_save_patch, params)
That runs in 34s (199 patches/second). If I set processes=32 (this PC has 16 cores) I get 10s (675 patches/second).
Oh, if I put the x loop inside the thread, like this:
#!/usr/bin/python3

from multiprocessing import Pool
import pyvips as vips
import sys

def extract_and_save_patch(y,
                           slide_filename, mask_filename,
                           patch_size, mask_overlap):
    slide = vips.Image.new_from_file(slide_filename, autocrop=True, rgb=True)
    mask = vips.Image.new_from_file(mask_filename)
    mask = mask.resize(slide.width / mask.width,
                       vscale=slide.height / mask.height, kernel="nearest")
    n_across = int(slide.width / patch_size)
    for x in range(n_across):
        patch_mask = mask.crop(x * patch_size, y * patch_size,
                               patch_size, patch_size)
        if patch_mask.avg() / 2.55 > mask_overlap:
            patch = slide.crop(x * patch_size, y * patch_size,
                               patch_size, patch_size)
            patch.write_to_file(f"patches/{x}_{y}.png")

slide_filename = sys.argv[1]
patch_size = 224

slide = vips.Image.new_from_file(slide_filename, autocrop=True)
n_down = int(slide.height / patch_size)

params = [(y, slide_filename, "binary-mask-mono.png", patch_size, 70)
          for y in range(0, n_down)]
with Pool(processes=4) as p:
    p.starmap(extract_and_save_patch, params)
I get 18s (375 patches/second). If I also set processes=32 I get 4.5s (1500 patches/second).
Thanks a lot, it's much faster now (with the last version, loop inside the thread).
Although it isn't as fast on my Linux server as on yours, the total time is much reduced:
with 32 CPUs it takes 15.6s, giving 118 patches/second; with 16 CPUs it takes 19.2s, giving 96 patches/second.
The only change in my version is that rgb=True in vips.Image.new_from_file gives vips.Error, maybe due to a different version of vips, so I am using flatten() instead. I hope that won't be a big source of overhead. I am using pyvips 2.2.0.
You need 8.14 for the rgb mode.
Is there scope to support pickle dumping for pyvips Images in the future? Currently, it fails - see minimal example to reproduce below:
Have tried this with .svs, .ndpi and .tif slides.
The error with python 3.10 & pyvips 2.2.1:
The error with python 3.7 & pyvips 2.2.0:
Allowing pickle dumping would make pyvips Images compatible with the latest PyTorch DataLoaders (>= 1.13.1), which unfortunately pickle all their attributes when doing any type of multiprocessing. So it would make life easier for several ML applications.
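One workaround in the meantime: keep only the filename in the object's state and reopen the image lazily in each process, so what gets pickled is a plain string rather than the CFFI handle. A sketch (the Slide wrapper and its attributes are hypothetical, not part of pyvips or pytorch):

```python
import pickle

class Slide:
    """Wraps a slide by filename; the pyvips handle is never pickled."""

    def __init__(self, filename):
        self.filename = filename
        self._image = None  # opened lazily, once per process

    @property
    def image(self):
        if self._image is None:
            import pyvips  # imported on first use, not at pickle time
            self._image = pyvips.Image.new_from_file(self.filename)
        return self._image

    def __getstate__(self):
        # drop the unpicklable CFFI-backed handle before pickling
        state = self.__dict__.copy()
        state["_image"] = None
        return state

# the wrapper survives a pickle round trip even after the image is opened
slide = Slide("CMU-1.svs")
copy = pickle.loads(pickle.dumps(slide))
print(copy.filename)  # CMU-1.svs
```

Each DataLoader worker then pays one new_from_file when it first touches .image, which, as discussed above, is cheap for pyramid slide images.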