rapidsai / cucim

cuCIM - RAPIDS GPU-accelerated image processing library
https://docs.rapids.ai/api/cucim/stable/
Apache License 2.0

[QST] Read and process whole-slide images #283

Open Gabry993 opened 2 years ago

Gabry993 commented 2 years ago

What is your question? Hello, I'm pretty new to this topic, but I was wondering whether this library can be used to process whole-slide images (WSI). For example, I would need to join different .svs files into one after rotating and translating them. Would I be able to use the "skimage-like" functions provided by your library directly on images that don't fit into memory?

I've tried to run this simple code just to answer my question:

from cucim import CuImage
import math
from cucim.skimage import transform

img = CuImage("/data/FSI-1185830-_174819.svs")
print("img.is_loaded", img.is_loaded)
print("img.device", img.device)
print("img.ndim", img.ndim)
print("img.dims", img.dims)
print("img.shape", img.shape)
tform = transform.SimilarityTransform(scale=1, rotation=math.pi/4,
                                      translation=(img.shape[0]/2, -100))
rotated = transform.warp(img, tform)

but it fails with this output:

img.is_loaded True
img.device cpu
img.ndim 3
img.dims YXC
img.shape [88093, 121407, 3]
Traceback (most recent call last):
  File "git.py", line 18, in <module>
    rotated = transform.warp(img, tform)
  File "/usr/local/lib/python3.8/dist-packages/cucim/skimage/transform/_warps.py", line 931, in warp
    if image.dtype.kind == "c":
AttributeError: 'cucim.clara._cucim.DLDataType' object has no attribute 'kind'

Also, I'm struggling to find examples to better understand how to use the library with large images, so I would be grateful if you could point me in the right direction. Thank you!

gigony commented 2 years ago

Hi @Gabry993 ,

Please see the following welcome notebook to see how you can use it: https://nbviewer.org/github/rapidsai/cucim/blob/branch-22.06/notebooks/Welcome.ipynb

Before feeding it to the scikit-image-like APIs, the image object (which is of CuImage type) needs to be converted to a NumPy (or CuPy) array:

img = cupy.asarray(img)   # for the GPU-based cucim.skimage APIs
# or, if you want to use scikit-image on the CPU:
img = numpy.asarray(img)
Here is a working example:

from cucim import CuImage
import math
from cucim.skimage import transform
import cupy as cp

slide = CuImage("CMU-1-Small-Region.svs")
img = slide.read_region((0, 0), (1024, 1024))
print(img.is_loaded)
print(img.device)
print(img.ndim)
print(img.dims)
print(img.shape)
tform = transform.SimilarityTransform(scale=1, rotation=math.pi/4,
                                      translation=(img.shape[0]/2, -100))
img = cp.asarray(img)  # convert the patch (CuImage) to a CuPy array before warping
rotated = transform.warp(img, tform)
print(rotated.shape)
Output:

True
cpu
3
YXC
[1024, 1024, 3]
(1024, 1024, 3)

Also, I'm struggling to find examples to better understand how to use the library with large images, so I would be grateful if you could point me in the right direction.

If you need to process large images, patches (parts of the whole-slide image) need to be loaded and processed.

The following example can give you a way to load/process small parts of the image.

https://nbviewer.org/github/rapidsai/cucim/blob/branch-22.06/notebooks/Multi-thread_and_Multi-process_Tests.ipynb#Multithreading
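
For illustration, here is a minimal sketch (not taken from the notebook; the file name, tile size, and worker count are placeholders) of reading a grid of patches with a thread pool. The slide is opened inside each worker to keep the threads independent:

from concurrent.futures import ThreadPoolExecutor
import numpy as np
from cucim import CuImage

path = "CMU-1-Small-Region.svs"
tile = 1024
height, width, _ = CuImage(path).shape   # dims are YXC

def load_patch(location):
    # Open the slide inside the worker and read one tile as a NumPy array.
    return np.asarray(CuImage(path).read_region(location, (tile, tile), 0))

# Grid of top-left corners (full tiles only; ragged edges omitted for brevity).
locations = [(x, y) for y in range(0, height - tile + 1, tile)
                    for x in range(0, width - tile + 1, tile)]

with ThreadPoolExecutor(max_workers=4) as pool:
    patches = list(pool.map(load_patch, locations))
    # For a real whole-slide image you would process and discard each patch
    # instead of keeping them all in memory.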

To overcome the (GPU) memory size issue, after processing each patch you can copy the processed patch into a large (NumPy) array of the target image size on the CPU, as sketched below.
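
A minimal sketch of that pattern (the file name and tile size are placeholders, the processing step is a trivial stand-in for real cucim.skimage work, and it assumes OpenSlide-style (x, y) / (width, height) arguments to read_region):

import numpy as np
import cupy as cp
from cucim import CuImage

slide = CuImage("CMU-1-Small-Region.svs")
height, width, _ = slide.shape
tile = 1024

# Pre-allocate a CPU array with the target image size;
# only one patch is resident in GPU memory at a time.
result = np.empty((height, width, 3), dtype=np.uint8)

for y in range(0, height, tile):
    for x in range(0, width, tile):
        w = min(tile, width - x)
        h = min(tile, height - y)
        patch = cp.asarray(slide.read_region((x, y), (w, h), 0))   # host -> GPU
        processed = 255 - patch              # placeholder for your GPU processing
        result[y:y + h, x:x + w] = cp.asnumpy(processed)           # GPU -> host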

If CPU memory is also not enough to hold the image, you can use NumPy's memory-mapping (memmap) feature. You can refer to https://github.com/rapidsai/cucim/blob/branch-22.06/python/cucim/src/cucim/clara/converter/tiff.py, which uses openslide/tifffile and mmap to convert an image (one that is not supported by cuCIM) into a generic TIFF image.
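
A rough sketch of the memmap variant (same hypothetical tile loop as above; "output.dat" is a placeholder path):

import numpy as np
import cupy as cp
from cucim import CuImage

slide = CuImage("CMU-1-Small-Region.svs")
height, width, _ = slide.shape
tile = 1024

# Memory-mapped, on-disk output array instead of an in-RAM array.
result = np.memmap("output.dat", dtype=np.uint8, mode="w+",
                   shape=(height, width, 3))

for y in range(0, height, tile):
    for x in range(0, width, tile):
        w = min(tile, width - x)
        h = min(tile, height - y)
        patch = cp.asarray(slide.read_region((x, y), (w, h), 0))
        result[y:y + h, x:x + w] = cp.asnumpy(255 - patch)   # placeholder GPU op

result.flush()   # make sure everything is written to disk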

If you need to handle a giant image and want to process it in a distributed manner, you can use Dask Array: https://docs.dask.org/en/stable/array.html
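
As a rough sketch (placeholder file name and tile size; the only cuCIM call used is read_region, and each Dask task opens the slide itself so it can also run on distributed workers), you could build a lazy Dask array from delayed patch reads:

import dask
import dask.array as da
import numpy as np
from cucim import CuImage

path = "CMU-1-Small-Region.svs"
tile = 1024
height, width, _ = CuImage(path).shape

@dask.delayed
def read_patch(x, y, w, h):
    # Open the slide inside the task and return the patch as a NumPy array.
    return np.asarray(CuImage(path).read_region((x, y), (w, h), 0))

rows = []
for y in range(0, height, tile):
    row = []
    for x in range(0, width, tile):
        w = min(tile, width - x)
        h = min(tile, height - y)
        row.append(da.from_delayed(read_patch(x, y, w, h),
                                   shape=(h, w, 3), dtype=np.uint8))
    rows.append(da.concatenate(row, axis=1))

lazy_image = da.concatenate(rows, axis=0)   # lazy (height, width, 3) array
# Nothing is read until you compute, e.g. lazy_image[:1024, :1024].compute()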

Please let me know if you have any other questions.

Thank you!