Project-MONAI / MONAI

AI Toolkit for Healthcare Imaging
https://monai.io/
Apache License 2.0
Overview of pathology naming #3868

Closed drbeh closed 2 years ago

drbeh commented 2 years ago

Create an overview of the different pathology components and their naming structure.

drbeh commented 2 years ago

MONAI Pathology-specific Components

| Type | Subtype | Component | Backend | Transition Path |
| --- | --- | --- | --- | --- |
| data | datasets | PatchWSIDataset | - | It can be extended to use any derivative of the ImageReader class; however, some care is required to avoid unnecessary caching for backend libraries with an intrinsic cache (like OpenSlide). Also, reading a subregion can be generalized to include other image types and readers. |
| data | datasets | SmartCachePatchWSIDataset | - | With a generalized PatchWSIDataset, this can be removed. |
| data | datasets | MaskedInferenceWSIDataset | - | This can be generalized as sampling on a low-resolution binary mask, which could work with any dataset and dataloader. It is based on the common concept of image level in digital pathology, but that can be abstracted to create a uniform API. |
| data | image_reader | WSIReader | OpenSlide, cucim.CuImage, TiffFile | Already in MONAI Core |
| handlers | | ProbMapProducer | numpy.ndarray | It has room for improvement, such as multi-GPU support; however, making it generalizable to non-pathology images requires many changes and additional helper classes. |
| transforms | spatial | SplitOnGrid | numpy.ndarray, torch.Tensor | SplitOnGrid and TileOnGrid have much in common; however, merging them is not trivial since they have different implementations with different efficiencies. Combining them would also make the API more complicated, with many options and situations to handle. The combined approach needs further evaluation. Otherwise, we should make the differences clearer and separate their functionality as much as possible. |
| transforms | spatial | TileOnGrid | numpy.ndarray | |
| transforms | color | | | In pathology, color deals with stains, so we can have either transforms/color or transforms/stain; however, to fit a broader biomedical imaging context, color might be the better choice, also in case we add any transform that manipulates color profiles. |
| transforms | color | HEStainExtractor | numpy.ndarray | It is worth exploring the possibility of having the same API as scikit-image or even moving this component there. Efficiency needs to be taken into account for this component. |
| transforms | color | StainNormalizer | numpy.ndarray | It can be generalized to color images of any dimensionality with three color channels. |
| transforms | utility | CuCIM | cuCIM transforms | Already in MONAI Core |
| transforms | utility | RandCuCIM | randomized cuCIM transforms | Already in MONAI Core |
| metrics | FROC | LesionFROC | numpy.ndarray, torch.Tensor | Specific to 2D color images, but there is no fundamental barrier to making it 3D. |
| metrics | FROC | compute_fp_tp_probs | numpy.ndarray, torch.Tensor | Already in MONAI Core |
| metrics | FROC | compute_froc_curve_data | numpy.ndarray, torch.Tensor | Already in MONAI Core |
| metrics | FROC | compute_froc_score | numpy.ndarray, torch.Tensor | Already in MONAI Core |
| networks | nets | TorchVisionFCModel | - | Already in MONAI Core |

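
To illustrate the idea of sampling on a low-resolution binary mask mentioned for MaskedInferenceWSIDataset, here is a minimal sketch. The function name `mask_to_patch_locations` and its parameters are hypothetical and not part of MONAI; it assumes the mask is an integer factor `ratio` smaller than the full-resolution image and does not clip patches at the image border.

```python
import numpy as np


def mask_to_patch_locations(mask: np.ndarray, ratio: int, patch_size: int) -> np.ndarray:
    """Map foreground pixels of a low-resolution mask to full-resolution patch corners.

    Hypothetical helper, not a MONAI API. Returns an (N, 2) integer array of
    top-left (row, col) coordinates, one patch centred on each foreground pixel.
    """
    rows, cols = np.nonzero(mask)
    # Centre of each mask pixel expressed in full-resolution coordinates.
    centers = (np.stack([rows, cols], axis=1) + 0.5) * ratio
    # Shift from patch centre to patch top-left corner (no border clipping).
    return np.round(centers - patch_size / 2).astype(int)


# One foreground pixel at mask position (1, 0), mask downsampled by a factor of 4.
mask = np.array([[0, 0], [1, 0]])
locations = mask_to_patch_locations(mask, ratio=4, patch_size=2)
print(locations.tolist())  # [[5, 1]]
```

Because the helper only needs a 2D mask and a scale factor, the same locations could in principle feed any dataset that reads patches by coordinate, which is the generalization the table entry points to.
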
wyli commented 2 years ago

Thanks for the summary. It would be great to specify the proposed source code file locations and namespaces. For example, MONAI's LMDBDataset is defined in

monai/data/dataset.py

and it is accessible via

from monai.data import LMDBDataset
from monai.data.dataset import LMDBDataset

Do you envision any alternative submodule aliases for the same module, e.g. monai.data.wsi.PatchDataset? And where should the actual files go, given that monai/data/dataset.py already seems quite lengthy? cc @ericspod
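
For context on the aliasing question, two import paths can resolve to the same class when a package re-exports it from a submodule, presumably what monai/data/__init__.py does for LMDBDataset. The sketch below reproduces that pattern with an in-memory package; the name `toypkg` and the placeholder class are assumptions for illustration only, not MONAI code.

```python
import sys
import types

# Build a hypothetical package "toypkg" in memory, standing in for a real
# package tree: toypkg/data/dataset.py defines the class.
pkg = types.ModuleType("toypkg")
pkg.__path__ = []  # a __path__ attribute marks the module as a package
data = types.ModuleType("toypkg.data")
data.__path__ = []
dataset = types.ModuleType("toypkg.data.dataset")


class LMDBDataset:
    """Placeholder class; the real implementation is irrelevant here."""


dataset.LMDBDataset = LMDBDataset
data.dataset = dataset
# Equivalent to "from .dataset import LMDBDataset" in toypkg/data/__init__.py:
# this re-export is what creates the shorter alias.
data.LMDBDataset = dataset.LMDBDataset
pkg.data = data
sys.modules.update({"toypkg": pkg, "toypkg.data": data, "toypkg.data.dataset": dataset})

from toypkg.data import LMDBDataset as short_path
from toypkg.data.dataset import LMDBDataset as long_path

print(short_path is long_path)  # True
```

With this pattern, moving the definition into a new subpackage only requires updating the re-export, so the public import path can stay stable while the files are reorganised.
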

ericspod commented 2 years ago

Adding more subdirectories is the way to go to organise components and make explicit what each contains, so monai/data/wsi, monai/transforms/color and others make sense. Many of our existing transform files could do with reorganisation along thematic lines like this anyway, so it won't be out of place.

drbeh commented 2 years ago

@wyli @ericspod, I totally agree with you. monai/transforms/color sounds good to me, but let me check whether monai/data/wsi could abstract away the WSI part and instead become patch-based datasets under monai/data/patch. I will check and keep you posted.

drbeh commented 2 years ago

@wyli @ericspod, on a separate note, since there are many interconnected changes, does it make sense to create a new branch for pathology (then merge it after changes are done)?

drbeh commented 2 years ago

I evaluated the components and realized that the specific needs of pathology datasets are tied more to the whole slide image than to patches, so I think monai/data/wsi/datasets.py is an appropriate place to host these datasets. Although patches can be used in radiology workflows too, here we are dealing with patches from images that cannot be loaded into memory. I will prepare a diagram of the pathology component structure to give a better overall view.
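
The point about images that cannot be loaded into memory can be sketched as a map-style dataset that reads one region per item. The names `LazyPatchDataset`, `RegionReader`, and `read_from_array` are hypothetical and not MONAI classes, and a small in-memory array stands in for a slide purely so the example runs; a real reader would fetch the region from disk.

```python
from typing import Callable, Sequence, Tuple

import numpy as np

# Hypothetical reader signature: (top-left (row, col), (height, width)) -> patch.
RegionReader = Callable[[Tuple[int, int], Tuple[int, int]], np.ndarray]


class LazyPatchDataset:
    """Map-style dataset that reads one patch per item, never the whole slide."""

    def __init__(self, locations: Sequence[Tuple[int, int]], patch_size: int, read_region: RegionReader):
        self.locations = list(locations)
        self.patch_size = patch_size
        self.read_region = read_region

    def __len__(self) -> int:
        return len(self.locations)

    def __getitem__(self, index: int) -> np.ndarray:
        # Only the requested region is read; no caching is done here, so any
        # caching is left to the backend reader.
        return self.read_region(self.locations[index], (self.patch_size, self.patch_size))


# Stand-in for a slide (assumption for the demo only).
slide = np.arange(64).reshape(8, 8)


def read_from_array(location: Tuple[int, int], size: Tuple[int, int]) -> np.ndarray:
    row, col = location
    height, width = size
    return slide[row:row + height, col:col + width]


dataset = LazyPatchDataset([(0, 0), (2, 4)], patch_size=2, read_region=read_from_array)
patch = dataset[1]
print(patch.tolist())  # [[20, 21], [28, 29]]
```

Keeping the reader behind a callable is one way to let the dataset stay independent of the backend, while the whole-slide-specific logic lives with the reader.
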

drbeh commented 2 years ago

Refer to #4005 for more details.