ENHANCE-PET / MOOSE

MOOSE (Multi-organ objective segmentation) is a data-centric AI solution that generates multilabel organ segmentations to facilitate systemic TB whole-person research. The pipeline is based on nnUNet and can segment 120 unique tissue classes from a whole-body 18F-FDG PET/CT image.
https://enhance.pet
GNU General Public License v3.0

Feat: Reduce memory requirement for MOOSE during inference #33

Closed LalithShiyam closed 1 year ago

LalithShiyam commented 2 years ago

Problem: MOOSE is based on nnUNet, and inference currently takes a lot of memory on total-body datasets (uEXPLORER/QUADRA; upper limit: 256 GB). Most users do not have that much memory available. The memory bottleneck is explained here: https://github.com/MIC-DKFZ/nnUNet/issues/896

Solution: The fix seems to be a resampling scheme that is faster and more memory-efficient than the current skimage-based one. People have already suggested ways to improve speed based on https://pytorch.org/docs/stable/generated/torch.nn.functional.interpolate.html, and a detailed description can be found here: https://github.com/MIC-DKFZ/nnUNet/issues/1093.

Memory consumption is still a problem, though. @dhaberl @Keyn34: consider Nvidia's cuCIM (cucim.skimage.transform.resize) in combination with Dask for block processing. Chunks consume far less memory, and I have used this approach for kinetic modelling.

Impact: This would speed up inference and hopefully also remove the memory bottleneck, both for MOOSE and for any model inference via nnUNet.
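To make the block-processing idea concrete, here is a minimal sketch of resampling a volume one output z-slab at a time. It is not the MOOSE or nnUNet implementation. It swaps in SciPy's affine_transform on the CPU for cucim.skimage.transform.resize and a plain loop for Dask scheduling, and it uses linear interpolation (order=1), because a cubic spline prefilter would need a larger halo per block. The function name resample_in_slabs and the slab parameter are hypothetical.

```python
import numpy as np
from scipy.ndimage import affine_transform


def resample_in_slabs(volume, out_shape, slab=32):
    """Linearly resample `volume` to `out_shape`, one output z-slab at a time.

    Hypothetical helper: only a thin input slab is converted to float32 at
    any moment, so peak working memory scales with `slab`, not with the
    full input volume.
    """
    out_shape = tuple(int(s) for s in out_shape)
    in_extent = np.array(volume.shape, dtype=float) - 1.0
    out_extent = np.maximum(np.array(out_shape, dtype=float) - 1.0, 1.0)
    # Corner-aligned grid: output index o samples input index o * step.
    step = in_extent / out_extent

    out = np.empty(out_shape, dtype=np.float32)
    for z0 in range(0, out_shape[0], slab):
        z1 = min(z0 + slab, out_shape[0])
        # Input rows needed by this output slab, plus a one-voxel halo.
        lo = max(int(np.floor(z0 * step[0])) - 1, 0)
        hi = min(int(np.ceil((z1 - 1) * step[0])) + 2, volume.shape[0])
        block = np.asarray(volume[lo:hi], dtype=np.float32)
        # Shift the z offset so slab-local output indices map into `block`.
        offset = (z0 * step[0] - lo, 0.0, 0.0)
        out[z0:z1] = affine_transform(
            block,
            step,  # 1-D matrix: per-axis scale factors
            offset=offset,
            output_shape=(z1 - z0,) + out_shape[1:],
            order=1,
            mode="nearest",
        )
    return out
```

If `volume` is a np.memmap, only the rows of the current slab are read from disk. The output array is still allocated in full here; it could be a memmap as well if that is the limiting factor.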

dhaberl commented 2 years ago

Here are some benchmark values I got for a CT with size (512, 512, 768) and spacing (1.5234375, 1.5234375, 2.0), resampled to size (520, 520, 1024) and spacing (1.5, 1.5, 1.5):

cucim.resize: 0.85 (0.09) s [on GPU]
SimpleITK.resample: 7.65 (0.12) s [on CPU with sitk.ProcessObject.SetGlobalDefaultNumberOfThreads(MAX_THREADS)]
skimage.resize: 54.85 (0.77) s [default as I could not find any multithreading option]
scipy.zoom: 61.59 (0.50) s [default as I could not find any multithreading option]

Values are mean (std) for n=10 runs

Interpolation type:
Bi-cubic (= order 3): cucim, skimage, scipy
sitk.sitkBSpline: SimpleITK
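For anyone who wants to reproduce numbers of this kind, the sketch below shows how the target grid size follows from the spacings and how a mean (std) timing over n runs could be collected. The helper names resampled_shape and benchmark are hypothetical, and the use of the sample standard deviation is an assumption, since the thread does not say which estimator was used.

```python
import statistics
import time


def resampled_shape(shape, spacing, new_spacing):
    """Grid size covering the same physical extent at the new spacing."""
    return tuple(round(n * s / t) for n, s, t in zip(shape, spacing, new_spacing))


def benchmark(fn, n_runs=10):
    """Return (mean, std) wall-clock seconds over `n_runs` calls of fn().

    Uses the sample standard deviation, so n_runs must be at least 2.
    """
    times = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return statistics.mean(times), statistics.stdev(times)


target = resampled_shape(
    (512, 512, 768), (1.5234375, 1.5234375, 2.0), (1.5, 1.5, 1.5)
)
print(target)  # (520, 520, 1024)
```

Any of the resampling calls compared above can be wrapped in a zero-argument function or lambda and passed to benchmark. For GPU code, timings are only meaningful if the device is synchronized before the clock is read.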

LalithShiyam commented 2 years ago

cuCIM it is! Feel free to open a PR.

LalithShiyam commented 1 year ago

Sorted with Dask! Check out moosev2: pip install moosez