How to speed up inference?

yangshunDragon commented 3 years ago

Very nice project and tools, thanks for sharing! I am wondering how to improve the inference speed. We ran a test with 40 3D images on a V100 GPU (32 GB); inference took 56 seconds per image, which is too slow for us. Secondly, we want to reduce the model size so that it uses less memory and runs faster without too much of a performance drop. What do you think?

yangshunDragon commented 3 years ago

@christianpayer BTW, we want to use less GPU memory because we have a 6 GB GPU memory limit. We found that the vertebrae localization phase used 9.1 GB and the vertebrae segmentation phase used 5 GB, so the localization phase exceeds our 6 GB limit.

christianpayer commented 3 years ago

Hi, thanks for using our code!

Regarding inference runtime: neither the algorithm nor the framework is optimized for speed. I think the bottleneck is image loading, preprocessing (Gaussian smoothing and resampling), and postprocessing (resampling), all of which run on the CPU. The framework uses SimpleITK for image processing, which only supports the CPU. Most of these operations could be implemented on the GPU, though, so you could try porting the required ones (Gaussian smoothing, resizing) to the GPU, which should already make inference much faster.
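Before porting anything to the GPU, it may be worth confirming where the 56 seconds actually go. The following is a minimal sketch of a per-stage timer using only the Python standard library. `StageTimer` and the three stand-in functions (`load_and_preprocess`, `run_network`, `postprocess`) are hypothetical names for illustration and are not part of the framework; replace the stand-ins with the real calls from the inference script.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named pipeline stage."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def stage(self, name):
        # Measure the wall-clock time spent inside the `with` block.
        start = time.perf_counter()
        try:
            yield
        finally:
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def report(self):
        """Return (stage, total seconds, share of total), most costly first."""
        grand_total = sum(self.totals.values()) or 1.0
        rows = [(name, t, t / grand_total) for name, t in self.totals.items()]
        return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical stand-ins for the real pipeline stages (assumptions).
def load_and_preprocess():
    time.sleep(0.02)  # e.g. image loading, Gaussian smoothing, resampling


def run_network():
    time.sleep(0.01)  # e.g. the forward pass on the GPU


def postprocess():
    time.sleep(0.01)  # e.g. resampling the prediction back


timer = StageTimer()
for _ in range(3):  # loop over a few images
    with timer.stage("load+preprocess"):
        load_and_preprocess()
    with timer.stage("network"):
        run_network()
    with timer.stage("postprocess"):
        postprocess()

for name, seconds, share in timer.report():
    print(f"{name:16s} {seconds:7.3f} s  {share:6.1%}")
```

If the CPU stages dominate the report, moving smoothing and resizing to the GPU is likely to pay off; if the network stage dominates, reducing the image size is the more promising direction. Note that the timings are only meaningful if the GPU call blocks until its result is available.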

Regarding the memory requirements: we implemented the algorithm for 12 GB GPUs. You could reduce memory use by using smaller image sizes or coarser resolutions. If you are willing to retrain the networks, you could increase 'image_spacing' while reducing 'image_size' accordingly. For vertebrae localization, you could also reduce the image size during inference only. To do this, lower the values in the line 'self.max_image_size_for_cropped_test = [128, 128, 448]' in the inference script. Images that do not fit this size will then be cropped and processed with a sliding-window approach.
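To get a feel for these trade-offs, here is a rough sketch in plain Python. It assumes that activation memory scales roughly with the number of voxels per crop, and it uses a simple evenly spaced tiling scheme that is not necessarily what the framework does internally. The helper names, the reduced size `[96, 96, 256]`, the spacing values, and the example image shape are all hypothetical; only `[128, 128, 448]` comes from the inference script quoted above.

```python
import math


def voxel_count(image_size):
    """Number of voxels in a crop of the given size."""
    return math.prod(image_size)


def size_for_spacing(image_size, old_spacing, new_spacing):
    """Per-axis size that covers the same physical extent at a new spacing.

    Physical extent per axis is size * spacing, so a coarser spacing
    allows a proportionally smaller size.
    """
    return [
        math.ceil(size * old / new)
        for size, old, new in zip(image_size, old_spacing, new_spacing)
    ]


def window_starts(length, window):
    """Start offsets along one axis so that windows of `window` cover `length`.

    Windows are spread evenly and may overlap; the last one ends at the border.
    """
    if length <= window:
        return [0]
    n = math.ceil(length / window)
    return [(i * (length - window)) // (n - 1) for i in range(n)]


default_size = [128, 128, 448]  # value quoted in the thread
reduced_size = [96, 96, 256]    # hypothetical smaller setting

# Approximate per-crop memory ratio under the voxel-count assumption.
ratio = voxel_count(reduced_size) / voxel_count(default_size)
print(f"voxels per crop: {ratio:.0%} of the default")

# Retraining option: doubling a (hypothetical) spacing halves each axis.
print(size_for_spacing(default_size, [1.0, 1.0, 1.0], [2.0, 2.0, 2.0]))

# Inference-only option: crops needed for a hypothetical resampled image.
image_shape = [150, 140, 600]
starts = [window_starts(n, w) for n, w in zip(image_shape, reduced_size)]
num_crops = math.prod(len(s) for s in starts)
print(starts, "->", num_crops, "crops")
```

The trade-off is visible in the last line: a smaller crop size lowers peak memory, but more crops have to be processed per image, so inference time tends to go up rather than down.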

christianpayer / MedicalDataAugmentationTool-VerSe

How to speed up inference? #17