MIC-DKFZ / batchgenerators

A framework for data augmentation for 2D and 3D image classification and segmentation
Apache License 2.0

Support for NVIDIA Dali? #46

Closed: adhusch closed this issue 4 years ago

adhusch commented 5 years ago

Hi,

Are there any plans to support the NVIDIA DALI library for fast GPU-based augmentation in the future?

(https://github.com/NVIDIA/DALI, https://www.basicml.com/performance/2019/04/16/pytorch-data-augmentation-with-nvidia-dali)

Best,
Andy

adhusch commented 5 years ago

Ah, I guess that will be a longer way off, as I just learned it doesn't support 3D yet, sorry.

FabianIsensee commented 5 years ago

Hi Andy,

I was considering using GPU data augmentation before but concluded that it is not necessary, at least for us. My view on this is the following: every ms the GPU spends not training is time wasted. We also typically have extremely severe GPU memory limitations, and anything that takes up additional GPU memory is undesirable. I have yet to find an application where a well-designed CPU data augmentation pipeline is not fast enough for the GPU.

We also design our entire GPU infrastructure with this in mind (6 physical CPU cores per GPU is our minimum, except the DGX-2, which has really low CPU power :-( but even that is enough). CPU is not too expensive, especially if you don't get the top-of-the-line models (and if you go AMD, but we have not yet managed to do that because things here move too slowly...). If you are reasonable and properly optimize the code (and if you use batchgenerators of course :-D) then I currently don't see a need for this.

That doesn't mean that I am completely opposed to it. It is just that I am currently doing most of the work for batchgenerators and I don't have time for this right now. If you wish to prepare a pull request I'd be happy to merge it.

Best,
Fabian
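
For readers who want to check the "CPU is fast enough" argument against their own setup, here is a rough back-of-the-envelope sketch. It is not part of batchgenerators; the function names and all numbers are hypothetical, and it assumes augmentation parallelises perfectly across workers and that batches are prefetched, so only average throughput matters.

```python
import math


def min_workers_needed(batch_size, aug_seconds_per_sample, gpu_seconds_per_step):
    """Smallest number of CPU workers that keeps the GPU from waiting.

    Hypothetical helper: assumes perfect scaling across workers.
    """
    cpu_seconds_per_batch = batch_size * aug_seconds_per_sample
    return max(1, math.ceil(cpu_seconds_per_batch / gpu_seconds_per_step))


def gpu_idle_fraction(batch_size, aug_seconds_per_sample,
                      gpu_seconds_per_step, num_workers):
    """Fraction of wall-clock time the GPU spends waiting for data."""
    # Average time the CPU side needs to deliver one batch.
    cpu_seconds_per_batch = batch_size * aug_seconds_per_sample / num_workers
    # The slower side determines how long one training iteration takes.
    seconds_per_iteration = max(cpu_seconds_per_batch, gpu_seconds_per_step)
    return (seconds_per_iteration - gpu_seconds_per_step) / seconds_per_iteration


# Made-up example: 3D patches, batch size 2, 0.6 s of CPU augmentation
# per sample, 0.25 s per training step on the GPU.
workers = min_workers_needed(2, 0.6, 0.25)
idle_with_6 = gpu_idle_fraction(2, 0.6, 0.25, num_workers=6)
idle_with_2 = gpu_idle_fraction(2, 0.6, 0.25, num_workers=2)
```

With these made-up timings, the estimate says the GPU is never starved once enough workers are available, while too few workers leave it idle for a large share of each iteration. Measured timings from an actual pipeline should replace the placeholders before drawing any conclusion.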