MIC-DKFZ / nnUNet

Optional Torch Multiprocessing in nnUNet for Improved Security and Compatibility #2556

Open · LennyN95 opened this issue 1 week ago

LennyN95 commented 1 week ago

Dear nnUNet team,

We are currently facing challenges when running nnUNet in Docker containers. Because torch.multiprocessing requires flags such as --ipc=host or --shm-size (as reported and discussed in previous issues), deploying models on our mhub.ai platform is difficult. We hesitate to recommend --ipc=host: it is a simple workaround, but it shares the host's IPC namespace with the container and thereby removes a security restriction, so it should be used with caution. Manually specifying --shm-size, on the other hand, puts an additional burden on users and makes inference even more complicated from MHub's point of view.

We currently have several contributions to our portfolio on hold pending this discussion. In our particular case, inference is executed sequentially and therefore does not strictly require multiprocessing. We propose making the use of torch.multiprocessing optional during inference.
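To make the request concrete, here is a minimal sketch of what such an opt-out could look like. The names (`run_inference`, `predict_fn`, `num_workers`) are illustrative only and are not existing nnUNet API; the point is simply that a worker count of 0 could fall back to a plain sequential loop in the main process.

```python
from typing import Callable, Iterable, List

import torch.multiprocessing as mp


def run_inference(cases: Iterable[str],
                  predict_fn: Callable[[str], str],
                  num_workers: int = 3) -> List[str]:
    """Run predict_fn over all cases, optionally without any worker processes."""
    if num_workers == 0:
        # Sequential fallback: no child processes are spawned, so no shared-memory
        # segments are allocated and the container can run with default IPC settings.
        return [predict_fn(case) for case in cases]

    # Current behaviour: fan out over a torch.multiprocessing pool. Tensors exchanged
    # between processes go through shared memory, which is where the
    # --ipc=host / --shm-size requirement comes from.
    with mp.get_context("spawn").Pool(num_workers) as pool:
        return pool.map(predict_fn, cases)
```

Calling such a function with num_workers=0 would match MHub's strictly sequential use case and avoid the Docker shared-memory flags entirely, while the default would leave the current behaviour unchanged.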

We welcome any comments and an open discussion on this topic.

Thank you very much! Leo.

surajpaib commented 4 days ago

Hi Team,

I would also like to raise this issue, but from the perspective of user debugging. Having an option to set num_workers=0 in the inference dataloader would let users debug error traces much more easily, since exceptions would no longer be raised inside worker processes.
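To illustrate the debugging benefit, here is a minimal, self-contained example using the plain torch.utils.data.DataLoader rather than nnUNet's own dataloader; FailingDataset is a made-up toy dataset used only to show where the traceback ends up.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class FailingDataset(Dataset):
    """Toy dataset whose third sample raises, to demonstrate traceback behaviour."""

    def __len__(self) -> int:
        return 4

    def __getitem__(self, index: int) -> torch.Tensor:
        if index == 2:
            raise RuntimeError(f"bad sample at index {index}")
        return torch.zeros(1)


# With num_workers=0 the loop below fails with a plain RuntimeError and a direct
# stack trace into __getitem__, and breakpoints set there are actually hit.
# With num_workers > 0 the same error is re-raised from a worker process
# ("Caught RuntimeError in DataLoader worker process ..."), which makes the
# original trace harder to follow.
for batch in DataLoader(FailingDataset(), batch_size=1, num_workers=0):
    print(batch.shape)
```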

I see a few issues that would probably benefit from this as well: https://github.com/MIC-DKFZ/nnUNet/issues/2509 https://github.com/MIC-DKFZ/nnUNet/issues/2514 https://github.com/MIC-DKFZ/nnUNet/issues/2182

Looking forward to hearing what you think.

@Lars-Kraemer