MIC-DKFZ / nnUNet

Apache License 2.0

Some questions about workernum #1907

Closed xl-lei closed 6 months ago

xl-lei commented 10 months ago

Hello, I would like to ask how nnUNet adjusts the number of concurrent processes and threads during training. Are there any relevant environment variables? I also want to know the meaning of OMP_NUM_THREADS, nnUNet_def_n_proc, default_num_processes. Thank you very much!

sten2lu commented 9 months ago

Hi @xinglianglei,

Thanks for your question. Multiprocessing in nnU-Net is indeed an essential factor and can be a bit overwhelming, so I will try to break it down for you:

  1. OMP_NUM_THREADS is an environment variable that is not specific to Python or nnU-Net; it belongs to OpenMP (Open Multi-Processing). It sets the maximum number of threads that OpenMP-enabled parallel code may use during execution. NumPy's linear algebra backend (e.g., OpenBLAS or MKL, depending on the build) and PyTorch's CPU operations can use OpenMP, so a higher value might speed up some operations during preprocessing, postprocessing and training. Note that the limit applies per process, so combining a high value with many worker processes can oversubscribe the CPU.
  2. nnUNet_def_n_proc is an environment variable that is specific to nnU-Net. If it is set, it overrides the default_num_processes variable in the Python code.
  3. default_num_processes is a variable used in nnU-Net as the default number of worker processes for various operations, such as preprocessing and fingerprint extraction.
  4. nnUNet_n_proc_DA is an environment variable that sets the number of worker processes used to load and augment data during training.
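
To make the relationship between these variables more concrete, here is a minimal sketch of the "environment variable overrides a default" pattern described above. It is not copied from the nnU-Net codebase: the helper `resolve_worker_count` and the fallback values 8 and 12 are hypothetical and chosen only for illustration, so please check the code for the actual defaults.

```python
import os

# Thread limits such as OMP_NUM_THREADS are read when the numerical
# libraries are loaded, so set them before importing NumPy or PyTorch.
# setdefault keeps any value that was already exported in the shell.
os.environ.setdefault("OMP_NUM_THREADS", "1")


def resolve_worker_count(env_var, fallback, environ=None):
    """Return the integer value of env_var if it is set, else fallback.

    Hypothetical helper, not part of nnU-Net.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(env_var)
    return fallback if value is None else int(value)


# Fallback values below are assumptions made for this example only.
default_num_processes = resolve_worker_count("nnUNet_def_n_proc", fallback=8)
num_da_workers = resolve_worker_count("nnUNet_n_proc_DA", fallback=12)

print("default_num_processes:", default_num_processes)
print("data augmentation workers:", num_da_workers)
print("OMP_NUM_THREADS:", os.environ["OMP_NUM_THREADS"])
```

In practice you would usually export the variables in your shell before starting preprocessing or training, rather than setting them inside Python.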

I hope this answers your question. If you want to go into more detail, I would advise you to familiarize yourself with the codebase itself and read the corresponding parts of the documentation (e.g., for OMP_NUM_THREADS and NumPy).

Best regards,

Carsten