MIC-DKFZ / nnUNet

Apache License 2.0

About resuming pre-processing #2513

Closed: Luffy03 closed this issue 1 month ago

Luffy03 commented 2 months ago

Hi, many thanks for your great work. I have a large-scale dataset that I want to preprocess with the nnUNet pipeline, but the preprocessing keeps getting stuck because of problems with my device. In this case, how can I resume the preprocessing and continue with the remaining data? Otherwise, I have to process the whole dataset all over again.

Lars-Kraemer commented 2 months ago

Hey @Luffy03,

unfortunately, there is no way to resume the preprocessing, because there is no guarantee that the abort did not leave corrupt files behind. You can use nnUNetv2_preprocess instead of nnUNetv2_plan_and_preprocess to skip the fingerprint extraction. Additionally, you can use the -c option to preprocess only the configuration you actually need, which shortens the preprocessing time. With the -np option you can reduce the number of processes so that less RAM is used, since running out of RAM is often the reason the preprocessing aborts (see nnUNetv2_preprocess -h). If none of this helps, you can manually split your dataset into several smaller datasets, run the preprocessing for each one, and reassemble them afterwards. However, you have to make sure that the nnUNet plans are created on the whole dataset to avoid inconsistencies.
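A rough sketch of the two suggestions above, in case it helps. The first helper only assembles the nnUNetv2_preprocess call with the -d, -c and -np options; the dataset ID 1, the configuration 3d_fullres and the process count 2 are placeholder assumptions, so please check them against nnUNetv2_preprocess -h for your installation. The second helper is purely hypothetical (it is not part of nnUNet) and only shows how case identifiers could be grouped into smaller chunks; creating the smaller dataset folders and reassembling the results is not covered here.

```python
import shlex


def build_preprocess_cmd(dataset_id, configurations=("3d_fullres",), num_processes=2):
    """Assemble an nnUNetv2_preprocess call as an argument list.

    dataset_id, configurations and num_processes are placeholders;
    adjust them to your own dataset and hardware.
    """
    cmd = ["nnUNetv2_preprocess", "-d", str(dataset_id)]
    # -c restricts preprocessing to the listed configurations
    cmd += ["-c", *configurations]
    # -np lowers the number of worker processes to reduce RAM usage
    cmd += ["-np", str(num_processes)]
    return cmd


def split_cases(case_ids, chunk_size):
    """Hypothetical helper: group case identifiers into chunks of at most chunk_size."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    ordered = sorted(case_ids)
    return [ordered[i:i + chunk_size] for i in range(0, len(ordered), chunk_size)]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(build_preprocess_cmd(1, ("3d_fullres",), 2)))
    # Example grouping of made-up case identifiers.
    print(split_cases(["case_003", "case_001", "case_002"], 2))
```

The argument list can be passed to subprocess.run(cmd, check=True) once the values look right. As noted above, the plans should still be created once on the whole dataset before the chunks are preprocessed.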

Best, Lars

Luffy03 commented 2 months ago

Thank you so much for your prompt response! It aligns perfectly with what I'm currently working on (lol). It seems this is the only way to solve it. And I also want to express my heartfelt appreciation for your outstanding work.