MIC-DKFZ / MedNeXt

[MICCAI 2023] MedNeXt is a fully ConvNeXt architecture for 3D medical image segmentation.
https://arxiv.org/pdf/2303.09975
Apache License 2.0

mednextv1_train_DDP can not work #24

Closed lianjiejie closed 2 months ago

lianjiejie commented 2 months ago

Hi, I tried to use mednextv1_train_DDP with the following command: mednextv1_train_DDP 3d_fullres nnUNetTrainerV2_MedNeXt_M_kernel3 500 1 -p nnUNetPlansv2.1_trgSp_1x1x1 but it does not work. It does not seem to resolve the class nnUNetTrainerV2_DDP from the command.

###############################################
I am running the following nnUNet: 3d_fullres
My trainer class is: <class 'nnunet_mednext.training.network_training.MedNeXt.nnUNetTrainerV2_MedNeXt.nnUNetTrainerV2_MedNeXt_M_kernel3'>
For that I will be using the following configuration:
num_classes: 48
modalities: {0: 'CT'}
use_mask_for_norm OrderedDict([(0, False)])
keep_only_largest_region None
min_region_size_per_class None
min_size_per_class None
normalization_schemes OrderedDict([(0, 'CT')])
stages...

stage: 0 {'batch_size': 2, 'num_pool_per_axis': [4, 5, 5], 'patch_size': [128, 128, 128], 'median_patient_size_in_voxels': array([196, 204, 204]), 'current_spacing': array([1.66107814, 1.66107814, 1.66107814]), 'original_spacing': array([1., 1., 1.]), 'do_dummy_2D_data_aug': False, 'pool_op_kernel_sizes': [[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2], [1, 2, 2]], 'conv_kernel_sizes': [[3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]]}

stage: 1 {'batch_size': 2, 'num_pool_per_axis': [4, 5, 5], 'patch_size': [128, 128, 128], 'median_patient_size_in_voxels': array([325, 339, 339]), 'current_spacing': array([1., 1., 1.]), 'original_spacing': array([1., 1., 1.]), 'do_dummy_2D_data_aug': False, 'pool_op_kernel_sizes': [[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2], [1, 2, 2]], 'conv_kernel_sizes': [[3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3], [3, 3, 3]]}

I am using stage 1 from these plans
I am using batch dice + CE loss

I am using data from this folder: /data/lianzejie/mednext_data/preprocessed/Task500_bone_seg/nnUNetData_plans_v2.1_trgSp_1x1x1
###############################################
Traceback (most recent call last):
  File "/data/lianzejie/envs/mednext/bin/mednextv1_train_DDP", line 33, in <module>
    sys.exit(load_entry_point('mednextv1', 'console_scripts', 'mednextv1_train_DDP')())
  File "/home/lianzejie/mednext/nnunet_mednext/run/run_training_DDP.py", line 149, in main
    trainer = trainer_class(plans_file, fold, local_rank=args.local_rank, output_folder=output_folder_name,
  File "/home/lianzejie/mednext/nnunet_mednext/training/network_training/MedNeXt/nnUNetTrainerV2_MedNeXt.py", line 25, in __init__
    super().__init__(*args, **kwargs)
TypeError: __init__() got an unexpected keyword argument 'local_rank'
(mednext) lianzejie@aa-SYS-4029GP-TRT:~$
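
The traceback is consistent with a trainer whose parent `__init__` has no `local_rank` parameter: the DDP entry point passes `local_rank=...`, the MedNeXt trainer forwards `*args, **kwargs` unchanged, and the parent rejects the extra keyword. The sketch below reproduces that pattern in isolation. All class and function names in it (`PlainTrainer`, `DDPTrainer`, `MedNeXtLikeTrainer`, `accepts_local_rank`, `try_build`) are hypothetical stand-ins, not code from nnUNet or MedNeXt, and the actual base class of the MedNeXt trainer is an assumption here.

```python
import inspect


class PlainTrainer:
    """Hypothetical stand-in for a single-GPU trainer base class."""

    def __init__(self, plans_file, fold, output_folder=None):
        self.plans_file = plans_file
        self.fold = fold
        self.output_folder = output_folder


class DDPTrainer(PlainTrainer):
    """Hypothetical stand-in for a DDP-aware base class that takes local_rank."""

    def __init__(self, plans_file, fold, local_rank, output_folder=None):
        super().__init__(plans_file, fold, output_folder=output_folder)
        self.local_rank = local_rank


class MedNeXtLikeTrainer(PlainTrainer):
    """Forwards all arguments to its parent, like line 25 in the traceback."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)


def accepts_local_rank(trainer_class):
    """Walk the MRO and report whether a constructor names local_rank.

    A constructor that only takes **kwargs is assumed to forward them,
    so the search continues with the next class in the MRO.
    """
    for klass in trainer_class.__mro__:
        init = klass.__dict__.get("__init__")
        if init is None:
            continue
        params = inspect.signature(init).parameters
        if "local_rank" in params:
            return True
        forwards = any(
            p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
        )
        if not forwards:
            return False
    return False


def try_build(trainer_class):
    """Call the constructor the way a DDP launcher would; return the outcome."""
    try:
        trainer_class("plans.pkl", 1, local_rank=0, output_folder="out")
        return "ok"
    except TypeError as err:
        return str(err)


if __name__ == "__main__":
    for cls in (MedNeXtLikeTrainer, DDPTrainer):
        print(cls.__name__, accepts_local_rank(cls), try_build(cls))
```

Under these assumptions, a check like `accepts_local_rank` run before launching would flag that the selected trainer class cannot be constructed by a DDP entry point, instead of failing inside `__init__`.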

saikat-roy commented 2 months ago

I'm sorry, but I have not used nnUNet v1's DDP in my work and unfortunately can't support issues with it. However, I recommend that you adopt the architecture for nnUNetv2 (currently the main version of nnUNet) and use its DDP, which is much easier to use.