MIC-DKFZ / nnUNet

Apache License 2.0

How does nnUNet use multi-modal BraTS data for training, and what needs to be done with the data and labels? #1596

Closed dafhafhajfajfna closed 1 year ago

dafhafhajfajfna commented 1 year ago

I ran into a problem with the multi-modal part of data preprocessing. How do I combine the 4 modalities (T1, T2, T1ce, FLAIR) of the BraTS dataset and create the json file so that each modality ends up in its own channel and is handled correctly during preprocessing?

FabianIsensee commented 1 year ago

Have you seen this script already? https://github.com/MIC-DKFZ/nnUNet/blob/master/nnunetv2/dataset_conversion/Dataset137_BraTS21.py It's all in there :-)
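
In short, the conversion gives each modality its own channel index in the file name and declares those channels in dataset.json. A rough sketch of the resulting layout (channel order, label names and case count here are illustrative; the linked script defines the exact mapping):

```python
# Sketch of the nnUNet v2 raw-data layout a BraTS-style conversion produces.
# Channel order and label names below are illustrative, not taken from the script.
import json
from pathlib import Path

dataset_dir = Path("nnUNet_raw/Dataset137_BraTS2021")
dataset_dir.mkdir(parents=True, exist_ok=True)

# One file per modality, distinguished by the 4-digit channel suffix:
#   imagesTr/BraTS2021_00000_0000.nii.gz   -> channel 0 (e.g. T1)
#   imagesTr/BraTS2021_00000_0001.nii.gz   -> channel 1 (e.g. T1ce)
#   imagesTr/BraTS2021_00000_0002.nii.gz   -> channel 2 (e.g. T2)
#   imagesTr/BraTS2021_00000_0003.nii.gz   -> channel 3 (e.g. FLAIR)
#   labelsTr/BraTS2021_00000.nii.gz        -> segmentation, no channel suffix

dataset_json = {
    "channel_names": {"0": "T1", "1": "T1ce", "2": "T2", "3": "FLAIR"},
    "labels": {"background": 0, "edema": 1, "non_enhancing": 2, "enhancing": 3},
    "numTraining": 1251,  # must match the number of cases in imagesTr
    "file_ending": ".nii.gz",
}
(dataset_dir / "dataset.json").write_text(json.dumps(dataset_json, indent=2))
```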

jeb2112 commented 1 year ago

Is it possible to train the U-Net on BraTS using only two of the four input channels? I ran training sessions with only T1ce and FLAIR in the imagesTr directory, and training completed without error. But when trying to make a prediction on my test data, all four input channels seem to be required in the imagesTs directory, even though the model was trained with two.

saikat-roy commented 1 year ago

@jeb2112 nnUNet can be customized to the number of input channels that you want to use on any dataset. Can you check that both your training and test data were converted to use the same number of channels and that this is reflected correctly in the dataset.json?

If you don't spot anything wrong there, please share the error log from predicting on your test set.
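
For example, a quick consistency check along these lines (a hypothetical helper, not part of nnUNet) will catch a mismatch between dataset.json and the image folders before training or predicting:

```python
# Hypothetical sanity check: the number of channels declared in dataset.json
# must match the number of *_XXXX image files present for every case.
import json
from collections import defaultdict
from pathlib import Path

def check_channels(dataset_dir: str, image_folder: str) -> None:
    root = Path(dataset_dir)
    meta = json.loads((root / "dataset.json").read_text())
    n_expected = len(meta["channel_names"])
    ending = meta.get("file_ending", ".nii.gz")

    per_case = defaultdict(int)
    for f in (root / image_folder).glob(f"*{ending}"):
        case_id = f.name[: -len(ending)].rsplit("_", 1)[0]  # strip the _0000 suffix
        per_case[case_id] += 1

    for case_id, n_found in sorted(per_case.items()):
        if n_found != n_expected:
            print(f"{case_id}: found {n_found} channel(s), dataset.json declares {n_expected}")

check_channels("nnUNet_raw/Dataset137_BraTS2021", "imagesTr")
check_channels("nnUNet_raw/Dataset137_BraTS2021", "imagesTs")
```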

jeb2112 commented 1 year ago

OK, it might be something about how I used the dataset.json. I did not edit it to remove the two unused channels for training; I only removed the unused channel images from imagesTr. I was prepared to edit dataset.json, but since I wasn't seeing any errors and training appeared to complete normally, I left it alone. Then at test time I got a long trace of errors, and this one was the clue:

RuntimeError: Given groups=1, weight of size [32, 4, 3, 3, 3], expected input[1, 2, 128, 160, 112] to have 4 channels, but got 2 channels instead.

I resolved that by restoring the unused channel images to the imagesTs directory. All training and test data were converted with the script noted above. However, I didn't notice any parameter in that script relating to the number of channels; the channels are hard-coded, so I could have edited the script to convert only two, but instead I converted all four and then removed the ones I didn't want.

So I think the solution is just to edit dataset.json at the start of the training run to match the images that are present in imagesTr? If so, thanks! I'll do that on my next iteration.
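
Here is a rough sketch of what I mean, in case it helps anyone else (the paths and which channel suffixes the two kept modalities originally had are assumptions; adjust them to your own conversion). The kept images need contiguous channel suffixes starting at _0000, and channel_names in dataset.json must list exactly those channels:

```python
# Sketch: trim an already-converted 4-channel BraTS dataset down to T1ce + FLAIR.
# Assumes T1ce and FLAIR were channels 1 and 3 in the original conversion.
import json
from pathlib import Path

dataset_dir = Path("nnUNet_raw/Dataset137_BraTS2021")
keep = {"0001": "0000", "0003": "0001"}  # old channel suffix -> new channel suffix

for folder in ("imagesTr", "imagesTs"):
    for f in sorted((dataset_dir / folder).glob("*.nii.gz")):
        case, old_suffix = f.name[: -len(".nii.gz")].rsplit("_", 1)
        if old_suffix in keep:
            f.rename(f.with_name(f"{case}_{keep[old_suffix]}.nii.gz"))
        else:
            f.unlink()  # drop the modalities that are not used

# dataset.json must describe exactly the channels that are left on disk
meta = json.loads((dataset_dir / "dataset.json").read_text())
meta["channel_names"] = {"0": "T1ce", "1": "FLAIR"}
(dataset_dir / "dataset.json").write_text(json.dumps(meta, indent=2))
```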

saikat-roy commented 1 year ago

@jeb2112 Maybe we are talking about the same thing, but I'll mention it just in case: if I were you, I would create the dataset.json with 2 channels already during dataset conversion, as Fabian pointed out above, and not edit it after creation - fewer chances for errors.
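
Something along these lines, for example. The source folder layout and file names are assumptions about the original BraTS download, the label mapping should be taken from the original conversion script (which also remaps BraTS label 4 to 3) rather than from this sketch, and the generate_dataset_json call is best double-checked against your nnUNet version:

```python
# Sketch of a 2-channel (T1ce + FLAIR) conversion, following the pattern of
# Dataset137_BraTS21.py but copying only the two wanted modalities.
import shutil
from pathlib import Path

from nnunetv2.dataset_conversion.generate_dataset_json import generate_dataset_json

brats_dir = Path("/data/BraTS2021_TrainingData")        # assumed source layout
out_dir = Path("nnUNet_raw/Dataset138_BraTS2021_2ch")   # hypothetical 2-channel dataset
(out_dir / "imagesTr").mkdir(parents=True, exist_ok=True)
(out_dir / "labelsTr").mkdir(parents=True, exist_ok=True)

cases = sorted(p.name for p in brats_dir.iterdir() if p.is_dir())
for case in cases:
    shutil.copy(brats_dir / case / f"{case}_t1ce.nii.gz", out_dir / "imagesTr" / f"{case}_0000.nii.gz")
    shutil.copy(brats_dir / case / f"{case}_flair.nii.gz", out_dir / "imagesTr" / f"{case}_0001.nii.gz")
    # NOTE: the original script converts the BraTS label values (4 -> 3) instead
    # of copying the segmentation verbatim; reuse that logic here.
    shutil.copy(brats_dir / case / f"{case}_seg.nii.gz", out_dir / "labelsTr" / f"{case}.nii.gz")

generate_dataset_json(
    str(out_dir),
    channel_names={0: "T1ce", 1: "FLAIR"},
    labels={"background": 0, "edema": 1, "non_enhancing": 2, "enhancing": 3},  # illustrative
    num_training_cases=len(cases),
    file_ending=".nii.gz",
)
```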

jeb2112 commented 1 year ago

Got it! Thanks again.