MIC-DKFZ / nnUNet

Region based training with more than one cluster #1952

Closed KanielDatz closed 7 months ago

KanielDatz commented 8 months ago

First of all, thank you for your effort. I'm a 5th-year medicine student, and nnUNet has helped me a lot on several of my projects.

I'm trying to utilize nnUNet for a case where I have five different elements that might overlap each other. Currently, I have made a dataset with nine labels: one for each element and four more for the common overlap combinations. This setup works pretty well, and after 4500 epochs, it achieves a Dice score of 0.73 on the test set, which is good for this dataset.

I discovered the option to use region-based training, and I want to try it, but I'm not sure how to build the dataset.json. I tried what @coendevente explained in his multilabel 'workaround' in issue #653, but I'm not sure how to represent several overlapping objects in the JSON file.

I've done preprocessing and 1200 epochs of training with the following JSON:

{
    "channel_names": {
        "0": "channel0"
    },
    "labels": {
        "background": 0,
        "catheter": [1, 6, 7, 9],
        "deflated_device": 2,
        "inflated_device": 3,
        "vessel": [4, 6, 8, 9],
        "wire": [5, 7, 8, 9]
    },
    "regions_class_order": [[1, 6, 7, 9], 2, 3, [4, 6, 8, 9], [5, 7, 8, 9]],
    "numTraining": 20676,
    "file_ending": ".bmp",
    "name": "Dataset006_region_class"
}

During training, I achieve up to 0.8 Dice on the validation set. However, when calling nnUNetv2_predict, the prediction runs as normal, but after it finishes with the last file, it encounters an error and stops. No files are saved, and I get the following error:


...
...
Predicting 0101.00.13.0030:
perform_everything_on_gpu: True
Prediction done, transferring to CPU if needed
prediction saved
sending off prediction to background worker for resampling and export
done with 0101.00.13.0030

Predicting 0101.00.13.0045:
perform_everything_on_gpu: True
Prediction done, transferring to CPU if needed
prediction saved
sending off prediction to background worker for resampling and export
done with 0101.00.13.0045
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "C:\Users\cathalert\.conda\envs\medseg\lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\cathalert\.conda\envs\medseg\lib\multiprocessing\pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "C:\Users\cathalert\.conda\envs\medseg\lib\site-packages\nnunetv2\inference\export_prediction.py", line 39, in export_prediction_from_softmax
    segmentation = label_manager.convert_logits_to_segmentation(predicted_array_or_file)
  File "C:\Users\cathalert\.conda\envs\medseg\lib\site-packages\nnunetv2\utilities\label_handling\label_handling.py", line 182, in convert_logits_to_segmentation
    return self.convert_probabilities_to_segmentation(probabilities)
  File "C:\Users\cathalert\.conda\envs\medseg\lib\site-packages\nnunetv2\utilities\label_handling\label_handling.py", line 173, in convert_probabilities_to_segmentation
    segmentation[predicted_probabilities[i] > 0.5] = c
ValueError: NumPy boolean array indexing assignment cannot assign 4 input values to the 0 output values where the mask is true
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:\Users\cathalert\.conda\envs\medseg\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\cathalert\.conda\envs\medseg\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\cathalert\.conda\envs\medseg\Scripts\nnUNetv2_predict.exe\__main__.py", line 7, in <module>
  File "C:\Users\cathalert\.conda\envs\medseg\lib\site-packages\nnunetv2\inference\predict_from_raw_data.py", line 533, in predict_entry_point
    predict_from_raw_data(args.i,
  File "C:\Users\cathalert\.conda\envs\medseg\lib\site-packages\nnunetv2\inference\predict_from_raw_data.py", line 355, in predict_from_raw_data
    [i.get() for i in r]
  File "C:\Users\cathalert\.conda\envs\medseg\lib\site-packages\nnunetv2\inference\predict_from_raw_data.py", line 355, in <listcomp>
    [i.get() for i in r]
  File "C:\Users\cathalert\.conda\envs\medseg\lib\multiprocessing\pool.py", line 771, in get
    raise self._value
ValueError: NumPy boolean array indexing assignment cannot assign 4 input values to the 0 output values where the mask is true
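
For what it's worth, the ValueError itself is easy to reproduce in plain NumPy: boolean-mask assignment accepts a single scalar label, but not a whole list like the nested entries in my regions_class_order. A minimal snippet (illustration only, not nnUNet code):

import numpy as np

segmentation = np.zeros((4, 4), dtype=np.uint8)
empty_mask = np.zeros((4, 4), dtype=bool)   # no pixel selected

segmentation[empty_mask] = 2             # fine: a single label broadcasts
segmentation[empty_mask] = [1, 6, 7, 9]  # ValueError: cannot assign 4 input
                                         # values to the 0 output values
                                         # where the mask is true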

What I did was manually save the logits files just after "prediction = prediction.to('cpu').numpy()" in the predictor class.

Before I start changing your code, are you familiar with what could be causing this error? Also, could you please instruct me on how the dataset.json file should look in my case?

FabianIsensee commented 8 months ago

Hey Daniel, the problem arises from "regions_class_order". It is supposed to be a list of integer values that mirrors the regions defined in 'labels'; essentially it answers the question 'which label should I replace each region with, and in what order?'. Maybe it's easier to explain with the BraTS dataset, because it more closely reflects the intended purpose of regions: hierarchical labels. In BraTS we have the classes 1='edema', 2='necrosis' and 3='enhancing tumor'. The labels look like this:

    "labels": {
        "background": 0,
        "whole tumor": [1, 2, 3],
        "tumor core": [2, 3],
        "enhancing tumor": 3,
    },
"regions_class_order": [1, 2, 3],

The three entries in regions_class_order correspond to the three regions defined in "labels", in the order they are defined there. What nnU-Net will do is take the 'whole tumor' region and replace all of its pixels with 1. Then it takes the 'tumor core' region and replaces all of its pixels with 2, and finally it sets all 'enhancing tumor' predictions to 3. So with regions_class_order you define in what order which semantic label needs to be placed in order to reconstruct the original segmentation. Previous values are always overwritten (so enhancing tumor overwrites tumor core, which overwrites whole tumor).

Maybe this helps as well: https://github.com/MIC-DKFZ/nnUNet/blob/master/documentation/region_based_training.md

Best, Fabian
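
(For illustration, the conversion described above amounts to roughly the following; this is a minimal sketch, not nnU-Net's exact implementation, and the function name is made up:)

import numpy as np

# probabilities: array of shape (num_regions, H, W), one sigmoid map per
# region, ordered as the regions appear under "labels" (background excluded).
# regions_class_order: one integer per region, e.g. [1, 2, 3] for BraTS.
def regions_to_segmentation(probabilities, regions_class_order, threshold=0.5):
    segmentation = np.zeros(probabilities.shape[1:], dtype=np.uint8)
    for i, label in enumerate(regions_class_order):
        # each entry must be a single integer; later regions overwrite
        # earlier ones (enhancing tumor overwrites tumor core, etc.)
        segmentation[probabilities[i] > threshold] = label
    return segmentation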

KanielDatz commented 8 months ago

So, if I understood you correctly, in my case it should look like this:

{
    "channel_names": {
        "0": "channel0"
    },
    "labels": {
        "background": 0,
        "catheter": [1, 6, 7, 9],
        "deflated_device": 2,
        "inflated_device": 3,
        "vessel": [4, 6, 8, 9],
        "wire": [5, 7, 8, 9]
    },
    "regions_class_order": [1, 2, 3, 4, 5],
    "numTraining": 20676,
    "file_ending": ".bmp",
    "name": "Dataset006_region_class"
}

So at first it will mark [1, 6, 7, 9] as 1, then 2 and 3 as themselves, [4, 6, 8, 9] as 4, etc. Should I ignore the background (0) label?

I probably need to rerun the training from scratch with the right JSON, but in the meantime I tried changing it in the nnUNet_results folder of the already trained model and got:

Predicting 0001.00.22.0015:
perform_everything_on_gpu: True
output is either too large for python process-process communication or all export workers are busy. Saving temporarily to file...

I will update when I finish training with the right JSON. Thanks again!

------ edit:

Thought about it a little more. If I understand you correctly: as in the regular nnUNet mode, the output (from 2D input) will be a single 2D label image, so I need the overlapping labels to be written last in regions_class_order so that they end up on top. A quick sanity check is below.
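
Reusing the illustrative regions_to_segmentation sketch from above: a pixel carrying the old overlap label 6 belongs to both the catheter region [1, 6, 7, 9] and the vessel region [4, 6, 8, 9], so it is first written as 1 and then overwritten with 4:

import numpy as np

probs = np.zeros((5, 1, 1), dtype=np.float32)
probs[0] = 0.9  # catheter region fires
probs[3] = 0.9  # vessel region fires too (old overlap label 6)

seg = regions_to_segmentation(probs, regions_class_order=[1, 2, 3, 4, 5])
print(seg)  # [[4]] -> the region listed later in regions_class_order wins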

FabianIsensee commented 8 months ago

Hey, yes, exactly. Is everything working now as it should? You don't need to retrain your model; replacing the dataset.json is enough. regions_class_order is only used when exporting images during the final validation and in nnUNetv2_predict; it is not used during training.