MIC-DKFZ / nnUNet

Mixed precision error during prediction #312

Closed perecanals closed 4 years ago

perecanals commented 4 years ago

Hi again Fabian,

I got this error when doing inference on some (I suspect large) images and I think it might be a bug:


```
Fabian Isensee, Paul F. Jäger, Simon A. A. Kohl, Jens Petersen, Klaus H. Maier-Hein "Automated Design of Deep Learning Methods for Biomedical Image Segmentation" arXiv preprint arXiv:1904.08128 (2020).
If you have questions or suggestions, feel free to open an issue at https://github.com/MIC-DKFZ/nnUNet

using model stored in  /content/drive/My Drive/TFM/nnunet_env/nnUNet/nnunet/nnUNet_base/nnUNet_training_output_dir/nnUNet/3d_lowres/Task100_grid/nnUNetTrainerV2__nnUNetPlansv2.1
This model expects 1 input modalities for each image
Found 1 unique case ids, here are some examples: ['11414155']
If they don't look right, make sure to double check your filenames. They must end with _0000.nii.gz etc
number of cases: 1
number of cases that still need to be predicted: 1
emptying cuda cache
loading parameters for folds, [3]
using the following model files:  ['/content/drive/My Drive/TFM/nnunet_env/nnUNet/nnunet/nnUNet_base/nnUNet_training_output_dir/nnUNet/3d_lowres/Task100_grid/nnUNetTrainerV2__nnUNetPlansv2.1/fold_3/model_final_checkpoint.model']
starting preprocessing generator
starting prediction...
preprocessing /content/drive/My Drive/TFM/nnunet_env/nnUNet/nnunet/inference_test/output_nnUNetTrainerV2_Adam_fold_2/11414155.nii.gz
using preprocessor GenericPreprocessor
before crop: (1, 828, 512, 512) after crop: (1, 828, 512, 512) spacing: [0.4000001 0.46875   0.46875  ]

no separate z, order 3
no separate z, order 1
before: {'spacing': array([0.4000001, 0.46875  , 0.46875  ]), 'spacing_transposed': array([0.46875  , 0.4000001, 0.46875  ]), 'data.shape (data is transposed)': (1, 512, 828, 512)}
after:  {'spacing': array([1.23372493, 1.14846823, 1.23372493]), 'data.shape (data is resampled)': (1, 195, 288, 195)}

(1, 195, 288, 195)
This worker has ended successfully, no errors to report
predicting /content/drive/My Drive/TFM/nnunet_env/nnUNet/nnunet/inference_test/output_nnUNetTrainerV2_Adam_fold_2/11414155.nii.gz
Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Traceback (most recent call last):
  File "/usr/local/bin/nnUNet_predict", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/nnunet/inference/predict_simple.py", line 221, in main
    step_size=step_size, checkpoint_name=args.chk)
  File "/usr/local/lib/python3.6/dist-packages/nnunet/inference/predict.py", line 631, in predict_from_folder
    segmentation_export_kwargs=segmentation_export_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/nnunet/inference/predict.py", line 217, in predict_cases
    mixed_precision=mixed_precision)[1][None])
TypeError: predict_preprocessed_data_return_seg_and_softmax() got an unexpected keyword argument 'mixed_precision'
```

FabianIsensee commented 4 years ago

Hi, can you please `pip install --upgrade nnunet` and see if it works? Best, Fabian
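
When an upgrade seems to have no effect, it can help to confirm which installation the interpreter actually picks up. The following is a minimal sketch using only the standard library. It assumes Python 3.8 or newer (for `importlib.metadata`) and that the distribution is named `nnunet`, as in the pip command above. `describe_install` is a made-up helper name, not part of nnU-Net.

```python
from importlib import metadata, util


def describe_install(dist_name="nnunet"):
    """Return (version, location) for a package, or None for whatever is missing.

    Hypothetical helper for illustration only; not part of nnU-Net.
    """
    try:
        # Version recorded by pip for the installed distribution.
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    # File that `import <dist_name>` would load, if the package is importable.
    spec = util.find_spec(dist_name)
    location = spec.origin if spec is not None else None
    return version, location


if __name__ == "__main__":
    version, location = describe_install()
    print("installed version:", version)
    print("imported from:", location)
```

If the reported location is not the copy you expected (for example a source checkout versus `dist-packages`), the upgrade may have gone to a different installation than the one being run.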

perecanals commented 4 years ago

Thank you for your attention. Unfortunately, I did and I'm still getting the same error. Any idea on what's happening?

FabianIsensee commented 4 years ago

Hi there, are you certain you ran `pip install --upgrade nnunet`? What is the text output of this command? Are you using a custom trainer class? If so, then you need to reimplement it to comply with the new interface. Your log output suggests that you are still using apex. I have removed apex and use PyTorch AMP now. Changing your code should be a piece of cake. Best, Fabian
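
The traceback above shows `predict_cases` passing `mixed_precision=...` to `predict_preprocessed_data_return_seg_and_softmax`, so a custom trainer that overrides this method with an older signature would raise exactly this `TypeError`. Here is a minimal sketch of that situation and one possible fix, using stand-in classes only. Nothing below imports nnU-Net; `BaseTrainer`, `OldCustomTrainer`, `FixedCustomTrainer`, `accepts_keyword` and `call_like_predict_cases` are hypothetical names, and the real method takes more parameters than shown.

```python
import inspect


class BaseTrainer:
    """Stand-in for an upgraded library trainer (hypothetical, simplified)."""

    def predict_preprocessed_data_return_seg_and_softmax(self, data, mixed_precision=True):
        return "seg", "softmax(mixed_precision=%s)" % mixed_precision


class OldCustomTrainer(BaseTrainer):
    """Override written against the old interface: no `mixed_precision`."""

    def predict_preprocessed_data_return_seg_and_softmax(self, data):
        return "seg", "softmax"


class FixedCustomTrainer(BaseTrainer):
    """Override updated to accept the new keyword and forward it."""

    def predict_preprocessed_data_return_seg_and_softmax(self, data, mixed_precision=True):
        return super().predict_preprocessed_data_return_seg_and_softmax(
            data, mixed_precision=mixed_precision)


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        return params[name].kind is not inspect.Parameter.POSITIONAL_ONLY
    # A **kwargs parameter also accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def call_like_predict_cases(trainer):
    """Mimic the call that appears in the traceback."""
    return trainer.predict_preprocessed_data_return_seg_and_softmax(
        "data", mixed_precision=True)[1]
```

Running `accepts_keyword` on your own trainer's method before starting inference is a quick way to spot an outdated override. Calling `call_like_predict_cases(OldCustomTrainer())` reproduces the `TypeError`, while the fixed class runs through.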

perecanals commented 4 years ago

Ah, that must be it then! Let me change the code around and see if it works. Thanks!

perecanals commented 4 years ago

That was it! Smooth as butter, thank you Fabian!