Closed jordandekraker closed 3 years ago
Ah no, not broken. I forgot to mention that you'll have to uninstall nnunet and reinstall hippunfold, since it relies on a different fork of nnunet (already listed in setup.py).
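To confirm which nnunet ended up in the venv after reinstalling, something like the sketch below might help. It assumes Python 3.8+ (for `importlib.metadata`; the venv in the log below is 3.7, where the `importlib_metadata` backport would be needed) and that pip recorded the install source in `direct_url.json`, which it only does for URL/VCS installs. The helper name `describe_distribution` is made up for illustration.

```python
from importlib import metadata  # standard library in Python 3.8+


def describe_distribution(name):
    """Return (version, direct_url.json text or None) for an installed
    distribution, or None if the distribution is not installed."""
    try:
        dist = metadata.distribution(name)
    except metadata.PackageNotFoundError:
        return None
    # For URL/VCS installs pip writes direct_url.json (PEP 610);
    # read_text returns None when the file is absent (e.g. a PyPI install).
    return dist.version, dist.read_text("direct_url.json")


for pkg in ("nnunet", "hippunfold"):
    print(pkg, describe_distribution(pkg))
```

If nnunet shows no `direct_url.json` entry, it was probably installed from PyPI rather than from the fork, though that is only a hint and not a definitive check.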
On Mon., Feb. 8, 2021, 6:05 p.m. jordandekraker, notifications@github.com wrote:
I notice the cpu_inference branch is still open, but commits were also merged to the master branch. Not sure if that was the intended behaviour, but for now this seems to be breaking the master branch. The same command works perfectly fine with the --use_gpu flag added.
Command:
hippunfold BigBrain/bidsT1 BigBrain/bidsT1_unfolded participant --modality T1w --skip_preproc --skip_coreg --profile cc-slurm
Relevant info from slurm log file:
rule run_inference:
    input: work/sub-001/anat/sub-001_hemi-R_space-corobl_desc-cropped_T1w.nii.gz, /home/jdekrake/.cache/hippunfold/trained_model.3d_fullres.Task101_hcp1200_T1w.nnUNetTrainerV2.model_best.tar
    output: work/sub-001/seg_T1w/sub-001_hemi-R_space-corobl_desc-nnunet_dseg.nii.gz
    jobid: 6
    wildcards: subject=001, modality=T1w, hemi=R
    threads: 16
    resources: mem_mb=32000, gpus=0, time=60
Job counts:
    count jobs
    1 run_inference
    1
'work/sub-001/anat/sub-001_hemi-R_space-corobl_desc-cropped_T1w.nii.gz' -> 'tempimg/temp_0000.nii.gz'
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_3/
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_3/debug.json
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_3/progress.png
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_3/model_latest.model.pkl
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_3/training_log_2021_1_26_13_22_01.txt
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_3/model_best.model.pkl
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_3/model_latest.model
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_3/model_best.model
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/plans.pkl
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_2/
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_2/debug.json
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_2/training_log_2021_1_26_13_20_12.txt
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_2/progress.png
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_2/model_latest.model.pkl
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_2/model_best.model.pkl
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_2/model_latest.model
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_2/model_best.model
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_0/
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_0/debug.json
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_0/progress.png
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_0/model_latest.model.pkl
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_0/model_best.model.pkl
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_0/model_latest.model
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_0/model_best.model
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_0/training_log_2021_1_26_13_44_33.txt
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_4/
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_4/debug.json
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_4/progress.png
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_4/model_latest.model.pkl
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_4/model_best.model.pkl
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_4/model_latest.model
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_4/model_best.model
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_4/training_log_2021_1_26_13_43_41.txt
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_1/
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_1/debug.json
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_1/training_log_2021_1_26_13_20_12.txt
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_1/progress.png
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_1/model_latest.model.pkl
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_1/model_best.model.pkl
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_1/model_latest.model
nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_1/model_best.model
Please cite the following paper when using nnUNet:
Isensee, F., Jaeger, P.F., Kohl, S.A.A. et al. "nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation." Nat Methods (2020). https://doi.org/10.1038/s41592-020-01008-z
If you have questions or suggestions, feel free to open an issue at https://github.com/MIC-DKFZ/nnUNet
nnUNet_raw_data_base is not defined and nnU-Net can only be used on data for which preprocessed files are already present on your system. nnU-Net cannot be used for experiment planning and preprocessing like this. If this is not intended, please read nnunet/paths.md for information on how to set this up properly.
nnUNet_preprocessed is not defined and nnU-Net can not be used for preprocessing or training. If this is not intended, please read nnunet/pathy.md for information on how to set this up.
using model stored in tempmodel/nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1
This model expects 1 input modalities for each image
Found 1 unique case ids, here are some examples: ['temp']
If they don't look right, make sure to double check your filenames. They must end with _0000.nii.gz etc
number of cases: 1
number of cases that still need to be predicted: 1
emptying cuda cache
loading parameters for folds, None
folds is None so we will automatically look for output folders (not using 'all'!)
found the following folds: ['tempmodel/nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_0', 'tempmodel/nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_1', 'tempmodel/nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_2', 'tempmodel/nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_3', 'tempmodel/nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_4']
using the following model files: ['tempmodel/nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_0/model_best.model', 'tempmodel/nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_1/model_best.model', 'tempmodel/nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_2/model_best.model', 'tempmodel/nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_3/model_best.model', 'tempmodel/nnUNet/3d_fullres/Task101_hcp1200_T1w/nnUNetTrainerV2__nnUNetPlansv2.1/fold_4/model_best.model']
starting preprocessing generator
starting prediction...
preprocessing templbl/temp.nii.gz
using preprocessor GenericPreprocessor
before crop: (1, 128, 256, 128) after crop: (1, 128, 256, 128) spacing: [0.30000001 0.30000001 0.30000001]
no resampling necessary
no resampling necessary
before: {'spacing': array([0.30000001, 0.30000001, 0.30000001]), 'spacing_transposed': array([0.30000001, 0.30000001, 0.30000001]), 'data.shape (data is transposed)': (1, 128, 256, 128)}
after: {'spacing': array([0.30000001, 0.30000001, 0.30000001]), 'data.shape (data is resampled)': (1, 128, 256, 128)}
(1, 128, 256, 128)
This worker has ended successfully, no errors to report
predicting templbl/temp.nii.gz
Traceback (most recent call last):
File "/scratch/jdekrake/hippdev/venv/bin/nnUNet_predict", line 8, in <module>
sys.exit(main())
File "/scratch/jdekrake/hippdev/venv/lib/python3.7/site-packages/nnunet/inference/predict_simple.py", line 221, in main
step_size=step_size, checkpoint_name=args.chk)
File "/scratch/jdekrake/hippdev/venv/lib/python3.7/site-packages/nnunet/inference/predict.py", line 634, in predict_from_folder
segmentation_export_kwargs=segmentation_export_kwargs)
File "/scratch/jdekrake/hippdev/venv/lib/python3.7/site-packages/nnunet/inference/predict.py", line 214, in predict_cases
trainer.load_checkpoint_ram(p, False)
File "/scratch/jdekrake/hippdev/venv/lib/python3.7/site-packages/nnunet/training/network_training/network_trainer.py", line 356, in load_checkpoint_ram
self.amp_grad_scaler.load_state_dict(checkpoint['amp_grad_scaler'])
AttributeError: 'NoneType' object has no attribute 'load_state_dict'
[Mon Feb 8 17:21:20 2021]
Error in rule run_inference:
    jobid: 0
    output: work/sub-001/seg_T1w/sub-001_hemi-R_space-corobl_desc-nnunet_dseg.nii.gz
    shell:
        mkdir -p tempmodel tempimg templbl && cp -v work/sub-001/anat/sub-001_hemi-R_space-corobl_desc-cropped_T1w.nii.gz tempimg/temp_0000.nii.gz && tar -xvf /home/jdekrake/.cache/hippunfold/trained_model.3d_fullres.Task101_hcp1200_T1w.nnUNetTrainerV2.model_best.tar -C tempmodel && export RESULTS_FOLDER=tempmodel && export nnUNet_n_proc_DA=16 && nnUNet_predict -i tempimg -o templbl -t Task101_hcp1200_T1w -chk model_best --disable_tta && cp -v templbl/temp.nii.gz work/sub-001/seg_T1w/sub-001_hemi-R_space-corobl_desc-nnunet_dseg.nii.gz
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
Exiting because a job execution failed. Look above for error message
HIPPUNFOLD_CACHE_DIR not defined, using default location
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
HIPPUNFOLD_CACHE_DIR not defined, using default location
HIPPUNFOLD_CACHE_DIR not defined, using default location
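For context on the AttributeError in the traceback above: the checkpoint contains an 'amp_grad_scaler' entry, but the trainer's `amp_grad_scaler` attribute is None when the state is loaded. One plausible reading (not confirmed in this thread) is that the scaler is never created on the CPU code path. The sketch below shows the general shape of a defensive load using stand-in classes; `DummyScaler`, `DummyTrainer` and `restore_grad_scaler` are hypothetical names, and this is not the actual patch in the nnunet fork.

```python
class DummyScaler:
    """Stand-in for a gradient scaler such as torch.cuda.amp.GradScaler."""

    def __init__(self):
        self.state = None

    def load_state_dict(self, state):
        self.state = dict(state)


class DummyTrainer:
    """Hypothetical trainer: the scaler only exists when mixed precision is on."""

    def __init__(self, use_amp):
        self.amp_grad_scaler = DummyScaler() if use_amp else None


def restore_grad_scaler(trainer, checkpoint):
    """Load the saved scaler state only if the trainer has a scaler and the
    checkpoint has a state for it. Returns True if something was loaded."""
    scaler = getattr(trainer, "amp_grad_scaler", None)
    state = checkpoint.get("amp_grad_scaler")
    if scaler is None or state is None:
        return False
    scaler.load_state_dict(state)
    return True


checkpoint = {"amp_grad_scaler": {"scale": 65536.0}}
print(restore_grad_scaler(DummyTrainer(use_amp=True), checkpoint))   # True
print(restore_grad_scaler(DummyTrainer(use_amp=False), checkpoint))  # False
```

Skipping the load when there is no scaler is harmless for inference, since the scaler state only matters when training resumes with mixed precision.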
Working now with a clean venv.
However, when I run the pipeline with --modality segT2w, I get the following error:
hippunfold Manual/Maastricht/bids/ unfolding/Maastricht_manualseg participant --modality segT2w --profile cc-slurm
Building DAG of jobs...
InputFunctionException in line 48 of /scratch/jdekrake/hippdev/hippunfold/hippunfold/workflow/rules/nnunet.smk:
Error:
ValueError: modality not supported for nnunet!
Wildcards:
subject=05
modality=segT2w
hemi=Lflip
Traceback:
File "/scratch/jdekrake/hippdev/hippunfold/hippunfold/workflow/rules/nnunet.smk", line 10, in get_nnunet_input
Traceback (most recent call last):
File "/scratch/jdekrake/hippdev/venv/bin/hippunfold", line 33, in <module>
sys.exit(load_entry_point('hippunfold', 'console_scripts', 'hippunfold')())
File "/scratch/jdekrake/hippdev/hippunfold/hippunfold/run.py", line 14, in main
app.run_snakemake()
File "/scratch/jdekrake/hippdev/venv/lib/python3.7/site-packages/snakebids/app.py", line 209, in run_snakemake
run(snakemake_cmd)
File "/scratch/jdekrake/hippdev/venv/lib/python3.7/site-packages/snakebids/app.py", line 41, in run
raise Exception("Non zero return code: %d"%process.returncode)
Exception: Non zero return code: 1
nnunet does not need to be run since I'm trying to specify a manually labelled hippocampus. Any ideas?
Yeah, I haven't kept the manual seg workflows going through these updates, but that should be easily fixed I think. I'll fix it in your #29 PR.
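As a rough illustration of the kind of fix being discussed (not the actual change to nnunet.smk), the input function could decide up front whether a modality needs nnunet at all. The sketch assumes that modalities prefixed with "seg" carry a manual segmentation and that T1w and T2w are the nnunet-supported ones; both are assumptions read off the commands in this thread, and the function names are hypothetical.

```python
def needs_nnunet(modality):
    """Assumption: a modality named 'seg<X>' already provides a manual
    segmentation, so inference can be skipped for it."""
    return not modality.startswith("seg")


def get_segmentation_source(modality, supported=("T1w", "T2w")):
    """Return 'manual' or 'nnunet', raising only for modalities that
    need nnunet but have no trained model (hypothetical 'supported' list)."""
    if not needs_nnunet(modality):
        return "manual"
    if modality in supported:
        return "nnunet"
    raise ValueError(f"modality {modality!r} not supported for nnunet!")


print(get_segmentation_source("T1w"))     # nnunet
print(get_segmentation_source("segT2w"))  # manual
```

With a check like this, the ValueError would only be raised for modalities that genuinely require inference, so building the DAG for segT2w would not reach the nnunet input function.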
Sounds great, thanks. I'll add the Zenodo model files to that PR too, so don't merge yet!
I noticed another bug in one of your edits on that branch. I think to keep things clean we should have a separate branch for each feature, so I'll make another branch for the manual seg fix, and another issue too (since this is no longer a cpu/gpu issue).
Closing this and moving the manual seg discussion to open issue #11