himashi92 / VT-UNet

[MICCAI2022] This is an official PyTorch implementation for A Robust Volumetric Transformer for Accurate 3D Tumor Segmentation
MIT License
Issue with training #23

Closed. Chadkowski closed this issue 2 years ago

Chadkowski commented 2 years ago

Hi! I am trying to train VT-UNet, but the process does not start in the background with nohup. Any idea why?
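For reference, a similar effect to nohup can be had from Python by starting the command in its own session and redirecting its output to a log file. This is only a sketch under assumptions: `launch_detached`, the log file name, and the placeholder command are hypothetical and not part of VT-UNet, and `start_new_session` only takes effect on POSIX systems.

```python
import subprocess
import sys


def launch_detached(command, log_path):
    """Start `command` in its own session and append its output to `log_path`."""
    log_file = open(log_path, "ab")
    try:
        process = subprocess.Popen(
            command,
            stdin=subprocess.DEVNULL,
            stdout=log_file,
            stderr=subprocess.STDOUT,
            start_new_session=True,  # POSIX only: detach from the terminal's session
        )
    finally:
        log_file.close()  # the child process keeps its own copy of the descriptor
    return process


if __name__ == "__main__":
    # Placeholder command; substitute the actual training invocation here.
    proc = launch_detached(
        [sys.executable, "-c", "print('training placeholder')"], "train.log"
    )
    print("started process", proc.pid)
```

If the log file stays empty or the process exits immediately, the log usually shows the reason, which is easier to diagnose than a silent nohup failure.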

himashi92 commented 2 years ago

Hi! Did you try with sudo?

Chadkowski commented 2 years ago

Hi! Yes. I think a directory is hardcoded in one of the files, which might be causing the issue. I'll try running everything from /home/ and let you know how it goes.

Chadkowski commented 2 years ago

I'm now getting this error whenever I try to preprocess data. My environment variables are set and can be read back with echo $var_name:

vtunet_raw_data_base is not defined and nnU-Net can only be used on data for which preprocessed files are already present on your system. nnU-Net cannot be used for experiment planning and preprocessing like this. If this is not intended, please read documentation/setting_up_paths.md for information on how to set this up properly.
vtunet_preprocessed is not defined and nnU-Net can not be used for preprocessing or training. If this is not intended, please read documentation/setting_up_paths.md for information on how to set this up.
RESULTS_FOLDER_VTUNET is not defined and nnU-Net cannot be used for training or inference. If this is not intended behavior, please read documentation/setting_up_paths.md for information on how to set this up.
Traceback (most recent call last):
  File "/usr/bin/vtunet_convert_decathlon_task", line 33, in <module>
    sys.exit(load_entry_point('vtunet', 'console_scripts', 'vtunet_convert_decathlon_task')())
  File "/home/VTUNet/vtunet/experiment_planning/vtunet_convert_decathlon_task.py", line 60, in main
    split_4d(args.i, args.p, args.output_task_id)
  File "/home/VTUNet/vtunet/experiment_planning/utils.py", line 53, in split_4d
    output_folder = join(vtunet_rawdata, "Task%03.0d" % overwrite_task_output_id + task_name)
  File "/usr/lib/python3.8/posixpath.py", line 76, in join
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType
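The final TypeError appears to follow from the three warnings: when a path variable is not defined, the value passed to `join` (here `vtunet_rawdata`) is presumably `None`, and `os.path.join` rejects it. A minimal sketch of that mechanism, assuming the lookup behaves like `os.environ.get`; `build_output_folder` and the variable name `EXAMPLE_UNSET_VTUNET_PATH` are hypothetical stand-ins, not code from the repository:

```python
import os
from os.path import join


def build_output_folder(base_dir, task_name):
    """Join a base directory and a task folder name, as the failing line does."""
    return join(base_dir, task_name)


# Hypothetical variable name, chosen so that it is certainly unset here.
base_dir = os.environ.get("EXAMPLE_UNSET_VTUNET_PATH")  # None when the variable is missing

try:
    build_output_folder(base_dir, "Task001_Example")
except TypeError as err:
    # Same kind of failure as in the traceback: join() received None.
    print("join failed:", err)
```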

Chadkowski commented 2 years ago

Solved: I set the environment variables in /etc/environment instead of in ~/.profile or ~/.bashrc, which is where the README says to put them.
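A possible reason the two locations behave differently: `echo $var_name` in an interactive shell also prints variables that were never exported, and a process started through sudo or a non-login shell may not read ~/.bashrc at all. The sketch below checks what a Python process actually sees. The three variable names come from the error messages above; the helper `missing_path_vars` is hypothetical.

```python
import os

# Variable names taken from the error messages quoted in this thread.
REQUIRED_VARS = ("vtunet_raw_data_base", "vtunet_preprocessed", "RESULTS_FOLDER_VTUNET")


def missing_path_vars(env=None):
    """Return the required variable names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_path_vars()
    if missing:
        print("Not visible to this process:", ", ".join(missing))
    else:
        print("All three path variables are visible to this process.")
```

Running this the same way the preprocessing command is run (same user, same shell, with or without sudo) shows whether the variables reach the process.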

Chadkowski commented 2 years ago

The question that remains is how to configure training. Whenever I run it, I think I'm running out of GPU memory. Any ideas?

himashi92 commented 2 years ago

Hi, can you reduce the batch size and run it again? Modify batch size in this file: https://github.com/himashi92/VT-UNet/blob/main/VTUNet/vtunet/run/default_configuration.py
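As an illustration of the idea (not the repository's own mechanism), the batch size can be halved until one training step fits in memory. In this sketch `largest_fitting_batch_size` and `fake_step` are hypothetical; in a real run, `run_step` would execute one forward and backward pass. It relies on PyTorch reporting CUDA out-of-memory errors as a `RuntimeError` (or subclass) whose message contains "out of memory".

```python
def largest_fitting_batch_size(run_step, start_batch_size):
    """Halve the batch size until `run_step(batch_size)` no longer runs out of memory."""
    batch_size = start_batch_size
    while batch_size >= 1:
        try:
            run_step(batch_size)
            return batch_size
        except RuntimeError as err:
            # Re-raise anything that is not an out-of-memory error.
            if "out of memory" not in str(err).lower():
                raise
        # With real PyTorch code, calling torch.cuda.empty_cache() here can help.
        batch_size //= 2
    raise RuntimeError("even batch size 1 does not fit in memory")


def fake_step(batch_size):
    """Simulated training step that only fits with a batch size of 1."""
    if batch_size > 1:
        raise RuntimeError("CUDA out of memory (simulated)")


if __name__ == "__main__":
    print(largest_fitting_batch_size(fake_step, 4))
```

The value found this way would then be written into default_configuration.py by hand, as suggested above.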

Chadkowski commented 2 years ago

> Hi, can you reduce the batch size and run it again? Modify batch size in this file: https://github.com/himashi92/VT-UNet/blob/main/VTUNet/vtunet/run/default_configuration.py

It should be OK now if I reduce it to "batch-size=1".

Thanks!

Chadkowski commented 2 years ago

Question: are the predetermined "Synapse" parameters in default_configuration.py defined for the Synapse multi-organ segmentation dataset?

@himashi92

himashi92 commented 2 years ago

Yes, the code was adapted from nnUNet and nnFormer. We haven't trained VT-UNet on those datasets yet, but you can try those configurations to train VT-UNet on the Synapse multi-organ dataset.