mlcommons / GaNDLF

A generalizable application framework for segmentation, regression, and classification using PyTorch
https://gandlf.org
Apache License 2.0

Patch divisibility logic fails when unique patch size value is same as check value #522

Closed smcch closed 1 year ago

smcch commented 1 year ago

GaNDLF Version 0.0.15

How did you install GaNDLF

git clone https://github.com/mlcommons/GaNDLF.git
cd GaNDLF
conda create -n venv_gandlf python=3.8 -y
conda activate venv_gandlf
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 torchaudio==0.11.0 --extra-index-url https://download.pytorch.org/whl/cu113
pip install -e .
python ./gandlf_verifyInstall

Dataset description

Radiology MRI 3D 5 sequences 3 labels

Describe your question/problem

I'm trying to run a segmentation task but I get the following error when starting the process:

Constructing queue for validation data: 100%|█████████████████████████████████████████████| 8/8 [00:00<00:00, 42.54it/s]
Traceback (most recent call last):
  File "gandlf_run", line 100, in <module>
    main_run(
  File "C:*****\GANDLF\cli\main_run.py", line 86, in main_run
    TrainingManager(
  File "C:*****\GANDLF\training_manager.py", line 254, in TrainingManager
    training_loop(
  File "C:*****\GANDLF\compute\training_loop.py", line 265, in training_loop
    ) = create_pytorch_objects(params, training_data, validation_data, device)
  File "C:*****\GANDLF\compute\generic.py", line 68, in create_pytorch_objects
    model = get_model(parameters)
  File "C:*****\GANDLF\models\__init__.py", line 110, in get_model
    return global_models_dictparams["model"]["architecture"]
  File "C:*****\GANDLF\models\unet.py", line 35, in __init__
    sys.exit(
TypeError: exit expected at most 1 argument, got 2

I followed your user guide and used the parameters file "config_segmentation_brats.yaml", edited according to my requirements. Can you help me? Thanks
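The last frame of the traceback shows `sys.exit` being called with two arguments, which Python rejects with a `TypeError` before the intended message is ever printed. The sketch below reproduces that pattern in isolation and shows a single-string alternative. The function names and the message wording are hypothetical, chosen only for illustration; this is not GaNDLF's code.

```python
import sys


def exit_with_two_args():
    # Hypothetical reproduction: the message parts are passed as two
    # separate arguments, which sys.exit does not accept.
    sys.exit("The patch size is not divisible by", 16)


def exit_with_one_message(divisor: int = 16):
    # Formatting everything into one string gives sys.exit a single
    # argument, which it uses as the exit message.
    sys.exit(f"The patch size is not divisible by {divisor}")


if __name__ == "__main__":
    try:
        exit_with_two_args()
    except TypeError as err:
        print(f"TypeError: {err}")

    try:
        exit_with_one_message()
    except SystemExit as err:
        print(f"SystemExit: {err.code}")
```

So the `TypeError` reported above hides the real complaint: some check on the configuration failed, and the error message for that check could not be shown.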

sarthakpati commented 1 year ago

Thanks for the question, @smcch! Can you provide your configuration file?

smcch commented 1 year ago

model.zip

sarthakpati commented 1 year ago

I think I isolated the problem. Can you please try with a patch size of [32,32,32] to see if training begins? I'll put up a PR with a fix soon.
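For readers unfamiliar with this kind of check, here is a minimal sketch of a patch divisibility test of the sort the issue title refers to. The helper name `is_patch_divisible` and the default check value of 16 are assumptions made for illustration; GaNDLF's actual implementation may differ. The point is that a patch whose every dimension equals the check value is still evenly divisible and should pass.

```python
from typing import Sequence


def is_patch_divisible(patch_size: Sequence[int], divisor: int = 16) -> bool:
    """Hypothetical helper: True if every dimension is a positive multiple of divisor."""
    # A dimension equal to the divisor itself is a valid multiple (1 x divisor),
    # so the comparison must be >= rather than >.
    return all(dim >= divisor and dim % divisor == 0 for dim in patch_size)


if __name__ == "__main__":
    for patch in ([32, 32, 32], [16, 16, 16], [30, 32, 32]):
        print(patch, is_patch_divisible(patch))
```

Under these assumptions, [32,32,32] and [16,16,16] both pass, while [30,32,32] is rejected because 30 is not a multiple of 16.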

smcch commented 1 year ago

log.txt

Now I get this error. Thanks a lot!

sarthakpati commented 1 year ago

Hey, the model should still be generated. Can you check the output folder?

smcch commented 1 year ago

This is my folder structure:

/experiment_1/
    data_dir/
    model_dir/
    output_dir/
    model.yaml
    train.csv

The command line:

python gandlf_run -c ./experiment_1/model.yaml -i ./experiment_1/train.csv -m ./experiment_1/model_dir/ -t True -d cuda

sarthakpati commented 1 year ago

What is present in the model_dir and output_dir?

smcch commented 1 year ago

model_dir:
  • densenet_best.onnx
  • densenet_best.pth

output_dir is empty

sarthakpati commented 1 year ago

These are your models, which seem to have been generated as expected. Closing this issue; please open a new one if you run into further problems. All the best!