matteo-bastico / MI-Seg

Independent Multi-Modal Segmentation
Apache License 2.0

Results of the run on "A Simple and Robust Framework for Cross-Modality Medical Image Segmentation applied to Vision Transformers" #5

Closed. NLPgameplayer closed this issue 4 months ago.

NLPgameplayer commented 4 months ago

I have read your article carefully, and thank you for your contribution to the field of medical image segmentation. When I tried to reproduce your code, I found that the results of the run were very different from the results in the paper (see the attached image). The results were obtained by executing the following command after preprocessing the data:

python train.py --model_name=swin_unetr --out_channels=9 --feature_size=48 --num_heads=3 --accelerator=gpu --devices=1 --max_epochs=2500 --encoder_norm_name=instance_cond --vit_norm_name=instance_cond --lr=1e-4 --batch_size=1 --patches_training_sample=1 --wandb_mode=offline

The data used were the 10 CT and 16 MR volumes mentioned in the paper. Since training stops early at epoch 50, my guess was that continued training would give better results, so I commented out the early-stopping code, but the results are still not good. I also trained on unimodal MR images, and the results were only about twenty percent. Is my training method wrong, or is some parameter not set correctly? Please help me correct it, thanks a lot!

matteo-bastico commented 4 months ago

Hello,

Did you try to use the provided pre-trained weights?

Our results were obtained through hyper-parameter tuning, running the following script multiple times (each run launches 4 trials):

python -u tune.py --num_workers=2 --out_channels=8 --no_include_background --criterion=dice_focal --scheduler=warmup_cosine --entity=phd-matteo --project=MM-WHS --wandb_mode=offline --model_name=swin_unetr --vit_norm_name=instance_cond --encoder_norm_name=instance_cond --timeout=70000 --n_trials=4 --study_name=full-c-swin-unetr_original_fold2 --max_epochs=2500 --check_val_every_n_epoch=50 --batch_size=1 --patches_training_sample=4 --iters_to_accumulate=4 --cycles=0.5 --storage_name=MM-WHS --min_lr=1e-5 --max_lr=1e-3 --data_dirs dataset/MM-WHS dataset/MM-WHS --json_lists MR_fold.json CT_fold2.json --default_root_dir=./experiments/MM-WHS --num_heads=3 --feature_size=48 --downsample=mergingv2 --depth_swin_block 2 --port=23456
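The script above selects `--scheduler=warmup_cosine` with `--min_lr=1e-5`, `--max_lr=1e-3`, and `--cycles=0.5`. As a rough illustration of what such a schedule does, here is a minimal sketch of one common warmup-cosine variant; the function name and the exact formula used inside `tune.py` may differ, so treat this as an assumption rather than the repository's implementation:

```python
import math

def warmup_cosine_lr(step, total_steps, warmup_steps, min_lr, max_lr, cycles=0.5):
    """Sketch of a warmup-cosine schedule (hypothetical helper, not from the repo)."""
    if step < warmup_steps:
        # Linear warmup from 0 up to max_lr.
        return max_lr * step / max(1, warmup_steps)
    # Cosine decay from max_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * 2.0 * cycles * progress))
    return min_lr + (max_lr - min_lr) * cosine

# With cycles=0.5 the LR peaks at max_lr right after warmup and
# decays to min_lr by the final step.
peak = warmup_cosine_lr(100, 1000, 100, 1e-5, 1e-3)   # 1e-3
final = warmup_cosine_lr(1000, 1000, 100, 1e-5, 1e-3)  # 1e-5
```

With `cycles=0.5` the cosine completes half a period, so the learning rate decreases monotonically after warmup instead of oscillating.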

Here you can see some of the parameter configurations that led to the best results:

[Screenshot from 2024-04-22 16-11-16: best parameter configurations]

You should try increasing batch_size or patches_training_sample (we ran the code in parallel on 4 GPUs with 4 patches per GPU, so 16 patches per iteration in total) and experimenting with different learning rates.
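On a single GPU, gradient accumulation (the `--iters_to_accumulate` flag in the tuning command) is one way to approach the same effective number of patches per optimizer step. The helper below is hypothetical, written only to make the arithmetic explicit; it is not part of the repository:

```python
def effective_patches_per_step(batch_size, patches_per_sample,
                               num_devices, iters_to_accumulate=1):
    """Patches contributing to one optimizer step (illustrative helper)."""
    return batch_size * patches_per_sample * num_devices * iters_to_accumulate

# Authors' setup: 4 GPUs, 4 patches per GPU -> 16 patches per step.
authors = effective_patches_per_step(1, 4, 4, 1)       # 16
# The reported single-GPU run: 1 patch per step.
reported = effective_patches_per_step(1, 1, 1, 1)      # 1
# A single-GPU approximation: 4 patches, accumulate over 4 iterations.
single_gpu = effective_patches_per_step(1, 4, 1, 4)    # 16
```

Note that accumulation matches the gradient averaging but not everything else (e.g., batch-dependent normalization statistics), so results may still differ slightly from the multi-GPU runs.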

NLPgameplayer commented 4 months ago

This has inspired me a lot, and I spent today reading through your code. I only have one GPU at the moment, so I plan to replicate your work on a single GPU and see how the results compare between our models. Top marks for your code being so well organised!

matteo-bastico commented 4 months ago

Thank you so much!