Reproducing the ViT-Base results on Cocostuff27

MY-LIU100101 commented 2 years ago

Thank you so much for your excellent and inspiring work!!!

I could reproduce the reported performance using your pre-trained model. However, I failed to reproduce it when re-training your models with the latest code. Could you please help me find out whether I did something wrong?

What I did is as follows:

1. Changes to the original code (I do not think they affect performance):

1.1 To avoid a core dump during training, I replaced "import matplotlib.pyplot as plt" with:

    import matplotlib
    matplotlib.use('Agg')
    from matplotlib import pyplot as plt

1.2 In "eval_segmentation.py", I changed the multiprocessing Pool used for the CRF to single-process execution, because the program gets stuck on my computer for unknown reasons.
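A minimal sketch of the kind of change described in 1.2, assuming the CRF step is a function applied independently to each image. The names `postprocess_one` and `run_postprocess` are hypothetical and are not taken from the STEGO code; the stand-in function only doubles its input so the example stays self-contained.

```python
from multiprocessing import Pool


def postprocess_one(item):
    """Hypothetical stand-in for the per-image CRF step; it just doubles the input."""
    return item * 2


def run_postprocess(items, num_workers=0):
    """Apply postprocess_one to every item, serially unless num_workers > 1."""
    if num_workers > 1:
        # Parallel path: same results as the serial path, but it relies on
        # worker processes, which is where a hang can occur on some machines.
        with Pool(num_workers) as pool:
            return pool.map(postprocess_one, items)
    # Serial path: slower, but no worker processes are involved.
    return [postprocess_one(item) for item in items]


if __name__ == "__main__":
    print(run_postprocess([1, 2, 3]))
```

Since both paths apply the same function to the same inputs in the same order, switching to the serial path should only change the run time, not the evaluation numbers.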

2. Reproducing cocostuff27 with ViT-Small, five-crop. (I got similar performance. Thank you so much, it is great work!!!)

2.1. In "train_config.yml", use the "vit_small" model and the hyperparameters under "Cocostuff27 vit small 1/31/22".

2.2. Run "crop_datasets.py" after changing "dataset_names" to ["cocostuff27"] to get the cropped dataset.

2.3. Run "precompute_knns.py" after changing "dataset_names" to ["cocostuff27"] to get the neighbors.

2.4. Run "train_segmentation.py".

2.5. Run "eval_segmentation.py" after changing "eval_config.yml": set "model_paths" to the correct checkpoint and "run_picie" to False. I get:

{'final/linear/mIoU': 38.03836703300476, 'final/linear/Accuracy': 74.07384514808655, 'final/cluster/mIoU': 23.345062136650085, 'final/cluster/Accuracy': 46.15441858768463}
{'final/linear/mIoU': 37.097787857055664, 'final/linear/Accuracy': 73.81566762924194, 'final/cluster/mIoU': 23.430554568767548, 'final/cluster/Accuracy': 47.467902302742004}

**3. Reproducing cocostuff27 with ViT-Base, five-crop. (I failed)**

Based on the above changes:

3.1. In "train_config.yml", use the "vit_base" model and the hyperparameters under "Cocostuff27 10/3 vit_base".

3.2. Run "precompute_knns.py" after changing "dataset_names" to ["cocostuff27"] to get the neighbors.

3.3. Run "train_segmentation.py". I get:

Attempted to log scalar metric test/linear/mIoU:
33.971819281578064
Attempted to log scalar metric test/linear/Accuracy:
72.1398413181305
Attempted to log scalar metric test/cluster/mIoU:
19.022752344608307
Attempted to log scalar metric test/cluster/Accuracy:
43.240439891815186

3.4. Run "eval_segmentation.py" after changing "eval_config.yml": set "model_paths" to the correct checkpoint and "run_picie" to False. I get:

{'final/linear/mIoU': 36.794888973236084, 'final/linear/Accuracy': 72.87865877151489, 'final/cluster/mIoU': 20.34243792295456, 'final/cluster/Accuracy': 47.14389741420746}
{'final/linear/mIoU': 35.19098460674286, 'final/linear/Accuracy': 74.07739758491516, 'final/cluster/mIoU': 21.089258790016174, 'final/cluster/Accuracy': 48.37090075016022}

3.5. I also tried different random seeds for training.

seed = 1

{'final/linear/mIoU': 37.76582181453705, 'final/linear/Accuracy': 74.69892501831055, 'final/cluster/mIoU': 19.050808250904083, 'final/cluster/Accuracy': 42.96903908252716}
{'final/linear/mIoU': 36.56575679779053, 'final/linear/Accuracy': 74.39007759094238, 'final/cluster/mIoU': 18.84729117155075, 'final/cluster/Accuracy': 44.03853118419647}

seed = 2

{'final/linear/mIoU': 38.32502365112305, 'final/linear/Accuracy': 75.02520084381104, 'final/cluster/mIoU': 19.779230654239655, 'final/cluster/Accuracy': 46.16449475288391}
{'final/linear/mIoU': 38.56886327266693, 'final/linear/Accuracy': 74.82376098632812, 'final/cluster/mIoU': 20.10801136493683, 'final/cluster/Accuracy': 51.235431432724}
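To make the spread across these ViT-Base runs easier to see, here is a small sketch that pools the six result lines from 3.4 and 3.5 and reports the mean and sample standard deviation of the two mIoU metrics. The values are copied from the lines above and rounded to four decimals. What distinguishes the two lines printed per run is not stated here, so pooling all six is an assumption made for illustration; `vit_base_runs` and `summarize` are hypothetical names.

```python
from statistics import mean, stdev

# mIoU values copied from the six ViT-Base result lines above (rounded to 4 decimals).
# Pooling both lines of each run is an assumption made for illustration only.
vit_base_runs = [
    {"final/linear/mIoU": 36.7949, "final/cluster/mIoU": 20.3424},  # 3.4, line 1
    {"final/linear/mIoU": 35.1910, "final/cluster/mIoU": 21.0893},  # 3.4, line 2
    {"final/linear/mIoU": 37.7658, "final/cluster/mIoU": 19.0508},  # seed = 1, line 1
    {"final/linear/mIoU": 36.5658, "final/cluster/mIoU": 18.8473},  # seed = 1, line 2
    {"final/linear/mIoU": 38.3250, "final/cluster/mIoU": 19.7792},  # seed = 2, line 1
    {"final/linear/mIoU": 38.5689, "final/cluster/mIoU": 20.1080},  # seed = 2, line 2
]


def summarize(runs, key):
    """Return (mean, sample standard deviation) of one metric across runs."""
    values = [run[key] for run in runs]
    return mean(values), stdev(values)


if __name__ == "__main__":
    for key in ("final/linear/mIoU", "final/cluster/mIoU"):
        m, s = summarize(vit_base_runs, key)
        print(f"{key}: mean={m:.2f}, std={s:.2f}")
```

Comparing the standard deviation with the gap to the target number gives a rough sense of whether a shortfall could be seed-to-seed variance or points to a systematic difference in the setup.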

Could you please help me find the problem at your convenience? Thank you so much in advance!!!

deta5 commented 2 years ago

Hello, I also can't reproduce the results using ViT-Base. Have you solved this problem?

MY-LIU100101 commented 2 years ago

Sorry, I have not reproduced it yet. Would you like to share the performances you got, or some details, so that we may find out whether we missed something?

deta5 commented 2 years ago

These are my results using ViT-Base from running "eval_segmentation.py": {'final/linear/mIoU': 40.18509089946747, 'final/linear/Accuracy': 75.30452013015747, 'final/cluster/mIoU': 27.281320095062256, 'final/cluster/Accuracy': 56.38043284416199}

However, there is still a gap with the results in the paper, and I also could not reproduce the ViT-Small results well earlier.

mhamilton723 commented 1 year ago

Hey @MY-LIU100101, did you change the batch size at all? Also, when you evaluate the pre-trained model I provided, are you able to reproduce the results? There's some natural variance in these numbers, and you might also be hitting that. I will try to release some new training procedures soon to make this a bit less flaky.

MY-LIU100101 commented 1 year ago

Yes, I could reproduce your results using your pre-trained model. However, the performance of STEGO trained from scratch seems slightly lower than that of the provided model. Anyway, STEGO is an excellent and inspiring work. Thank you very much for your hard work.

axkoenig commented 1 year ago

Hi folks, congrats on the great paper! To add to the discussion, I'd like to share that we are publishing a follow-up study on STEGO in CVPR 23 Workshops, which also looks at some reproducibility aspects! :) Cheers, Alex
