mhamilton723 / STEGO

Unsupervised Semantic Segmentation by Distilling Feature Correspondences
MIT License
711 stars, 142 forks

I cannot reproduce vit-small@cocostuff27 using the default config #63

Open rayleizhu opened 1 year ago

rayleizhu commented 1 year ago

Hi, thanks for open-sourcing this elegant codebase.

As a starting point, I am trying to reproduce the results in your paper. However, with vit-small@cocostuff27 I only get 18.13/30.59 mIoU (no CRF) for the cluster/linear probes respectively, while the paper reports 24.0/38.4 mIoU (no CRF). I did not change anything in the code; I only set the 6 hyperparameters to the values reported in the paper.

How can I reproduce the performance mentioned in the paper?

mhamilton723 commented 1 year ago

If you load our pretrained model, you should be able to reproduce those numbers. The pretrained model also stores the specific hyperparameter values that were used, so you can verify them against your config. Also, the CRF adds a few points of performance on top.
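
To make the "verify" step concrete, here is a minimal sketch of how the comparison could be done once both sets of hyperparameters are available as plain dicts. The helper `diff_hparams`, the key names, and the values are hypothetical placeholders, not STEGO's actual settings. How to get the dict out of the checkpoint is also an assumption: for a PyTorch Lightning checkpoint the saved values are typically under the `hyper_parameters` key of the dict returned by `torch.load`, but nothing in this thread confirms the checkpoint format.

```python
from typing import Any, Dict, Mapping, Tuple

MISSING = "<missing>"  # marker for a key present on only one side


def _is_number(value: Any) -> bool:
    # bool is a subclass of int, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def diff_hparams(
    reference: Mapping[str, Any],
    local: Mapping[str, Any],
    tol: float = 1e-9,
) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (reference_value, local_value)} for every key that differs.

    Numeric values are compared with an absolute tolerance; everything
    else is compared with ==. Keys missing on one side are reported too.
    """
    diffs: Dict[str, Tuple[Any, Any]] = {}
    for key in sorted(set(reference) | set(local)):
        ref_val = reference.get(key, MISSING)
        loc_val = local.get(key, MISSING)
        if _is_number(ref_val) and _is_number(loc_val):
            same = abs(ref_val - loc_val) <= tol
        else:
            same = ref_val == loc_val
        if not same:
            diffs[key] = (ref_val, loc_val)
    return diffs


if __name__ == "__main__":
    # Placeholder values only; these are NOT the real hyperparameters.
    from_checkpoint = {"loss_weight": 0.5, "loss_shift": 0.25, "crop_type": "five"}
    from_my_config = {"loss_weight": 0.5, "loss_shift": 0.3, "batch_size": 16}
    for name, (ref, mine) in diff_hparams(from_checkpoint, from_my_config).items():
        print(f"{name}: checkpoint={ref!r} local={mine!r}")
```

Any key printed by this script is a setting where the local run differs from the one stored with the pretrained model, which is a reasonable first place to look when the mIoU gap cannot be explained by the CRF alone.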