mhamilton723 / STEGO

Unsupervised Semantic Segmentation by Distilling Feature Correspondences
MIT License

Problems reproducing the Potsdam-3 dataset result #62

Open pitlover opened 1 year ago

pitlover commented 1 year ago

Dear Mark,

According to the paper, the backbone used for the Potsdam-3 dataset appears to be vit-base, but no specific hyperparameter values are given. In addition, train_config.yml has a comment saying "Potsdam vit small 1/31/22", but my reproduced performance is only 61%, which is lower than the result you report.

Could you please share 1) the backbone you used on Potsdam-3 (vit-base or vit-small), and 2) the specific hyperparameters (the weights and shifts for neg-inter, pos-inter, and pos-intra)?

Since performance on the Potsdam-3 dataset is very important to my work, I would greatly appreciate a reply.

Thank you.

mhamilton723 commented 1 year ago

If you load our pretrained model, you should be able to reproduce those numbers. The pretrained model also stores the specific hyperparameter values that were used, so you can verify them. Also, the CRF adds a few points of performance.
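For anyone who wants to check the stored values, here is a minimal sketch of how the hyperparameters could be pulled out of a loaded checkpoint. It assumes the checkpoint is a PyTorch Lightning-style dictionary with a `hyper_parameters` entry, possibly nested. The helper names (`flatten`, `find_hparams`), the keyword filter, and the example dictionary with its dummy values are all hypothetical and are not taken from the released model.

```python
from collections.abc import Mapping


def flatten(mapping, prefix=""):
    """Flatten nested mappings into {"outer.inner": value} pairs."""
    flat = {}
    for key, value in mapping.items():
        name = f"{prefix}{key}"
        if isinstance(value, Mapping):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


def find_hparams(checkpoint, keywords=("weight", "shift", "model_type")):
    """Return stored hyperparameters whose final key contains a keyword.

    Assumption: the checkpoint dict keeps its settings under
    "hyper_parameters", as PyTorch Lightning checkpoints do when the
    module saved its hyperparameters.
    """
    flat = flatten(checkpoint.get("hyper_parameters", {}))
    return {
        name: value
        for name, value in flat.items()
        if any(word in name.split(".")[-1] for word in keywords)
    }


# Hypothetical stand-in for a loaded checkpoint. The key names and the
# values are dummies for illustration, NOT the real Potsdam-3 settings.
example_checkpoint = {
    "state_dict": {},
    "hyper_parameters": {
        "n_classes": 3,
        "cfg": {
            "model_type": "placeholder_backbone",
            "pos_intra_weight": 1.0,
            "pos_intra_shift": 2.0,
            "batch_size": 4,
        },
    },
}

selected = find_hparams(example_checkpoint)
for name, value in sorted(selected.items()):
    print(f"{name} = {value}")
```

With a real checkpoint file, the dictionary would typically come from something like `torch.load(path, map_location="cpu")`, and the printed entries can then be compared against train_config.yml.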

pitlover commented 1 year ago

Hi @mhamilton723, could you provide a STEGO pretrained model (vit-base) for Cityscapes and Potsdam? Where can I get the weights?