leonsick / depthg

Official implementation of the CVPR 2024 paper "Unsupervised Semantic Segmentation Through Depth-Guided Feature Correlation and Sampling"

Cityscapes Numbers #2

Open olvrhhn opened 1 month ago

olvrhhn commented 1 month ago

Dear Authors,

Thank you for your great work and for providing the code. Unfortunately, we were unable to fully reproduce the reported results on the Cityscapes dataset. We followed the provided code and instructions, but the numbers don't match what's in the paper.

When evaluating the provided checkpoint, we get: Acc: 81.41, mIoU: 22.44

When retraining the model, we get: Acc: 81.24, mIoU: 22.41

Whereas the paper reports: Acc: 81.6, mIoU: 23.1

While the pixel accuracy is close to the numbers reported in the paper, the mIoU is further off. Is there something we could have missed? Do you have any idea where this discrepancy might originate?
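For reference, this kind of gap pattern is easy to reproduce with a toy confusion matrix: mIoU averages the per-class IoU, so a small shift on rare classes moves mIoU much more than pixel accuracy. A minimal sketch (the matrices below are made up for illustration, not Cityscapes data):

```python
import numpy as np

def pixel_acc_and_miou(conf):
    """Pixel accuracy and mean IoU from a confusion matrix
    (rows = ground truth, cols = prediction)."""
    tp = np.diag(conf)
    acc = tp.sum() / conf.sum()
    iou = tp / (conf.sum(0) + conf.sum(1) - tp)
    return acc, np.nanmean(iou)

# Two hypothetical 3-class results: the dominant class is predicted
# almost identically, but a rare class shifts slightly.
a = np.array([[900, 10, 5], [10, 40, 5], [5, 5, 20]])
b = np.array([[900, 10, 5], [12, 38, 5], [8, 5, 17]])
```

Here pixel accuracy drops only about half a point (0.960 vs 0.955) while mIoU drops roughly 3.6 points (about 0.680 vs 0.644), so a mismatch showing up mostly in mIoU is consistent with a few small classes being off.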

Additionally, we had to fix a few minor issues to run the code as provided in the repo:

- `external/depthg/generate_depth.py` overwrites the dataset in place. Line 240 needs to be replaced with `path = os.path.join(folder_path, filename + '.png')` and `depth.save(path)`.
- `external/depthg/src/data.py` needs line 488 adjusted to `depth_path = join(self.depth_folder_path, subfolder, filename + ".png")`, or alternatively "zoedepth" can be added in the depth-generation script.
- `external/depthg/src/precompute_knns.py` requires setting `res` to 224 in line 49.
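For readers hitting the same errors, the path handling the fixes describe can be sketched as below (variable names follow the snippets above; the surrounding repo code is assumed, so treat this as an illustration rather than a drop-in patch):

```python
from os.path import join

def depth_save_path(folder_path, filename):
    # generate_depth.py, line 240: save each predicted depth map under
    # its own filename instead of overwriting the dataset image in place
    return join(folder_path, filename + ".png")

def depth_load_path(depth_folder_path, subfolder, filename):
    # data.py, line 488: load the depth map from the matching subfolder
    return join(depth_folder_path, subfolder, filename + ".png")

RES = 224  # precompute_knns.py, line 49: resolution for the KNN precompute
```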

Everything else involves individual adjustments in the config files, which are documented in the paper and/or the repo. I hope this helps to make the code a little easier to use.

Thank you, and best regards, Oliver

leonsick commented 1 week ago

Hi Oliver, thanks for your question, and sorry for the late response. To generate your results, did you run the `eval_segmentation.py` script? It turns on the CRF and increases the eval image size to 320x320; using it is important to reproduce the reported results.
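As a rough illustration of one detail in that eval setup: when label or prediction maps are brought to the larger eval resolution, they have to be resized with nearest-neighbor interpolation so class indices aren't blended. A minimal numpy sketch (not the repo's actual resize code, which presumably goes through the STEGO eval pipeline):

```python
import numpy as np

def resize_nearest(labels, size=(320, 320)):
    """Nearest-neighbor resize for an integer label map: pick the source
    pixel for each target pixel instead of interpolating class IDs."""
    h, w = labels.shape
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return labels[rows[:, None], cols[None, :]]
```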

I've attached a screenshot from my Weights & Biases dashboard showing the exact Cityscapes results we obtained with the `eval_segmentation.py` script. I will also double-check that I provided you with the correct checkpoint.

One thing we noticed during the project is that STEGO's performance can vary if something is off with the dependency versions or the dataset setup. Since our codebase builds on STEGO, we may have inherited these issues. I would advise making sure all requirements are installed correctly and the dataset was set up properly.

One general issue, not specific to this project, is that different GPUs can produce slightly different results. We used a 2080 Ti.
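The GPU-to-GPU variation largely comes down to floating-point arithmetic: parallel reductions on different hardware accumulate in different orders, and floating-point addition is not associative, so results can differ in the last bits. A self-contained illustration:

```python
# Floating-point addition is not associative: the grouping changes the
# rounding, so different summation orders give slightly different results.
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)
print(a == b)   # False
print(a - b)    # a tiny residual on the order of 1e-16
```

Accumulated over millions of operations during training, such residuals can nudge final metrics by a few tenths of a point.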

Hope this helps, and I'll get back to you on whether the checkpoint was the correct one. Thanks also for the other suggestions; I'll make sure to incorporate them.

Update: I just ran the script again and was able to reproduce the Cityscapes results. I uploaded the checkpoint linked in the repository again, just to make sure you now have the same one.

Best, Leon

[Screenshot: Weights & Biases results, 2024-09-18 at 12:49:53]