wangf3014 / SCLIP

Official implementation of SCLIP: Rethinking Self-Attention for Dense Vision-Language Inference

Some questions about reproducing COCO-stuff164k results using SCLIP #2

Open taiwuyouxu opened 9 months ago

taiwuyouxu commented 9 months ago

While attempting to replicate your results on the COCO-STUFF2017 dataset, I encountered some issues that I would greatly appreciate your insight on. Firstly, I observed a significantly lower mIoU of only 0.07 in my experiments, which is quite different from the results reported in your paper. I have double-checked my implementation and followed the instructions provided, but the performance remains poor. Could you please share any insights or suggestions that might help me improve the performance?

Additionally, I noticed that when using the “--show-dir” parameter in the evaluation file, the results are not being saved in the specified directory. I have tried troubleshooting this issue, but I am unable to resolve it. Have you encountered this problem in the past, and if so, how did you address it?

wangf3014 commented 9 months ago

Hi, 0.07 mIoU indicates a misconfiguration. Did you modify cfg_coco_stuff164k.py or cls_coco_stuff.txt? If the issue persists, please share your configuration files so I can pinpoint the problem.
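As a quick sanity check before digging deeper (the path and file layout below are my guesses; adjust them to your checkout), you could confirm the class-name file still contains one entry per COCO-Stuff class:

```python
# Assumptions: cls_coco_stuff.txt sits alongside cfg_coco_stuff164k.py in ./configs/
# and lists one class name per line (possibly with comma-separated synonyms).
with open('./configs/cls_coco_stuff.txt') as f:
    names = [line.strip() for line in f if line.strip()]
print(f'{len(names)} class entries (COCO-Stuff 164k should have 171)')
```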

Also note that for datasets without a background class, you should not set prob_thd. This parameter makes pixels whose top class probability falls below prob_thd get classified as background.
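Roughly, the thresholding behaves like the sketch below (a minimal illustration under my assumptions about tensor shapes, not the repo's exact code; the names apply_prob_threshold, seg_probs, and bg_idx are made up):

```python
import torch

def apply_prob_threshold(seg_probs: torch.Tensor, prob_thd: float, bg_idx: int = 0) -> torch.Tensor:
    """seg_probs: (num_classes, H, W) per-pixel class probabilities."""
    max_prob, pred = seg_probs.max(dim=0)   # per-pixel best class and its probability
    pred[max_prob < prob_thd] = bg_idx      # low-confidence pixels become background
    return pred
```

On a dataset with no background class, any nonzero prob_thd silently reassigns pixels to class 0, which can collapse the mIoU in exactly the way you observed.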

--show-dir is for visualization results, and I have disabled it in this repo. If you would like to see visualizations, call "trigger_visualization_hook(cfg, args)" before building the runner (i.e., add it at line 61 in eval.py); the qualitative results will then be saved under --show-dir. Note that this will slow down inference.
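The placement would look roughly like this (a sketch assuming eval.py follows the MMEngine-style structure of mmseg's tools/test.py; exact line numbers and variable names may differ):

```python
# In eval.py, just before the runner is built:
if args.show_dir:
    cfg = trigger_visualization_hook(cfg, args)  # enable saving qualitative results to --show-dir

runner = Runner.from_cfg(cfg)  # build the runner as before and run evaluation
runner.test()
```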

The results and logs are saved in --work-dir, which defaults to "./work_logs/".