Closed CyrilKZ closed 2 years ago
Hi @CyrilKZ
During training, we use the hard assignment. The soft assignment yields better results for evaluation.
Thanks for your reply :)
Hi, I also tested the performance of GroupViT on ADE and Cityscapes, and it is only about 6%. I don't know if I made a mistake somewhere. Why is it so low?
Hi @pzhren
In the inference pipeline, we always resize the image so that its short side is 448. Because of the patch dividing process, GroupViT may miss segments smaller than 16px. On detailed, high-resolution datasets like ADE and Cityscapes, some classes are too small for GroupViT.
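To make the scale effect above concrete, here is a rough sketch of the arithmetic. It assumes a 2048x1024 input (the standard Cityscapes resolution) and a 16px patch, and it only models a plain short-side resize; the helper names are hypothetical and are not taken from the GroupViT code, whose real pipeline may differ in details.

```python
def resize_short_side(width, height, short_side=448):
    """Return the resized (width, height) and the scale factor applied."""
    scale = short_side / min(width, height)
    return round(width * scale), round(height * scale), scale


def patch_footprint(width, height, short_side=448, patch=16):
    """Side length, in original-image pixels, covered by one patch after resizing."""
    _, _, scale = resize_short_side(width, height, short_side)
    return patch / scale


# Assumed example input: a 2048x1024 image.
w, h, scale = resize_short_side(2048, 1024)
print(w, h)                                   # 896 448
print(round(patch_footprint(2048, 1024), 1))  # 36.6
```

Under these assumptions, one patch spans roughly 36 original pixels per side, so an object much smaller than that occupies only part of a single patch.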
Hi, thank you for your great work.
I noticed that during the generation of segmentation masks, soft assignment matrices are used instead of hard assignment matrices (from segmentation/evaluation/group_vit_seg.py, line 166). Although the product of the soft assignment matrices is converted to one-hot before classifying pixels, this differs somewhat from your paper, which suggests that we should directly multiply hard assignment matrices. In fact, changing the code in line 166 from
attn_masks = attn_dict['soft']
to
attn_masks = attn_dict['hard']
makes the demo yield worse segmentation results. Am I misunderstanding the code or missing some implementation details from your paper?
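For readers following along, here is a toy sketch of why the two variants can disagree. It is not the code from group_vit_seg.py: the two-stage setup, the matrix values, and the helper names are all made up for illustration. It assumes each stage provides a row-stochastic matrix that assigns inputs to groups.

```python
import numpy as np


def one_hot_rows(a):
    """Replace each row with a one-hot vector at that row's argmax."""
    out = np.zeros_like(a)
    out[np.arange(a.shape[0]), a.argmax(axis=1)] = 1.0
    return out


def final_groups(stage_assignments, hard):
    """Chain the per-stage matrices and return one final group index per patch."""
    mats = [one_hot_rows(a) if hard else a for a in stage_assignments]
    combined = mats[0]
    for m in mats[1:]:
        combined = combined @ m
    # One-hot conversion happens only here, after the product.
    return combined.argmax(axis=1)


# Hypothetical values: 2 patches -> 3 groups -> 2 final groups.
soft_1 = np.array([[0.4, 0.3, 0.3],
                   [0.8, 0.1, 0.1]])
soft_2 = np.array([[0.9, 0.1],
                   [0.1, 0.9],
                   [0.1, 0.9]])

labels_soft = final_groups([soft_1, soft_2], hard=False)
labels_hard = final_groups([soft_1, soft_2], hard=True)
print("soft:", labels_soft.tolist())  # soft: [1, 0]
print("hard:", labels_hard.tolist())  # hard: [0, 0]
```

In this toy case the first patch is only weakly assigned to group 0 at the first stage. The hard variant commits to that choice immediately, while the soft product keeps the remaining probability mass, which ends up favouring the other final group.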