Closed dyh127 closed 11 months ago
Hi @dyh127,
Your concern is that, for one slot, the feature may also come from pixels that correspond to other slots. Note that although we use soft masks for pooling the pixels into slots, the masks are actually quite sharp (close to one-hot), so the feature of one slot is quite consistent (if it were not, the similarity/attention value for that pixel would be small). If you find this is an issue, you can try a lower (sharper) temperature for the soft masks (currently 0.07), or you can try using hard masks directly (Gumbel-Softmax would be needed in that case to keep the assignment differentiable).
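To illustrate the temperature's effect on mask sharpness, here is a minimal NumPy sketch (variable names are illustrative, not the repo's): the per-pixel softmax over slots becomes nearly one-hot as the temperature drops toward 0.07.

```python
import numpy as np

def soft_masks(sim, tau):
    """Softmax over slots for each pixel; lower tau -> sharper (closer to one-hot) masks."""
    logits = sim / tau
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=-1, keepdims=True)

# toy pixel-to-slot similarities: 2 pixels x 3 slots
sim = np.array([[0.9, 0.1, 0.0],
                [0.2, 0.8, 0.1]])

sharp = soft_masks(sim, tau=0.07)  # near one-hot: winning slot gets ~all the weight
smooth = soft_masks(sim, tau=1.0)  # much softer: weight spread across slots
```

At tau=0.07 the first pixel assigns essentially all its weight to slot 0, while at tau=1.0 the same similarities give a diffuse assignment; this is why the pooled slot features stay consistent despite the masks being nominally soft.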
Hi Xin,
Thanks for the great and insightful work.
When reading the code, I am confused by the label generation for the contrastive learning of slots. As shown in https://github.com/CVMI-Lab/SlotCon/blob/main/models/slotcon.py#L186, slots with the same index are treated as positive pairs, but these slots are generated by masked pooling from features, and their indexes may not correspond to semantic classes. Maybe I have missed something.
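To restate the mechanism I am asking about, here is a minimal NumPy sketch (names are illustrative, not the repo's): slots are obtained by mask-weighted pooling of pixel features, and the contrastive labels simply pair slot i from one view with slot i from the other, on the assumption that both views' masks are produced against the same shared set of prototypes, so index i refers to the same prototype in both views.

```python
import numpy as np

def pool_slots(feats, masks):
    """Mask-weighted pooling: feats (N, D), masks (N, K) -> slots (K, D)."""
    w = masks / (masks.sum(axis=0, keepdims=True) + 1e-8)  # normalize per slot
    return w.T @ feats

rng = np.random.default_rng(0)
feats_v1 = rng.normal(size=(16, 8))  # pixel features, view 1
feats_v2 = rng.normal(size=(16, 8))  # pixel features, view 2
masks = rng.random(size=(16, 4))     # shared slot assignments (simplification)

slots_v1 = pool_slots(feats_v1, masks)
slots_v2 = pool_slots(feats_v2, masks)

# positives are slots with the same index across the two views
labels = np.arange(slots_v1.shape[0])
```

Under this reading, the index is tied to a prototype rather than to a dataset-level semantic class, which is the point of my question.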
Looking forward to your reply!