facebookresearch / swav

PyTorch implementation of SwAV https://arxiv.org/abs/2006.09882

update of prototypes #48

Closed RGring closed 3 years ago

RGring commented 3 years ago

Hi Mathilde, in your SwAV paper, I understand that both the backbone and the prototypes are updated during training.

Therefore, I was wondering why you call embeddings.detach() (https://github.com/facebookresearch/swav/blob/master/main_swav.py#L291) in your script. I thought that when a tensor is detached, no gradients are back-propagated through it.
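For concreteness, here is a minimal, self-contained snippet (not from the repo) showing what `detach()` does: the detached tensor shares data with the original but is cut out of the autograd graph, so no gradient flows through it.

```python
import torch

x = torch.randn(4, 8, requires_grad=True)
w = torch.randn(8, 3, requires_grad=True)

y = x @ w                # part of the autograd graph
z = y.detach()           # same data as y, but cut from the graph

loss = (y ** 2).sum()
loss.backward()          # gradients reach x and w through y

print(x.grad.shape)      # torch.Size([4, 8]): gradient flowed through y
print(z.requires_grad)   # False: nothing backpropagates through z
```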

Thanks in advance for your help!

mathildecaron31 commented 3 years ago

Hi @RGring

The embedding tensor (B x 128) holds the normalized feature-space vectors for all views in the batch. It is only used to fill the queue, so it does not need to take part in gradient computation.
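To illustrate the pattern, here is a minimal sketch with made-up names and sizes (not the repo's exact code): the prototype weights still receive gradients through the scores computed on the live embedding, while the queue only stores detached copies.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Made-up sizes: B views in the batch, D-dim features, K prototypes, queue length Q
B, D, K, Q = 32, 128, 3000, 3840

prototypes = nn.Linear(D, K, bias=False)   # trainable prototype vectors
queue = torch.zeros(Q, D)                  # stored embeddings; plain data, no graph

# Normalized embeddings coming out of the backbone (randn stands in for it here)
embedding = F.normalize(torch.randn(B, D, requires_grad=True), dim=1)

# Scores used by the loss: gradients flow back to the backbone
# *and* into prototypes.weight through this path
output = prototypes(embedding)

# The queue is filled with detached copies: it is only stored data,
# so it must not keep old computation graphs alive
queue = torch.cat((embedding.detach(), queue[:-B]), dim=0)

# When the queue is consumed later, scores are recomputed against the
# *current* prototypes, so prototype gradients never depend on the
# stored embeddings themselves
queue_scores = queue @ prototypes.weight.t()
```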

Hope that helps

RGring commented 3 years ago

True, my bad. I was confusing the two!