facebookresearch / swav

PyTorch implementation of SwAV: https://arxiv.org/abs/2006.09882

Questions about some Implementations #52

Closed omg777 closed 3 years ago

omg777 commented 3 years ago

Hi, thank you for sharing the nice paper and code. I have some questions about the implementation.

  1. How can I run with "hard assignments" not "soft assignments"?
  2. In 'main_swav.py', is the embedding and output dimension of 256 determined from 8 views * batch size (32)?
        embedding, output = model(inputs) # [256, 128]
        embedding = embedding.detach() # [256, 3000]

Thanks in advance!

mathildecaron31 commented 3 years ago

Hi @omg777

Sorry for the delayed reply.

  1. You can apply an argmax to the soft assignments to convert them into hard assignments. For example, you can add q = hard_assign(q) here https://github.com/facebookresearch/swav/blob/101836619ab5bf026f960d3c5a33869a1f1bd629/main_swav.py#L319

    def hard_assign(q):
        # replace each row of q with a one-hot vector at its argmax
        y = torch.argmax(q, dim=1, keepdim=True)
        hard = torch.zeros_like(q)
        hard.scatter_(1, y, 1.0)
        return hard
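Applied to a toy 2x3 assignment matrix, the conversion behaves like this (a self-contained sketch that restates such a helper; the toy values are not from the repo):

```python
import torch

def hard_assign(q):
    # replace each row of q with a one-hot vector at its argmax
    y = torch.argmax(q, dim=1, keepdim=True)
    hard = torch.zeros_like(q)
    hard.scatter_(1, y, 1.0)
    return hard

q = torch.tensor([[0.1, 0.7, 0.2],
                  [0.5, 0.3, 0.2]])
print(hard_assign(q))
# tensor([[0., 1., 0.],
#         [1., 0., 0.]])
```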
  2. Exactly. Each instance in the batch has 8 views, so for a batch size of 32 you get an effective number of 8*32 = 256 views. If we call x and y two instances and x_1, x_2, ..., x_8 the 8 views of instance x, then the embedding and output tensors are organized as follows: [x_1, y_1, ..., x_2, y_2, ..., ..., x_8, y_8]
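To illustrate that view-major layout, here is a sketch (the tensors are toy stand-ins, not the repo's dataloader): the views are concatenated along the batch dimension, one view at a time, giving the 256-row tensors from the question:

```python
import torch

batch_size, nmb_views, dim = 32, 8, 128
# one [batch_size, dim] tensor per view; each row is filled with its view index
views = [torch.full((batch_size, dim), float(v)) for v in range(nmb_views)]
stacked = torch.cat(views, dim=0)  # [nmb_views * batch_size, dim] = [256, 128]
print(stacked.shape)               # torch.Size([256, 128])
# row b + v * batch_size holds view v of instance b
assert stacked[2 * batch_size + 1, 0].item() == 2.0  # view 2 of instance 1
```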

Hope that helps

mathildecaron31 commented 3 years ago

Closing due to inactivity. Please reopen if you need further assistance.