facebookresearch / swav

PyTorch implementation of SwAV https://arxiv.org/abs/2006.09882

Empty clusters? #42

Closed · daniiki closed this issue 3 years ago

daniiki commented 3 years ago

Hi @mathildecaron31

I trained a network from scratch on my own dataset and wrote some code that sorts the images into different folders according to their cluster assignments. I did this with the following lines of code:

import numpy as np
import torch.nn.functional as F

embedding, output = model(inputs)  # output: prototype scores, shape (batch, n_prototypes)
p = F.softmax(output / args.temperature, dim=1)  # softmax over the prototype dimension
prediction = p.tolist()
prototyp = []
for i in range(len(prediction)):
    prototyp.append(np.argmax(prediction[i]))  # index of the most likely prototype
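
As an aside (not part of the original comment): since the argmax is unchanged by the softmax and by temperature scaling, the same hard assignments can be obtained in one vectorized line. A minimal sketch, assuming output is the batch-by-prototypes score tensor from the forward pass above:

import torch

# same hard assignments as the loop above, vectorized over the batch;
# softmax and temperature scaling do not change which index is largest
prototyp = torch.argmax(output, dim=1).tolist()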

The problem is that when I save the images into different folders according to their cluster assignment, some folders remain empty. The number of folders equals the number of prototypes. I always thought that the images would be distributed roughly equally among the prototypes. What is the problem? Can you help me?

mathildecaron31 commented 3 years ago

Hi @daniiki,

Have you been training deepclusterv2 or swav?

For swav, if the batch size is smaller than the number of prototypes, it is possible that some clusters remain unused.

For deepclusterv2, we do not enforce equipartition, so it is possible that some clusters are empty. If this is a problem for you, you can add constraints, for example reassigning the empty clusters during training.
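
To make that last suggestion concrete, here is a minimal sketch of one classic reassignment heuristic. This is not code from the SwAV repository; centroids and assignments are hypothetical tensors holding the current cluster centers and the hard assignments over the dataset. An empty cluster simply inherits a slightly perturbed copy of the largest cluster's center.

import torch

def reassign_empty_clusters(centroids, assignments, eps=1e-4):
    # centroids: (n_clusters, dim) cluster center / prototype matrix
    # assignments: (n_samples,) hard cluster indices over the dataset
    n_clusters = centroids.shape[0]
    counts = torch.bincount(assignments, minlength=n_clusters)
    largest = counts.argmax()
    for k in torch.where(counts == 0)[0]:
        # give the empty cluster a slightly perturbed copy of the largest cluster's centroid
        centroids[k] = centroids[largest] + eps * torch.randn_like(centroids[largest])
    return centroids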

daniiki commented 3 years ago

Thanks for the quick response! I'm using swav. I understand that empty clusters can occur when the number of prototypes is larger than the batch size, but I thought the queue was introduced to solve this problem? I use the queue starting after epoch 15, as mentioned in your paper.

mathildecaron31 commented 3 years ago

When using the queue, the queue features and the batch features are assigned together. I am not sure I understand your question.
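
For context, a rough single-GPU sketch of what "assigned together" means here (variable names are illustrative, and the Sinkhorn routine below is heavily simplified compared to distributed_sinkhorn in main_swav.py): the prototype scores of the queued features are concatenated with the scores of the current batch, the joint matrix goes through Sinkhorn-Knopp, and only the rows belonging to the current batch are kept as codes.

import torch
import torch.nn.functional as F

def sinkhorn(scores, eps=0.05, n_iters=3):
    # minimal sketch of the Sinkhorn-Knopp normalization used for the assignments
    Q = torch.exp(scores / eps).t()            # (n_prototypes, n_samples)
    Q /= Q.sum()
    K, B = Q.shape
    for _ in range(n_iters):
        Q /= Q.sum(dim=1, keepdim=True) * K    # normalize over samples for each prototype
        Q /= Q.sum(dim=0, keepdim=True) * B    # normalize over prototypes for each sample
    return (Q * B).t()                         # (n_samples, n_prototypes) soft codes

# toy shapes: batch of 32, queue of 256 stored embeddings, 128-d features, 3000 prototypes
bs, queue_len, dim, n_proto = 32, 256, 128, 3000
prototypes = F.normalize(torch.randn(n_proto, dim), dim=1)       # prototype vectors
queue_feats = F.normalize(torch.randn(queue_len, dim), dim=1)    # embeddings stored from past batches
out = F.normalize(torch.randn(bs, dim), dim=1) @ prototypes.t()  # scores of the current batch

with torch.no_grad():
    queue_scores = queue_feats @ prototypes.t()         # scores for the queued features
    all_scores = torch.cat((queue_scores, out), dim=0)  # queue and batch are assigned together
    q = sinkhorn(all_scores)[-bs:]                       # keep the codes of the current batch only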

mathildecaron31 commented 3 years ago

Closing due to inactivity. Feel free to reopen if needed.

shijianjian commented 2 years ago

For swav, if the batch size is smaller than the number of prototypes, it is possible that some clusters remain unused.

For deepclusterv2, we do not enforce equipartition, so it is possible that some clusters are empty. If this is a problem for you, you can add constraints, for example reassigning the empty clusters during training.

I am using SwAV to divide a dataset into two clusters (with batch size 256), and I obtained a very small clustering loss, which I would take to mean the model has trained properly. However, when I performed some cluster analysis on the output tensor, I found the outputs are very similar across the whole dataset and all images fall into the same cluster (the code is the same as above). Am I doing something wrong?
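
A small diagnostic sketch for this kind of situation (not from the thread; output stands in for the prototype scores collected over the whole dataset, and 0.1 is only an illustrative temperature): counting the hard assignments per prototype and looking at the average confidence makes a collapse onto a single cluster easy to spot.

import torch
import torch.nn.functional as F

# placeholder scores; replace with the (n_samples, 2) prototype scores collected over your dataset
output = torch.randn(1000, 2)

probs = F.softmax(output / 0.1, dim=1)          # 0.1: illustrative temperature
assignments = probs.argmax(dim=1)
counts = torch.bincount(assignments, minlength=2)
print("images per cluster:", counts.tolist())   # one near-zero count indicates collapse
print("mean max probability:", probs.max(dim=1).values.mean().item())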