LeslieTrue / CPP

This is the official implementation for Image Clustering via the Principle of Rate Reduction in the Age of Pretrained Models.

"logits = z" in main_efficient.py #2

Closed by mengxianghan123 10 months ago

mengxianghan123 commented 11 months ago

Thanks for your GREAT WORK!!

But it seems that when training on ImageNet, the cluster head's output is never used. In main_efficient.py:

with autocast(enabled=True):
    z, logits = model(x)
    logits = z
    self_coeff = (logits @ logits.T).abs().unsqueeze(0)

Could you please explain why this is done? Thanks a lot.
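
To make the observation concrete, here is a minimal sketch of what the quoted lines compute once `logits` is overwritten with `z`. It uses plain Python lists instead of PyTorch tensors so it runs without dependencies. The helper `abs_gram` and all numeric values are hypothetical and chosen only for illustration; they are not taken from the repository.

```python
def abs_gram(rows):
    """Plain-Python stand-in for (rows @ rows.T).abs(): absolute pairwise dot products."""
    return [[abs(sum(a * b for a, b in zip(r, s))) for s in rows] for r in rows]

# Hypothetical outputs of model(x) for three samples (made-up values).
z = [[1.0, 0.0], [0.0, -1.0], [-1.0, 0.0]]      # feature-head output
logits = [[0.2, 0.8], [0.5, 0.5], [0.9, 0.1]]   # cluster-head output

logits = z                        # the line from the issue: cluster-head output is discarded
self_coeff = [abs_gram(logits)]   # outer list mirrors .unsqueeze(0), i.e. shape (1, N, N)

print(self_coeff)
# [[[1.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 1.0]]]
```

In this sketch, `self_coeff` depends only on `z`: changing the values originally assigned to `logits` has no effect on the result, which is the behaviour being asked about.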

LeslieTrue commented 10 months ago

Thanks for your question, and sorry for the late reply. This is a purely empirical engineering choice: we found that sharing the feature head and cluster head is effective in alleviating training collapse on large-scale datasets. However, it also indicates that the current training strategy is suboptimal. If you are interested in pushing the performance, I would suggest:

mengxianghan123 commented 10 months ago

OK! Thanks for your helpful suggestions!