CVMI-Lab / SimGCD

(ICCV 2023) Parametric Classification for Generalized Category Discovery: A Baseline Study
https://arxiv.org/abs/2211.11727
MIT License

About initialized prototypes #1

Closed · Luffy03 closed this 1 year ago

Luffy03 commented 1 year ago

First, thanks for your inspiring work! I am confused about the randomly initialized prototypes introduced in Eq. (4) of Section 4.2. Would you please provide more details about this part? Without labels available, would low-quality generated prototypes hinder performance? And where is the code for generating these prototypes?

xwen99 commented 1 year ago

Hi @Luffy03,

Interestingly, we found the pseudo labels are not that bad at all. For one thing, there are still labels from the labelled subset. For another, the mean-entropy maximisation regulariser helps calibrate the overall predictions and avoids trivial solutions. Further, the teacher temperature warmup strategy lowers the confidence of the pseudo labels at early stages, which mitigates the unreliable pseudo labels to some extent. You may check the ablation study table for details on the effect of these modules.
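For reference, here is a minimal sketch of these two ingredients in PyTorch. The temperatures, epoch counts, and function name are illustrative assumptions, not the repo's exact configuration:

```python
import numpy as np
import torch
import torch.nn.functional as F

# Teacher temperature warmup: a higher temperature early on yields softer,
# less confident pseudo labels; it is annealed to the final, sharper value.
warmup_teacher_temp, teacher_temp = 0.07, 0.04   # illustrative values
warmup_epochs, total_epochs = 30, 200            # illustrative values
teacher_temp_schedule = np.concatenate((
    np.linspace(warmup_teacher_temp, teacher_temp, warmup_epochs),
    np.full(total_epochs - warmup_epochs, teacher_temp),
))

def mean_entropy_reg(student_logits, temp=0.1):
    """Mean-entropy maximisation: push the *batch-averaged* prediction
    towards high entropy, penalising collapse onto a few clusters."""
    probs = F.softmax(student_logits / temp, dim=-1)
    mean_probs = probs.mean(dim=0)
    # a loss to *minimise*: the negative entropy of the mean prediction
    return (mean_probs * torch.log(mean_probs + 1e-8)).sum()
```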

Concerning the implementation of the prototypes, you can check: https://github.com/CVMI-Lab/SimGCD/blob/1765452663e7c52e7f0bd0759543655d85524b94/model/simgcd.py#L40-L43 It is simply a linear layer without bias, with weight norm applied.
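Paraphrased, the linked lines build a DINO-style last layer roughly as follows (the dimensions here are illustrative stand-ins; see the link for the exact code):

```python
import torch.nn as nn

in_dim, out_dim = 768, 200  # e.g. feature dim and number of classes (illustrative)

# prototypes = rows of a bias-free linear layer with weight normalisation
last_layer = nn.utils.weight_norm(nn.Linear(in_dim, out_dim, bias=False))
last_layer.weight_g.data.fill_(1)          # fix each prototype's magnitude to 1
last_layer.weight_g.requires_grad = False  # only the direction remains trainable
```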

Luffy03 commented 1 year ago

Thanks for your kind answer! It seems that the randomly initialized prototypes are simply randomly initialized parameters, is that correct? And the labels are available only for the known categories, right? However, the prototypes are employed to supervise unlabelled data containing novel categories. In this case, how can we guarantee the accuracy of the pseudo labels?

xwen99 commented 1 year ago

Hi, I suggest getting familiar with the related works in deep clustering and unsupervised semantic segmentation, where novel categories can be discovered with even no labels. Our work adopts similar techniques. Simply put, pseudo-label invariance to random augmentations, avoiding collapse to trivial solutions, and setting a suitable number of clusters are the keys to its success.
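As a concrete illustration of the pseudo-label-invariance idea, here is a minimal sketch of a cross-view self-distillation loss. This is not the repo's exact loss; the function name and temperatures are assumptions:

```python
import torch
import torch.nn.functional as F

def cross_view_distill_loss(student_logits_v1, teacher_logits_v2,
                            student_temp=0.1, teacher_temp=0.04):
    """The teacher's sharpened prediction on one augmented view serves as a
    pseudo label for the student's prediction on another view, so cluster
    assignments are trained to be invariant to random augmentations."""
    teacher_probs = F.softmax(teacher_logits_v2.detach() / teacher_temp, dim=-1)
    student_logp = F.log_softmax(student_logits_v1 / student_temp, dim=-1)
    return -(teacher_probs * student_logp).sum(dim=-1).mean()
```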

Luffy03 commented 1 year ago


Thanks for your kind answer.

kleinzcy commented 1 year ago

According to weight_norm, the direction of each prototype is trainable; only its magnitude is fixed to 1, which ensures the prototypes stay normalized.
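A quick check of this behaviour (a minimal sketch; the dimensions are arbitrary):

```python
import torch
import torch.nn as nn

proto = nn.utils.weight_norm(nn.Linear(8, 4, bias=False))
proto.weight_g.data.fill_(1)          # magnitude fixed to 1
proto.weight_g.requires_grad = False  # weight_v (the direction) stays trainable

_ = proto(torch.randn(2, 8))          # forward recomputes weight = g * v / ||v||
print(proto.weight.norm(dim=1))       # ~tensor([1., 1., 1., 1.]): unit-norm prototypes
```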