Closed YYShi12 closed 4 years ago
Hi, I can't understand why an extra dimension with value 1 is appended to the feature embedding when obtaining the pseudo labels. Would you please explain it? Thanks
This operation is done to stay consistent with the last linear layer, which has bias=True: the appended constant 1 plays the role of the input to the bias term.
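A minimal sketch of the idea, with hypothetical shapes rather than the repo's actual tensors: once a 1 is appended to each feature, the bias column can be concatenated onto the weight matrix, and a single matrix product reproduces `W @ x + b` from a `bias=True` layer.

```python
import torch

torch.manual_seed(0)
all_fea = torch.randn(4, 256)              # hypothetical: 4 samples, 256-dim features
fc = torch.nn.Linear(256, 10, bias=True)   # last layer with bias=True, 10 classes

# Appending a constant 1 to each feature folds the bias into the weights:
# [W | b] @ [x ; 1] == W @ x + b
all_fea_aug = torch.cat((all_fea, torch.ones(all_fea.size(0), 1)), dim=1)  # (4, 257)
W_aug = torch.cat((fc.weight, fc.bias.unsqueeze(1)), dim=1)                # (10, 257)

logits_aug = all_fea_aug @ W_aug.t()       # single matrix product on augmented features
logits_ref = fc(all_fea)                   # standard forward pass with bias
print(torch.allclose(logits_aug, logits_ref, atol=1e-6))  # True
```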
Thanks, I understand that operation now.
But I can't understand another one: all_fea = (all_fea.t() / torch.norm(all_fea, p=2, dim=1)).t(). Why are the feature embeddings divided by their L2 norm?
We use cosine distance to measure similarity. For simplicity, we first L2-normalize the feature vectors.
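A small sketch of what that line does, again with hypothetical shapes: the row-wise division is equivalent to F.normalize, and after it every feature has unit length, so a plain inner product between two features equals their cosine similarity.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
all_fea = torch.randn(4, 256)  # hypothetical feature matrix: 4 samples, 256-dim

# The line from the thread: divide every row by its own L2 norm.
all_fea_norm = (all_fea.t() / torch.norm(all_fea, p=2, dim=1)).t()

# Equivalent, more idiomatic form.
print(torch.allclose(all_fea_norm, F.normalize(all_fea, p=2, dim=1), atol=1e-6))  # True

# With unit-norm rows, a dot product is exactly the cosine similarity,
# so cosine distance can be computed with ordinary inner products.
dot = all_fea_norm[0] @ all_fea_norm[1]
cos = F.cosine_similarity(all_fea[0], all_fea[1], dim=0)
print(torch.allclose(dot, cos, atol=1e-6))  # True
```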
Thanks a lot for your prompt reply.