Closed YYShi12 closed 4 years ago
Hi, I can't understand why an extra dimension with value 1 is appended to the feature embedding when obtaining the pseudo labels. Would you please explain it? Thanks
This operation is done to stay consistent with the last linear layer, which has bias=True: the appended constant 1 plays the role of the input to the bias term.
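A minimal sketch of the idea, with hypothetical shapes rather than the repo's actual tensors: once a 1 is appended to each feature, the bias column can be concatenated onto the weight matrix, and a single matrix product reproduces `W @ x + b` from a `bias=True` layer.

```python
import torch

torch.manual_seed(0)
all_fea = torch.randn(4, 256)              # hypothetical: 4 samples, 256-dim features
fc = torch.nn.Linear(256, 10, bias=True)   # last layer with bias=True, 10 classes

# Appending a constant 1 to each feature folds the bias into the weights:
# [W | b] @ [x ; 1] == W @ x + b
all_fea_aug = torch.cat((all_fea, torch.ones(all_fea.size(0), 1)), dim=1)  # (4, 257)
W_aug = torch.cat((fc.weight, fc.bias.unsqueeze(1)), dim=1)                # (10, 257)

logits_aug = all_fea_aug @ W_aug.t()       # single matrix product on augmented features
logits_ref = fc(all_fea)                   # standard forward pass with bias
print(torch.allclose(logits_aug, logits_ref, atol=1e-6))  # True
```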
Thanks, I understand that operation now.
But I can't understand another one: all_fea = (all_fea.t() / torch.norm(all_fea, p=2, dim=1)).t(). Why are the feature embeddings divided by their L2 norm?
We use cosine distance to measure similarity. For simplicity, we first L2-normalize the feature vectors.
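A small sketch of what that line does, again with hypothetical shapes: the row-wise division is equivalent to F.normalize, and after it every feature has unit length, so a plain inner product between two features equals their cosine similarity.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
all_fea = torch.randn(4, 256)  # hypothetical feature matrix: 4 samples, 256-dim

# The line from the thread: divide every row by its own L2 norm.
all_fea_norm = (all_fea.t() / torch.norm(all_fea, p=2, dim=1)).t()

# Equivalent, more idiomatic form.
print(torch.allclose(all_fea_norm, F.normalize(all_fea, p=2, dim=1), atol=1e-6))  # True

# With unit-norm rows, a dot product is exactly the cosine similarity,
# so cosine distance can be computed with ordinary inner products.
dot = all_fea_norm[0] @ all_fea_norm[1]
cos = F.cosine_similarity(all_fea[0], all_fea[1], dim=0)
print(torch.allclose(dot, cos, atol=1e-6))  # True
```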
Thanks a lot for your prompt reply.