Closed. yyaaa1 closed this issue 3 years ago.
@yyaaa1 For the last classifier [w, b]: if we append a 1 to the features, the augmented features can be compared to [w_k, b_k] via cosine distance, which is the distance adopted in this paper.
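To make the trick above concrete, here is a minimal numpy sketch (shapes and variable names are illustrative, not taken from the repo): appending a constant 1 to each feature and the bias b_k to each weight row w_k folds the affine map into a single inner product, so cosine similarity between the augmented vectors also sees the bias.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 256))    # a batch of 4 features (hypothetical dim 256)
W = rng.normal(size=(10, 256))   # class weight vectors w_k for 10 classes
b = rng.normal(size=(10,))       # biases b_k

# Augment features with a constant 1 and weights with the bias:
# [w_k, b_k] . [x, 1] = w_k . x + b_k, the usual affine output.
x_aug = np.concatenate([x, np.ones((x.shape[0], 1))], axis=1)  # (4, 257)
W_aug = np.concatenate([W, b[:, None]], axis=1)                # (10, 257)

# The augmented inner product reproduces the affine logits exactly.
logits_affine = x @ W.T + b
logits_aug = x_aug @ W_aug.T
assert np.allclose(logits_affine, logits_aug)

def cosine_sim(a, c):
    """Row-wise cosine similarity between two matrices."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    c = c / np.linalg.norm(c, axis=1, keepdims=True)
    return a @ c.T

# Cosine distance between augmented features and [w_k, b_k] now
# accounts for the bias term as well, not just the direction of w_k.
sim = cosine_sim(x_aug, W_aug)   # (4, 10)
```

Without the appended 1, a feature with the same direction but different bias contribution would look identical under cosine distance; the augmentation restores that information.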
@tim-learn
I see that the bias is initialized to 0 in your code; isn't that equivalent to having no bias?
I think the bias would take a non-zero value after learning; maybe you can check its value after training.
Hello, after reading the previous answers to this question, I am still confused about this operation. Why should we explicitly append a 1 as the bias feature? As far as I know, a linear layer with bias enabled does not change the size of the feature, so adding a bias manually seems unnecessary.
Could you please explain this further?
Thanks.