Closed. yyaaa1 closed this issue 3 years ago.
@yyaaa1 For the last classifier [w, b]: if we append a 1 to the features, the augmented features can be compared to [w_k, b_k] via cosine distance, which is the distance adopted in this paper.
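To make the trick above concrete, here is a minimal numpy sketch (shapes and variable names are illustrative, not taken from the repo): appending a constant 1 to each feature and the bias b_k to each weight row w_k folds the affine map into a single inner product, so cosine similarity between the augmented vectors also sees the bias.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 256))    # a batch of 4 features (hypothetical dim 256)
W = rng.normal(size=(10, 256))   # class weight vectors w_k for 10 classes
b = rng.normal(size=(10,))       # biases b_k

# Augment features with a constant 1 and weights with the bias:
# [w_k, b_k] . [x, 1] = w_k . x + b_k, the usual affine output.
x_aug = np.concatenate([x, np.ones((x.shape[0], 1))], axis=1)  # (4, 257)
W_aug = np.concatenate([W, b[:, None]], axis=1)                # (10, 257)

# The augmented inner product reproduces the affine logits exactly.
logits_affine = x @ W.T + b
logits_aug = x_aug @ W_aug.T
assert np.allclose(logits_affine, logits_aug)

def cosine_sim(a, c):
    """Row-wise cosine similarity between two matrices."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    c = c / np.linalg.norm(c, axis=1, keepdims=True)
    return a @ c.T

# Cosine distance between augmented features and [w_k, b_k] now
# accounts for the bias term as well, not just the direction of w_k.
sim = cosine_sim(x_aug, W_aug)   # (4, 10)
```

Without the appended 1, a feature with the same direction but different bias contribution would look identical under cosine distance; the augmentation restores that information.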
@tim-learn
I see that the bias is initialized to 0 in your code; isn't that equivalent to having no bias?
I think the bias would take a non-zero value after learning; maybe you can check its value after training.
Hello, after reading the previous answers to this question, I am still confused about this operation. Why should we explicitly append a 1 as the bias feature? As far as I know, a linear layer with bias enabled does not change the size of the feature, so adding a bias manually seems unnecessary.
Could you please explain this further?
Thanks.