In the Additional Note (1) section, it says: "If all the weights are initialized to 0, only the scale of the weight vector, not the direction."
It seems that part of that sentence is missing. Could you correct it, please? Thank you very much!
Thanks for the note. It should say "If all the weights are initialized to 0, the learning rate parameter eta affects only the scale of the weight vector but not its direction." Just fixed it.
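For anyone who wants to check the corrected sentence numerically, here is a minimal sketch, assuming the classic perceptron rule with a sign threshold and the update eta * (target - prediction) * x. The function name `train_perceptron` and the toy data are made up for illustration and are not taken from the note. The idea is that with zero initial weights every update is eta times a term that depends only on the sign of the net input, so the whole weight vector is just eta times the vector you would get with eta = 1.

```python
def train_perceptron(X, y, eta, n_epochs=10):
    """Perceptron rule with all weights (including the bias) initialized to 0.

    Hypothetical helper for illustration; labels are assumed to be +1 / -1.
    """
    w = [0.0] * (len(X[0]) + 1)  # w[0] is the bias weight
    for _ in range(n_epochs):
        for xi, target in zip(X, y):
            net = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            pred = 1 if net >= 0.0 else -1
            update = eta * (target - pred)  # zero when the prediction is right
            w[0] += update
            for j, xj in enumerate(xi):
                w[j + 1] += update * xj
    return w


# Made-up, linearly separable toy data.
X = [[2.0, 1.0], [3.0, 4.0], [-1.0, -2.0], [-3.0, -1.0]]
y = [1, 1, -1, -1]

w_big = train_perceptron(X, y, eta=1.0)
w_small = train_perceptron(X, y, eta=0.25)

# Same direction, different scale: w_small is w_big multiplied by 0.25.
print(w_big)
print(w_small)
print(w_small == [0.25 * v for v in w_big])
```

The learning rates here are powers of two so that the scaling is exact in floating point. With other values, such as 0.1, the two weight vectors should still be proportional up to rounding error. If the initial weights are not all zero, this proportionality no longer holds in general, and eta can change the direction of the weight vector as well.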