alex-wy closed this issue 4 years ago
The purpose of the factor is to bring the network's output range closer to the range of the training data, so that training does not have to compensate for the difference in scale. It is an artifact of how we set up the networks. Alternatively, we could change the initialization of the network weights and get rid of the factor. Hope this helps!
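To make the "change the initialization instead" option concrete, here is a minimal sketch. It is not the code from the repository: only the factor 128 comes from this thread, and the layer sizes, the Gaussian initialization, and the helper names `init_weights` and `linear` are assumptions for illustration. It shows that, for a linear last layer, dividing the output by 128 gives the same values at initialization as shrinking the initial weights by 1/128.

```python
import math
import random

SCALE = 128.0  # the factor asked about; everything else below is hypothetical


def init_weights(seed, n_in, n_out, weight_scale=1.0):
    """Gaussian init for a toy last layer (assumed, not the repo's scheme)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) * weight_scale for _ in range(n_out)]
            for _ in range(n_in)]


def linear(h, w):
    """Bias-free linear layer: out[j] = sum_i h[i] * w[i][j]."""
    return [sum(h[i] * w[i][j] for i in range(len(h)))
            for j in range(len(w[0]))]


h = [0.5, -1.0, 2.0, 0.25]  # made-up features entering the last layer

# Option A: default init, then divide the network output by the factor.
w_default = init_weights(seed=0, n_in=4, n_out=3)
out_divided = [y / SCALE for y in linear(h, w_default)]

# Option B: no division, but the same init scaled down by 1/SCALE.
w_small = init_weights(seed=0, n_in=4, n_out=3, weight_scale=1.0 / SCALE)
out_rescaled = linear(h, w_small)

same_at_init = all(math.isclose(a, b, rel_tol=1e-12, abs_tol=1e-15)
                   for a, b in zip(out_divided, out_rescaled))
print(same_at_init)
```

The two options match only at initialization. With the division in place, the gradient of the output with respect to the last-layer weights is also scaled by 1/128, so under the same learning rate the two setups would not train identically.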
I have read your paper. It's really excellent work. But I'm a little confused about why the output should be divided by "128" at the end.
That's the sentence in the paper that mentions it.
Hope you can answer my question. Thanks!