huanmei9 closed this issue 2 years ago
Thanks for your open source work. There is an initialization variable self.g in _lsqqlus_quantizaV1.py:L138, and I'm wondering what it is used for. Looking forward to your reply.

It is used to keep the gradient update rate of self.s and self.b stable: if you drop this parameter, you will find that the updates of self.s and self.b become unstable. self.g keeps their update rate on the same scale as the network's weight updates, so it is necessary for convergence.
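For reference, here is a minimal sketch (not this repository's exact code) of how such a gradient-scale factor typically works in LSQ-style quantizers. It assumes PyTorch; the helper grad_scale, the clip levels Qn/Qp, and the formula g = 1/sqrt(N * Qp) follow the LSQ paper and are illustrative:

```python
import math
import torch

def grad_scale(x: torch.Tensor, scale: float) -> torch.Tensor:
    """Forward pass returns x unchanged; backward pass multiplies x's gradient by `scale`."""
    return (x - x * scale).detach() + x * scale

# Toy stand-ins for a quantizer's parameters (illustrative names):
w = torch.randn(64, 32, requires_grad=True)   # weight tensor being quantized
s = torch.tensor(0.1, requires_grad=True)     # learned step size (like self.s)
Qn, Qp = -128, 127                            # int8 clip levels

# The gradient-scale factor from the LSQ paper:
# g = 1 / sqrt(N * Qp), where N is the number of weights.
g = 1.0 / math.sqrt(w.numel() * Qp)

s_scaled = grad_scale(s, g)                   # same value as s, but damped gradient
w_q = torch.clamp(w / s_scaled, Qn, Qp)       # scale and clip to the integer range
w_q = (w_q.round() - w_q).detach() + w_q      # straight-through estimator for round()
w_hat = w_q * s_scaled                        # dequantized weights used downstream

w_hat.sum().backward()
print(s.grad)                                 # gradient already scaled by g
```

The detach trick in grad_scale leaves the forward value of the step size untouched while multiplying its gradient by g, which is what keeps its updates on the same scale as the weight updates; in LSQ+ the same scaling would also be applied to an offset parameter like self.b.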