I'm wondering whether there is a mistake in this code. This function computes the initial value of Scalebeta for the activation_quantizer. Why does it take the weight's gradient and use that to update the activation_quantizer's parameters? Would it be more appropriate to change it to weight = child.input ?
Looking forward to your reply.