Open lxy21319 opened 3 years ago
Does this mean de-quantization?
Hi,
Thanks for your question. We define alpha as the clipping threshold for the quantization, not the step size between two adjacent quantization levels. So we expect the tensor divided by alpha to lie in [0, 1], and then scale it to [0, 2**b - 1] so that round() can compute the corresponding integers. After rounding-to-nearest, we scale the integers back to the original range, which is the de-quantization you mentioned.
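The steps above (divide by the clipping threshold alpha, scale to [0, 2**b - 1], round, then scale back) can be sketched in plain Python for a single scalar. This is only an illustration of the pipeline, not the repo's actual torch implementation; the function name and signature are made up here:

```python
def fake_quantize(x, alpha=1.0, b=3):
    """Illustrative scalar sketch: clip x/alpha to [0, 1], scale to
    [0, 2**b - 1], round to the nearest integer level, then de-quantize."""
    levels = 2 ** b - 1                      # number of steps, e.g. 7 for b = 3
    ratio = min(max(x / alpha, 0.0), 1.0)    # divide by clipping threshold alpha
    q = round(ratio * levels)                # integer level in {0, ..., 2**b - 1}
    return q / levels * alpha                # de-quantize back to original range

print(fake_quantize(0.4))   # snaps to 3/7 ~ 0.4286
print(fake_quantize(2.0))   # clipped at alpha, returns 1.0
```

Anything above alpha is clipped to alpha itself, which is why alpha acts as a threshold rather than a step size.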
Hello, here are the two functions for uniform quantization from the CIFAR-10 and ImageNet code:
```python
def uniform_quant(x, b=3):
    xdiv = x.mul(2 ** b - 1)
    xhard = xdiv.round().div(2 ** b - 1)
    return xhard


def uniform_quantization(tensor, alpha, bit, is_weight=True, grad_scale=None):
    if grad_scale:
        alpha = gradient_scale(alpha, grad_scale)
    data = tensor / alpha
    if is_weight:
        data = data.clamp(-1, 1)
        data = data * (2 ** (bit - 1) - 1)
        data_q = (data.round() - data).detach() + data
        data_q = data_q / (2 ** (bit - 1) - 1) * alpha
    else:
        data = data.clamp(0, 1)
        data = data * (2 ** bit - 1)
        data_q = (data.round() - data).detach() + data
        data_q = data_q / (2 ** (bit - 1) - 1) * alpha
    return data_q
```
I want to know why the input first needs to be multiplied by (2 ** b - 1). If we multiply by (2 ** b - 1) and then divide by (2 ** b - 1), isn't the number unchanged? Thanks!
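(For what it's worth, the multiply/divide pair is not an identity because round() sits between the two operations. A quick standalone check in plain Python, mirroring uniform_quant on a scalar rather than a torch tensor:)

```python
def uniform_quant_scalar(x, b=3):
    """Scalar version of uniform_quant: scale up by (2**b - 1),
    round to the nearest integer, then scale back down."""
    n = 2 ** b - 1           # 7 for b = 3
    return round(x * n) / n  # the round() in between snaps x onto a uniform grid

print(uniform_quant_scalar(0.40))  # 0.4 * 7 = 2.8 rounds to 3, giving 3/7, not 0.4
```

Without round() the result would indeed equal the input; with it, x is snapped to the nearest of the 2**b representable levels.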