yhhhli / APoT_Quantization

PyTorch implementation for the APoT quantization (ICLR 2020)

about uniform quantization #15

Open lxy21319 opened 3 years ago

lxy21319 commented 3 years ago

Hello, here are the two uniform quantization functions from CIFAR10 and ImageNet:

    def uniform_quant(x, b=3):
        xdiv = x.mul(2 ** b - 1)
        xhard = xdiv.round().div(2 ** b - 1)
        return xhard

    def uniform_quantization(tensor, alpha, bit, is_weight=True, grad_scale=None):
        if grad_scale:
            alpha = gradient_scale(alpha, grad_scale)
        data = tensor / alpha
        if is_weight:
            data = data.clamp(-1, 1)
            data = data * (2 ** (bit - 1) - 1)
            data_q = (data.round() - data).detach() + data
            data_q = data_q / (2 ** (bit - 1) - 1) * alpha
        else:
            data = data.clamp(0, 1)
            data = data * (2 ** bit - 1)
            data_q = (data.round() - data).detach() + data
            data_q = data_q / (2 ** (bit - 1) - 1) * alpha
        return data_q

I want to know why the input first needs to be multiplied by (2 ** b - 1). If we multiply by (2 ** b - 1) and then divide by (2 ** b - 1), isn't the number unchanged? Thanks.

lxy21319 commented 3 years ago

Does it mean de-quantization?

yhhhli commented 3 years ago

Hi,

Thanks for your question. We define alpha as the clipping threshold for the quantization, not the step size between two adjacent quantization levels. So we expect the tensor divided by alpha to lie in [0, 1], and then we scale it to [0, 2**b - 1] so that the round() function can compute the corresponding integers. After rounding-to-nearest, we scale the integers back to the original range, which is the de-quantization you mentioned.
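
To answer the second part of the question directly: the number does change, because round() happens between the multiply and the divide. Here is a minimal sketch of the round trip, assuming a b-bit unsigned quantizer; the quantize_dequantize helper and the sample values are only for illustration, not code from this repository:

    import torch

    def quantize_dequantize(x, alpha, b=3):
        # alpha is the clipping threshold, not the step size
        data = (x / alpha).clamp(0, 1)          # normalize into [0, 1]
        scaled = data * (2 ** b - 1)            # scale to [0, 2**b - 1]
        levels = scaled.round()                 # quantize: pick the nearest integer level
        return levels / (2 ** b - 1) * alpha    # de-quantize back to the original range

    x = torch.tensor([0.10, 0.37, 0.80])
    print(quantize_dequantize(x, alpha=1.0, b=3))
    # roughly tensor([0.1429, 0.4286, 0.8571]) -- each value is snapped
    # to the nearest of the 2**3 quantization levels, so it is not unchanged

Without the round() call in between, multiplying and then dividing by (2 ** b - 1) would indeed cancel out; the rounding is what produces the discrete quantization levels.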