Closed: xuyaojian123 closed this issue 10 months ago.
Sorry for the late reply. The function quantizes inputs based on the max and min of inputs and the target bit precision.
> The function quantizes inputs based on the max and min
Thanks so much for your reply, I still don't understand this function.
a = tensor([0.9372, 2.3312, 3.2122, 0.5491, 5.2983, 1.5295, 0.8926, 2.7871, 3.0447,4.5377])
b = tensor([0.9178, 2.3363, 3.2123, 0.5423, 5.2983, 1.5436, 0.8761, 2.7952, 3.0455,4.5473])
So what is the result `b` returned by the `min_max_quantize` function for? I don't know why you turned `a` into `b`. Can you explain it in detail? Sorry, I have very little knowledge of quantization.
Looking forward to your reply
Sorry for being late. You can think of the function as follows: First, the function finds the min and max of the inputs. Given the number of bits, in your example eight, the function finds the equally spaced 2**n_bit numbers between the min and max values. Since you use 8 as the target number of bits, the function will use 256 numbers. These can be deemed quantized numbers. And then, for each input number, the function outputs the nearest quantized number.
As this is only one of the quantization methods, you can easily try other quantization methods. I hope my explanation helps. If you have further questions, feel free to ask.
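The steps described above can be sketched in plain Python. This is a hypothetical re-implementation based on the description (equally spaced levels between min and max, round to the nearest level), not the repository's exact code:

```python
def min_max_quantize(values, n_bits):
    """Snap each value to the nearest of 2**n_bits equally spaced
    levels between min(values) and max(values).
    Hypothetical sketch based on the explanation above."""
    lo, hi = min(values), max(values)
    # Spacing between adjacent quantization levels.
    step = (hi - lo) / (2 ** n_bits - 1)
    # For each value: nearest level index, then map back to a float.
    return [lo + round((v - lo) / step) * step for v in values]

a = [0.9372, 2.3312, 3.2122, 0.5491, 5.2983,
     1.5295, 0.8926, 2.7871, 3.0447, 4.5377]
b = min_max_quantize(a, n_bits=8)
# Each entry of b lies at most half a step away from the matching
# entry of a, and the min and max of a are preserved exactly.
```

With `n_bits=8` there are 256 possible output values, so the quantization error per element is bounded by half the level spacing.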
Thanks for your reply. The purpose of quantization is to reduce model size, but after converting variable `a` into variable `b`, both are still of type `torch.float32`. Where is the reduction in model size reflected?
Yes. It does return 32-bit floating points. But since they are quantized, which means there are only 2**n_bit numbers possible, they can be encoded into n_bit representations. That is, even though the code uses floating points, we only need n_bits for each parameter for representation, as long as we know the min and max values and the number of bits. In addition, using the min and max values and the number of bits, you can always restore n_bit representations back to 32-bit floating points.
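That idea can be sketched as follows; the helper names `encode`/`decode` are hypothetical and assume the same min-max scheme described earlier. The integer codes are what you would actually store, and the `(min, max)` pair is enough to restore 32-bit floats:

```python
def encode(values, n_bits):
    """Encode floats as n-bit integer codes plus the (min, max) pair.
    Hypothetical helper illustrating the reply above."""
    lo, hi = min(values), max(values)
    levels = 2 ** n_bits - 1
    codes = [round((v - lo) / (hi - lo) * levels) for v in values]
    return codes, lo, hi

def decode(codes, lo, hi, n_bits):
    """Restore floating-point values from the n-bit codes and (min, max)."""
    levels = 2 ** n_bits - 1
    return [lo + c * (hi - lo) / levels for c in codes]

a = [0.9372, 2.3312, 3.2122, 0.5491, 5.2983,
     1.5295, 0.8926, 2.7871, 3.0447, 4.5377]
codes, lo, hi = encode(a, n_bits=8)
restored = decode(codes, lo, hi, n_bits=8)
# Every code fits in 8 bits (0..255) instead of 32 bits per float,
# so storage shrinks roughly 4x, apart from the two stored floats.
```

Decoding cannot recover `a` exactly, only the nearest quantization level of each element; that rounding error is the price paid for the smaller representation.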
Thanks, I understand.
Hello, I am back again. What does the `min_max_quantize` function do?