microsoft / nni

An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
https://nni.readthedocs.io
MIT License

Why not use old_bias to replace the bias in class QuantizerModuleWrapper? #3517

Closed. Lycan1003 closed this issue 3 years ago.

Lycan1003 commented 3 years ago

compressor.py line 467

class QuantizerModuleWrapper(torch.nn.Module):
    def __init__(self, module, module_name, module_type, config, quantizer):
        # ...
        if 'weight' in config['quant_types']:
            if not _check_weight(self.module):
                _logger.warning('Module %s does not have parameter "weight"', self.name)
            else:
                # keep the fp32 weight as a trainable parameter under 'old_weight' ...
                self.module.register_parameter('old_weight', torch.nn.Parameter(self.module.weight))
                delattr(self.module, 'weight')
                # ... and re-register 'weight' as a buffer backed by old_weight
                self.module.register_buffer('weight', self.module.old_weight)

In this class, 'old_weight' is registered as the parameter and 'weight' as a buffer. I would like to know why the same is not done for the bias. In quantizer.py:

        # fake-quantize the bias in place: compute a 32-bit scale/zero_point, quantize, then dequantize
        if hasattr(wrapper.module, 'bias') and wrapper.module.bias is not None:
            bias = wrapper.module.bias.data
            bias_bits = 32
            rmin, rmax = torch.min(bias), torch.max(bias)
            module.scale, module.zero_point = update_quantization_param(bias_bits, rmin, rmax)
            bias = self._quantize(bias_bits, module, bias)
            bias = self._dequantize(module, bias)
            wrapper.module.bias.data = bias

It seems the bias is quantized and dequantized directly in place, without keeping a full-precision 'old_bias' counterpart.
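
For illustration only, here is a minimal standalone sketch (not NNI code) of what applying the same treatment to the bias could look like on a plain torch.nn.Linear layer; the name 'old_bias' is hypothetical, mirroring 'old_weight' above:

    import torch

    layer = torch.nn.Linear(4, 2)
    # keep a trainable fp32 copy of the bias under a new name ...
    layer.register_parameter('old_bias', torch.nn.Parameter(layer.bias))
    delattr(layer, 'bias')
    # ... and re-expose 'bias' as a buffer that a quantizer could overwrite on each forward pass
    layer.register_buffer('bias', layer.old_bias)

    print([n for n, _ in layer.named_parameters()])  # ['weight', 'old_bias']
    print([n for n, _ in layer.named_buffers()])     # ['bias']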

linbinskn commented 3 years ago

The reason we keep old_weight as a parameter is that we want to update the weight in high precision (fp32), which preserves accuracy and speeds up training convergence. Of course, we could handle the bias the same way. However, we think int32 precision is accurate enough for the bias, and updating the bias at that precision will not hurt accuracy or the training process.
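
As a rough illustration of why int32 is considered enough for the bias, the following is a generic affine quantize/dequantize sketch (the helper fake_quantize is hypothetical and not NNI's update_quantization_param / _quantize / _dequantize): the 32-bit roundtrip error on a typical bias tensor is negligible, while an 8-bit roundtrip is visibly lossy.

    import torch

    def fake_quantize(x, bits):
        # generic affine quantize -> dequantize roundtrip (a sketch, not NNI's exact scheme)
        rmin, rmax = x.min(), x.max()
        qmin, qmax = 0, (1 << bits) - 1
        scale = (rmax - rmin) / (qmax - qmin)
        zero_point = qmin - torch.round(rmin / scale)
        q = torch.clamp(torch.round(x / scale + zero_point), qmin, qmax)
        return (q - zero_point) * scale

    bias = torch.randn(64, dtype=torch.float64)
    for bits in (8, 32):
        err = (bias - fake_quantize(bias, bits)).abs().max().item()
        print(bits, err)  # the 32-bit error is many orders of magnitude smaller than the 8-bit one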

Lycan1003 commented 3 years ago


Thanks for your reply!