frankgt commented 6 months ago

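ppq is a great framework. It systematically covers every aspect of deploying quantized models and is well worth studying. I tried it and the quantization results are indeed good. One question, though: can ppq currently support quantization where weights and activations use different bit widths? For example a16w8, i.e. 16-bit activations and 8-bit weights. From a first look at the relevant code (ppq/executor/torch.py, L: 515), weights and activations currently appear to be processed together, without any distinction.

```python
# if operation is an QuantableOperation, we have to quant its inputs and outputs at first.
if isinstance(operation, QuantableOperation):
    input_configs = [_ for _ in operation.config.input_quantization_config]
    inputs = [self.quantize_function(input, config) for input, config in zip(inputs, input_configs)]
```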

ZhangZhiPku commented 4 months ago

Yes, this is possible.
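
For illustration only, here is a minimal self-contained sketch of why the loop quoted in the question does not by itself rule out mixed bit widths: each input is paired with its own config, so the activation config and the weight config can carry different bit widths. This is plain Python with no PPQ imports. `TensorConfig` and `fake_quantize` are hypothetical stand-ins invented for this example, not PPQ's API, and the scales are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class TensorConfig:
    """Hypothetical stand-in for a per-tensor quantization config."""
    num_of_bits: int
    scale: float


def fake_quantize(values, config):
    """Symmetric linear fake quantization: round to the grid, clamp, rescale."""
    qmax = 2 ** (config.num_of_bits - 1) - 1
    qmin = -qmax - 1
    out = []
    for v in values:
        q = round(v / config.scale)
        q = max(qmin, min(qmax, q))  # clamp to the signed integer range
        out.append(q * config.scale)
    return out


# a16w8: the activation config uses 16 bits, the weight config uses 8 bits.
activation_config = TensorConfig(num_of_bits=16, scale=0.001)
weight_config = TensorConfig(num_of_bits=8, scale=0.01)

inputs = [[0.1234, 2.0], [0.1234, 2.0]]  # [activation, weight]
input_configs = [activation_config, weight_config]

# Same shape as the quoted loop: every input is quantized with its own config.
quantized = [fake_quantize(x, cfg) for x, cfg in zip(inputs, input_configs)]
```

With these assumed scales, the value 2.0 saturates at the top of the 8-bit weight grid but fits comfortably in the 16-bit activation grid. In PPQ itself the equivalent step would presumably be to set the bit width and value range on each tensor's config separately; check the PPQ source for the actual field names.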

wangguoqing129 commented 3 months ago

Is there any example code for quantizing to different bit widths? @ZhangZhiPku

OpenPPL / ppq

Can ppq support quantization with different bit widths for weights and activations? #559
