alibaba / TinyNeuralNetwork

TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
MIT License
750 stars 115 forks

Quantization in qnnpack failed to create QNNPACK Average Pooling operator #46

Closed okt-wang closed 2 years ago

okt-wang commented 2 years ago

Hi developers, thank you for the great work. I want to use QATQuantizer to quantize my model, but during conversion an error appears:

RuntimeError: [enforce fail at q_avgpool.cpp:369] createStatus == pytorch_qnnp_status_success. failed to create QNNPACK Average Pooling operator

I'm using PyTorch v1.10. Is this a QNNPACK issue? I tried fbgemm and it works. Thank you!

Below is the whole call stack:

     18     converter = TFLiteConverter(qat_model, dummy_input, tflite_path='tflite_model/qat_model.tflite')
---> 19     converter.convert()

/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+6bdac3b8f7d9fd53d7ef4c07920414f46d6e2e62-py3.6.egg/tinynn/converter/base.py in convert(self)
    332         """
    333         self.init_input_transpose()
--> 334         self.init_jit_graph()
    335         self.init_lowered_module()
    336         self.init_common_graph()

/usr/local/lib/python3.6/dist-packages/TinyNeuralNetwork-0.1.0+6bdac3b8f7d9fd53d7ef4c07920414f46d6e2e62-py3.6.egg/tinynn/converter/base.py in init_jit_graph(self)
    120 
    121             with torch.no_grad():
--> 122                 script = torch.jit.trace(self.model, self.dummy_input)
    123 
    124                 # Remove reference to original model to save memory

/usr/local/lib/python3.6/dist-packages/torch/jit/_trace.py in trace(func, example_inputs, optimize, check_trace, check_inputs, check_tolerance, strict, _force_outplace, _module_class, _compilation_unit)
    748             strict,
    749             _force_outplace,
--> 750             _module_class,
    751         )
    752 

/usr/local/lib/python3.6/dist-packages/torch/jit/_trace.py in trace_module(mod, inputs, optimize, check_trace, check_inputs, check_tolerance, strict, _force_outplace, _module_class, _compilation_unit)
    963                 strict,
    964                 _force_outplace,
--> 965                 argument_names,
    966             )
    967             check_trace_method = module._c._get_method(method_name)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
   1088                 recording_scopes = False
   1089         try:
-> 1090             result = self.forward(*input, **kwargs)
   1091         finally:
   1092             if recording_scopes:

/home/okt/aidea-semantic/semantic-segmentation/out/ddrnet_qat.py in forward(self, input_1)
    376         spp_process3_1 = self.spp_process3_1(spp_process3_0)
    377         spp_process3_2 = self.spp_process3_2(spp_process3_1)
--> 378         spp_scale4_0 = self.spp_scale4_0(add_18)
    379         spp_scale4_1 = self.spp_scale4_1(spp_scale4_0)
    380         spp_scale4_2 = self.spp_scale4_2(spp_scale4_1)

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1100         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1101                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1102             return forward_call(*input, **kwargs)
   1103         # Do not call functions when jit is used
   1104         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py in _slow_forward(self, *input, **kwargs)
   1088                 recording_scopes = False
   1089         try:
-> 1090             result = self.forward(*input, **kwargs)
   1091         finally:
   1092             if recording_scopes:

/usr/local/lib/python3.6/dist-packages/torch/nn/modules/pooling.py in forward(self, input)
    615     def forward(self, input: Tensor) -> Tensor:
    616         return F.avg_pool2d(input, self.kernel_size, self.stride,
--> 617                             self.padding, self.ceil_mode, self.count_include_pad, self.divisor_override)
    618 
    619 

RuntimeError: [enforce fail at q_avgpool.cpp:369] createStatus == pytorch_qnnp_status_success. failed to create QNNPACK Average Pooling operator
peterjc123 commented 2 years ago

Would you please share the module definition of the object self.spp_scale4_0? It would also help to know its input shape.

okt-wang commented 2 years ago

It is torch's average pooling layer, and my model input is dummy_input_0 = torch.ones((1, 3, 720, 1280), dtype=torch.float32)

self.spp_scale4_0 = torch.nn.AvgPool2d(kernel_size=1, stride=1, padding=0)
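For context: an AvgPool2d with kernel_size=1 and stride=1 averages exactly one element per output position, so in float mode it is an identity mapping, which is why QNNPACK rejects it as meaningless. A quick sanity check in plain PyTorch (illustration only, not from the thread):

```python
import torch

# A 1x1 average pool with stride 1 averages a single element per output,
# so it returns the input unchanged.
pool = torch.nn.AvgPool2d(kernel_size=1, stride=1, padding=0)
x = torch.randn(1, 3, 8, 8)
assert torch.equal(pool(x), x)
```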
peterjc123 commented 2 years ago

@Ouskit Actually, I need the input shape of the tensor (add_18) that is fed into this AvgPool2d layer.

peterjc123 commented 2 years ago

BTW, here is the code for the checks while performing avg_pool2d in QNNPACK.

peterjc123 commented 2 years ago

@Ouskit I get the message Error in QNNPACK: failed to create average pooling with 1 pooling element: 1x1 pooling is meaningless when running the following code. So it seems you can just comment that layer out in ddrnet_qat.py and pass config={'force_overwrite': False} to the quantizer.

import torch
from tinynn.converter import TFLiteConverter
from tinynn.graph.quantization.quantizer import QATQuantizer

class Model(torch.nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.spp_scale4_0 = torch.nn.AvgPool2d(kernel_size=1, stride=1, padding=0)

    def forward(self, x):
        return self.spp_scale4_0(x)

def main():
    model = Model()
    model.eval()
    dummy_input = torch.ones(1, 3, 720, 1280)

    quantizer = QATQuantizer(model, dummy_input, work_dir='out')
    qat_model = quantizer.quantize()

    qat_model(dummy_input)

    with torch.no_grad():
        qat_model.eval()
        qat_model.cpu()

        qat_model = torch.quantization.convert(qat_model)

        torch.backends.quantized.engine = quantizer.backend

        converter = TFLiteConverter(qat_model, dummy_input, tflite_path='out/qat_model.tflite')
        converter.convert()

if __name__ == '__main__':
    main()
peterjc123 commented 2 years ago

@Ouskit With https://github.com/alibaba/TinyNeuralNetwork/commit/7d2a2098cf56cd06df8362b76f3e28cd622f8eaf, pooling nodes with kernel_size=1 are now rewritten to slices automatically. Please give it a try.
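To illustrate why that rewrite is valid (a sketch of the idea, not the converter's actual code): a pool with kernel_size=1 and stride s averages one element at each strided position, so it is equivalent to strided slicing, which quantized backends handle without issue:

```python
import torch

x = torch.randn(1, 3, 9, 9)

# kernel_size=1 with stride 2 averages a single element at each strided
# position, so the op reduces to the slice x[:, :, ::2, ::2]
pool = torch.nn.AvgPool2d(kernel_size=1, stride=2, padding=0)
assert torch.equal(pool(x), x[:, :, ::2, ::2])
```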

okt-wang commented 2 years ago

> @Ouskit With 7d2a209, pooling nodes with kernel_size=1 are now rewritten to slices automatically. Please give it a try.

This works perfectly! Thanks! ^^