google-ai-edge / ai-edge-torch

Supporting PyTorch models with the Google AI Edge TFLite runtime.
Apache License 2.0

Failed to prepare model for QAT in Pt2e with quanconfig from ai_edge_torch #355

Open gurkirt opened 1 day ago

gurkirt commented 1 day ago

Description of the bug:

I have tried the following quantization configs.

example_inputs = (torch.randn(1, 1, self.args.img_height, self.args.img_width).to(self.args.device),)
# self.model.eval()

# capture_pre_autograd_graph comes from torch._export
self.model = capture_pre_autograd_graph(self.model, example_inputs)
if self.args.quantize_config_type == 'xnn':
    from torch.ao.quantization.quantizer.xnnpack_quantizer import XNNPACKQuantizer, get_symmetric_quantization_config
    self.quantizer = XNNPACKQuantizer()
    self.quantizer.set_global(get_symmetric_quantization_config(is_qat=is_qat, is_per_channel=True))
elif self.args.quantize_config_type == 'x86':
    from torch.ao.quantization.quantizer.x86_inductor_quantizer import X86InductorQuantizer, get_default_x86_inductor_quantization_config
    self.quantizer = X86InductorQuantizer()
    self.quantizer.set_global(get_default_x86_inductor_quantization_config(is_qat=is_qat, is_dynamic=False))
elif self.args.quantize_config_type == 'eat':
    from ai_edge_torch.quantize.pt2e_quantizer import get_symmetric_quantization_config
    from ai_edge_torch.quantize.pt2e_quantizer import PT2EQuantizer
    self.quantizer = PT2EQuantizer().set_global(
        get_symmetric_quantization_config(is_per_channel=True, is_dynamic=False, is_qat=is_qat))

with the prepare step being:

self.model = prepare_qat_pt2e(self.model, self.quantizer)

However, in the third case, which uses the quantization config from ai_edge_torch, prepare_qat_pt2e fails, while it works with the other configs.

This is a problem because I want to convert the QAT model to TFLite, and the TFLite conversion only works with quantize_config_type == 'eat', not the others.

Actual vs expected behavior:

In the case of prepare_qat_pt2e with the 'eat' config, I get the following error:

self.model = prepare_qat_pt2e(self.model, self.quantizer)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xovis/miniconda3/envs/pymmlib2.4/lib/python3.11/site-packages/torch/ao/quantization/quantize_pt2e.py", line 175, in prepare_qat_pt2e
  quantizer.annotate(model)
File "/home/xovis/miniconda3/envs/pymmlib2.4/lib/python3.11/site-packages/ai_edge_torch/quantize/pt2e_quantizer.py", line 356, in annotate
  model = self._annotate_for_static_quantization_config(model)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xovis/miniconda3/envs/pymmlib2.4/lib/python3.11/site-packages/ai_edge_torch/quantize/pt2e_quantizer.py", line 404, in _annotate_for_static_quantization_config
  self._annotate_all_static_patterns(
File "/home/xovis/miniconda3/envs/pymmlib2.4/lib/python3.11/site-packages/ai_edge_torch/quantize/pt2e_quantizer.py", line 371, in _annotate_all_static_patterns
  OP_TO_ANNOTATOR[op](model, quantization_config, filter_fn)
File "/home/xovis/miniconda3/envs/pymmlib2.4/lib/python3.11/site-packages/ai_edge_torch/quantize/pt2e_quantizer_utils.py", line 433, in _annotate_conv_bn_relu
  return _do_annotate_conv_bn(gm, quantization_config, filter_fn, has_relu=True)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xovis/miniconda3/envs/pymmlib2.4/lib/python3.11/site-packages/ai_edge_torch/quantize/pt2e_quantizer_utils.py", line 489, in _do_annotate_conv_bn
  pattern = _get_aten_graph_module_for_pattern(pattern, example_inputs, is_cuda)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xovis/miniconda3/envs/pymmlib2.4/lib/python3.11/site-packages/torch/ao/quantization/pt2e/utils.py", line 317, in _get_aten_graph_module_for_pattern
  aten_pattern = capture_pre_autograd_graph(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xovis/miniconda3/envs/pymmlib2.4/lib/python3.11/site-packages/torch/_export/__init__.py", line 147, in capture_pre_autograd_graph
  assert isinstance(f, torch.nn.Module), "Expected an nn.Module instance."
AssertionError: Expected an nn.Module instance.

In the case of the other configs, I get the following error when doing the conversion:

 File "/huge_fast_workdisk/datadisk/deep3share/source/gurkirt_codepad/pymllib/lib/core/engine.py", line 505, in to_tflite
  model = ai_edge_torch.convert(self.model, example_inputs, quant_config=QuantConfig(pt2e_quantizer=self.quantizer))
                                                                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xovis/miniconda3/envs/pymmlib2.4/lib/python3.11/site-packages/ai_edge_torch/quantize/quant_config.py", line 72, in __init__
  if pt2e_quantizer.global_config.is_dynamic
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'QuantizationConfig' object has no attribute 'is_dynamic'

Any other information you'd like to share?

The exact same thing happened with the following two environments:

  1. pytorch and torch_xla == 2.4 and ai_edge 0.2
  2. pytorch and torch_xla == 2.5.1 and ai_edge 0.2
pkgoogle commented 1 day ago

Hi @gurkirt, what model are you using? i.e. How do you define self.model prior to your code snippet?

gurkirt commented 15 hours ago

I use a YOLO-style model; its class is defined as follows:

class YoloCNN(BaseCNN): ## BaseCNN is an nn.Module subclass
    def __init__(self, args):
        super(YoloCNN, self).__init__(args)
        self.transform_perms = None
        model, save = parse_model(args) ## function on YAML config file
        self.save = save
        logger.info(f"YOLO model SAVE STATE ARE {save}")
        self.model = model ## model is an instance of nn.Sequential()

    def forward(self, x):
        y, dt = [], []  # outputs
        # pdb.set_trace()
        for m in self.model:
            if m.f != -1:  # if not from previous layer
                x = y[m.f] if isinstance(m.f, int) else [x if j == -1 else y[j] for j in m.f]  # from earlier layers
            # if profile:
            #     self._profile_one_layer(m, x, dt)
            x = m(x)  # run
            y.append(x if m.i in self.save else None)  # save output
            # if visualize:
            #     feature_visualization(x, m.type, m.i, save_dir=visualize)
        return x

And the build YAML contains the following content:

# Ultralytics YOLO 🚀, AGPL-3.0 license
# YOLOv8-pose-p6 keypoints/pose estimation model. For Usage examples see https://docs.ultralytics.com/tasks/pose

# Parameters
nc: 5  # number of classes
ch: 1
# kpt_shape: [17, 3]  # number of keypoints, number of dims (2 for x,y or 3 for x,y,visible)
scales: # model compound scaling constants, i.e. 'model=yolov8n-p6.yaml' will call yolov8-p6.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]
  s: [0.33, 0.50, 1024]
  m: [0.67, 0.75, 768]
  l: [1.00, 1.00, 512]
  x: [1.00, 1.25, 512]

# YOLOv8.0x6 backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]]  # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]]  # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]]  # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, Conv, [512, 3, 2]]  # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, Conv, [768, 3, 2]]  # 7-P5/32
  - [-1, 3, C2f, [768, True]]
  - [-1, 1, Conv, [1024, 3, 2]]  # 9-P6/64
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]]  # 11

# YOLOv8.0x6 head
head:
  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 8], 1, Concat, [1]]  # cat backbone P5
  - [-1, 3, C2, [768, False]]  # 14

  - [-1, 1, nn.Upsample, [None, 2, 'nearest']]
  - [[-1, 6], 1, Concat, [1]]  # cat backbone P4
  - [-1, 3, C2, [512, False]]  # 17 (P4/16-small)

  - [-1, 1, Conv, [512, 3, 2]]
  - [[-1, 14], 1, Concat, [1]]  # cat head P5
  - [-1, 3, C2, [768, False]]  # 20 (P5/32-normal)

  - [-1, 1, Conv, [768, 3, 2]]
  - [[-1, 11], 1, Concat, [1]]  # cat head P6
  - [-1, 3, C2, [1024, False]]  # 23 (P6/64-large)

  - [[17, 20, 23], 1, DetectClasswise, [nc, 8]]  # Pose(P4, P5, P6)

I don't think the model definition is the problem. QAT has an issue only when I use the quantization config from ai_edge_torch.quantize.pt2e_quantizer (PT2EQuantizer) rather than from torch.ao.quantization.quantizer.xnnpack_quantizer (XNNPACKQuantizer).

gurkirt commented 15 hours ago

Is there a way to use a quant config from the original torch.ao rather than ai_edge_torch in the convert function, i.e. model = ai_edge_torch.convert(self.model, example_inputs, quant_config=QuantConfig(pt2e_quantizer=self.quantizer))? How can I extend torch.ao.quantization.quantizer.xnnpack_quantizer.XNNPACKQuantizer or X86InductorQuantizer so that ai_edge_torch.convert doesn't throw AttributeError: 'QuantizationConfig' object has no attribute 'is_dynamic'?

Why do we need PT2EQuantizer in the first place, and how can I find a workaround?

gurkirt commented 15 hours ago

BTW, PT2EQuantizer works without any issue when I use PTQ, i.e. prepare_pt2e works fine. However, during QAT, prepare_qat_pt2e leads to AssertionError: Expected an nn.Module instance.
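
For reference, a minimal sketch of the two paths (assuming a generic nn.Module called model and matching example_inputs, mirroring my snippet above):

from torch._export import capture_pre_autograd_graph
from torch.ao.quantization.quantize_pt2e import prepare_pt2e, prepare_qat_pt2e
from ai_edge_torch.quantize.pt2e_quantizer import PT2EQuantizer, get_symmetric_quantization_config

# PTQ path: prepare_pt2e works without issue.
ptq_quantizer = PT2EQuantizer().set_global(
    get_symmetric_quantization_config(is_per_channel=True, is_dynamic=False, is_qat=False))
ptq_model = prepare_pt2e(capture_pre_autograd_graph(model, example_inputs), ptq_quantizer)

# QAT path: prepare_qat_pt2e raises "AssertionError: Expected an nn.Module instance."
qat_quantizer = PT2EQuantizer().set_global(
    get_symmetric_quantization_config(is_per_channel=True, is_dynamic=False, is_qat=True))
qat_model = prepare_qat_pt2e(capture_pre_autograd_graph(model, example_inputs), qat_quantizer)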

gurkirt commented 10 hours ago

Hi, I solved the is_dynamic error by adding is_dynamic as a member variable of https://github.com/pytorch/pytorch/blob/main/torch/ao/quantization/quantizer/xnnpack_quantizer_utils.py#L51 and setting it to False. This allows me to do QAT without any issue and lets it convert. I think we can close the issue.
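
For reference, here is a minimal sketch of what that patch looks like, based on the torch 2.4/2.5 layout of QuantizationConfig in the linked file (a local modification of the installed torch package, not an upstream API):

from dataclasses import dataclass
from typing import Optional

from torch.ao.quantization.quantizer import QuantizationSpec

@dataclass(eq=True, frozen=True)
class QuantizationConfig:
    input_activation: Optional[QuantizationSpec]
    output_activation: Optional[QuantizationSpec]
    weight: Optional[QuantizationSpec]
    bias: Optional[QuantizationSpec]
    is_qat: bool = False
    is_dynamic: bool = False  # added so ai_edge_torch's QuantConfig can read global_config.is_dynamic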

And another issue was that I had the SiLU activation function for some convolution layers. That was the culprit. If I replace SiLU with ReLU, then everything works fine, both QAT and PTQ, with PT2EQuantizer as well as with XNNPACKQuantizer. Where can I find the list of allowed ops? Thank you.
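
In case it helps others, here is a small hypothetical helper (not part of ai_edge_torch) sketching one way to do the SiLU -> ReLU swap described above before export:

import torch.nn as nn

def replace_silu_with_relu(module: nn.Module) -> None:
    # Recursively replace every nn.SiLU submodule with nn.ReLU in place.
    for name, child in module.named_children():
        if isinstance(child, nn.SiLU):
            setattr(module, name, nn.ReLU(inplace=True))
        else:
            replace_silu_with_relu(child)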

pkgoogle commented 5 hours ago

Hi @gurkirt, I see that you reopened it, but I don't see an explanation past your last post. Was this intended?