Static quantization of a model fails while accessing attributes of a Clip node

emilianavt commented 3 years ago

Describe the bug

Attempting to perform static quantization on a model fails with the following error:

[...]
  File "/home/emiliana/env/lib/python3.7/site-packages/onnxruntime/quantization/calibrate.py", line 175, in calculate_scale_zeropoint
    clip_min = next_node.attribute[0].f
IndexError: list index (0) out of range

Urgency

none

System information

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Debian Linux
ONNX Runtime installed from (source or binary): binary installed with pip3
ONNX Runtime version: 1.5.2
Python version: 3.7.3
Visual Studio version (if applicable): not applicable
GCC/Compiler version (if compiling from source): not applicable
CUDA/cuDNN version: not applicable
GPU model and memory: not applicable

To Reproduce

I used this code to perform the quantization:

import onnx
from onnxruntime.quantization import quantize_dynamic, QuantType
from onnxruntime.quantization import quantize_static, calibrate, CalibrationDataReader
import cv2
import numpy as np

class NFDataReader(CalibrationDataReader):
    def __init__(self):
        self.augmented_model_path = "augmented_model.onnx"
        self.enum_data_dicts = []
        self.datasize = 0
        mean = np.float32(np.array([0.485, 0.456, 0.406]))
        std = np.float32(np.array([0.229, 0.224, 0.225]))
        mean = mean / std
        std = std * 255.0

        for i in range(1):
            image = cv2.imread(f"{i}.png")
            image = image[:,:,::-1] * 1 / std - mean
            image = np.expand_dims(image, 0).astype(np.float32)
            image = np.transpose(image, (0,3,1,2))
            self.enum_data_dicts.append({"input": image})
        self.datasize = len(self.enum_data_dicts)
        self.enum_data_dicts = iter(self.enum_data_dicts)

    def get_next(self):
        return next(self.enum_data_dicts, None)

model_fp32 = 'lm_model3.onnx'
model_quant = 'lm_model3_quant.onnx'
# run it
dr = NFDataReader()
quantized_model = quantize_static(model_input=model_fp32, model_output=model_quant, calibration_data_reader=dr, weight_type=QuantType.QUInt8)

Running the code without the np.transpose line and with an adjusted input name works for the ResNet model provided here, so the issue seems to be model-dependent.

The attached zip file (quant.zip) contains the original ONNX model, the PNG file, and the code.

Expected behavior

I expected a quantized model to be produced rather than an error to occur.

Screenshots

The full stack trace:

Traceback (most recent call last):
  File "quant_model.py", line 34, in <module>
    quantized_model = quantize_static(model_input=model_fp32, model_output=model_quant, calibration_data_reader=dr, weight_type=QuantType.QUInt8)
  File "/home/emiliana/env/lib/python3.7/site-packages/onnxruntime/quantization/quantize.py", line 171, in quantize_static
    nodes_to_exclude)
  File "/home/emiliana/env/lib/python3.7/site-packages/onnxruntime/quantization/calibrate.py", line 260, in calibrate
    quantization_params_dict = calibrater.calculate_quantization_params(dict_for_quantization)
  File "/home/emiliana/env/lib/python3.7/site-packages/onnxruntime/quantization/calibrate.py", line 230, in calculate_quantization_params
    node_params = self.calculate_scale_zeropoint(child, node_thresholds[0], node_thresholds[1])
  File "/home/emiliana/env/lib/python3.7/site-packages/onnxruntime/quantization/calibrate.py", line 175, in calculate_scale_zeropoint
    clip_min = next_node.attribute[0].f
IndexError: list index (0) out of range

Additional context

The model was originally converted from PyTorch using opset 11. It has a MobileNet V3 backend based on the geffnet implementation, with UNet-like layers on top to do heatmap regression, containing Conv2D and bilinear upscale layers. The definition of the model (it uses inference=False) can be found here. Apart from static quantization, the ONNX model works fine with onnxruntime.

xuhao1 commented 3 years ago

I also ran into this issue when quantizing the landmark model in OpenSeeFace. I worked around it by hard-coding clip min = 0 and max = 6 in /home/emiliana/env/lib/python3.7/site-packages/onnxruntime/quantization/calibrate.py, since every Clip in your network ranges from 0 to 6.

In addition, ONNX quantization does not speed up the landmark model at all on my CPU (E5), and it makes the results worse, even after calibrating with the whole dataset.

j-paulus commented 3 years ago

I can confirm the bug. It is possible to reproduce it by modifying the example from #5499 with an additional clamp() operation in the model:

import numpy as np
import torch
from onnxruntime.quantization import quantize_dynamic, quantize_static, QuantType
from onnxruntime.quantization.calibrate import CalibrationDataReader

class CalibrationDataProvider(CalibrationDataReader):
    def __init__(self):
        super(CalibrationDataProvider, self).__init__()
        self.counter = 0

    def get_next(self):
        if self.counter > 2:
            return None
        else:
            self.counter += 1
            return {'x': np.random.randn(2, 4).astype(np.float32)}

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.n = int(1)

    def forward(self, x):
        f = x.shape[0]
        y = x.reshape(-1, f)
        z = y.clamp(min=-1.0, max=1.0)  # this is the problem now
        return z

model = Model().float()

dummy_input = (torch.randn(2, 4), )
torch.onnx.export(
    model,
    dummy_input,
    'model.onnx',
    input_names=('x',),
    export_params=True,
    training=False,
    opset_version=11)

cdr = CalibrationDataProvider()

quantize_static(model_input='model.onnx',
                model_output='model_q.onnx',
                calibration_data_reader=cdr)

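This simple model shows that the Clip node has an empty attribute list and that the min and max parameters are in fact passed as additional inputs, which is how Clip is defined from opset 11 onward. This may well be the cause of the bug, since the failing line reads next_node.attribute[0].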

ONNX Runtime version 1.6.0.

yufenglee commented 3 years ago

Fixed in #6541

microsoft / onnxruntime

Static quantization of a model fails while accessing attributes of a Clip node #5586