Static Quantization of Model with Dynamic Shaped Input

luisfmnunes commented 2 years ago

System information

OS Platform and Distribution (e.g., Linux Ubuntu 20.04):
ONNX Runtime installed from (source or binary): pip
ONNX Runtime version: 1.11.0
Python version: 3.8.10

Is it possible to statically quantize a model with dynamically shaped inputs? I'm trying to statically quantize a ResNet50 model, but because the input is dynamic ('batch', 3, 'height', 'width'), the quantization emits numerous "Expected shape" warnings and ultimately fails to quantize the model. The quantize_static call and a sample of the output are below.

import os

import onnxruntime as ort
from onnxruntime.quantization import CalibrationDataReader, quantize_static

class ObjectDetectionDataReader(CalibrationDataReader):
    def __init__(self, model_path):
        self.model_path = model_path
        self.preprocess_flag = None
        self.start_index = 0
        self.end_index = 0
        self.stride = 1
        self.batch_size = 1
        self.enum_data_dicts = iter([])
        self.input_name = None
        self.get_input_name()

    def get_batch_size(self):
        return self.batch_size

    def get_input_name(self):
        if self.input_name:
            return
        sess = ort.InferenceSession(self.model_path)
        self.input_name = sess.get_inputs()[0].name

class WiderFaceDataReader(ObjectDetectionDataReader):
    def __init__(self,
                 calibration_image_folder,
                 width=-1,
                 height=-1,
                 start_index=0,
                 end_index=0,
                 stride=1,
                 batch_size=1,
                 model_path="",
                 is_evaluation=False,
                 annotations="",
                 preprocess_func=preprocessing_folder):
        ObjectDetectionDataReader.__init__(self, model_path)
        self.image_folder = calibration_image_folder
        self.model_path = model_path
        self.preprocess_flag = True
        self.enum_data_dicts = iter([])

        sess = ort.InferenceSession(model_path)
        self.width = width if width > 0 else sess.get_inputs()[0].shape[3]
        self.height = height if height > 0 else sess.get_inputs()[0].shape[2]
        self.input_name = sess.get_inputs()[0].name

        self.start_index = start_index
        self.end_index = len(os.listdir(calibration_image_folder)) if not end_index else end_index
        self.stride = stride if stride >= 1 else 1
        self.batch_size = batch_size
        self.is_evaluation = is_evaluation

        self.preprocess_func = preprocess_func

    def get_dataset_size(self):
        return len(os.listdir(self.image_folder))

    def get_next(self):
        if self.preprocess_flag:
            self.preprocess_flag = False
            data_list = self.preprocess_func(self.image_folder, self.height, self.width)
            self.datasize = len(data_list)
            self.enum_data_dicts = iter([{self.input_name: data} for data in data_list])
        return next(self.enum_data_dicts, None)

def quantize_model(input_model, output_model, dr):
    quantize_static(input_model, output_model, dr, optimize_model=False)
    print("Calibrated and quantized model saved")
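
For readers unfamiliar with the reader contract used above: get_next() is expected to hand back one feed dict per call and None once the data runs out, and the calibration loop stops at the first None. The following is a minimal, dependency-free sketch of that loop. ListDataReader and drain are hypothetical stand-ins (not part of onnxruntime), and plain lists replace the image tensors.

```python
class ListDataReader:
    """Hypothetical stand-in that mimics the get_next() contract used above."""

    def __init__(self, input_name, samples):
        self.input_name = input_name
        # One feed dict per sample, keyed by the model's input name.
        self._iter = iter([{input_name: s} for s in samples])

    def get_next(self):
        # Return one feed dict per call, then None once the data is exhausted.
        return next(self._iter, None)


def drain(reader):
    """Pull feeds the way a calibration loop would: until get_next() returns None."""
    feeds = []
    while True:
        feed = reader.get_next()
        if feed is None:
            break
        feeds.append(feed)
    return feeds


reader = ListDataReader("input", [[0.0, 1.0], [2.0, 3.0]])
feeds = drain(reader)
print(len(feeds))         # number of calibration feeds produced
print(reader.get_next())  # the reader stays exhausted afterwards
```

Because the iterator is consumed once, a reader like this has to be rebuilt before it can be reused for a second calibration run.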

...
2022-07-14 11:44:38.717490013 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {} does not match actual shape of {1,1,1} for output 823_ReduceMin
2022-07-14 11:44:38.760698640 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {} does not match actual shape of {1,1,1} for output 737_ReduceMax
2022-07-14 11:44:38.760747862 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {} does not match actual shape of {1,1,1} for output 737_ReduceMin
2022-07-14 11:44:38.760862334 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {} does not match actual shape of {1,1,1} for output 761_ReduceMax
2022-07-14 11:44:38.760872703 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {} does not match actual shape of {1,1,1} for output 761_ReduceMin
2022-07-14 11:44:38.761076119 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {} does not match actual shape of {1,1,1} for output 835_ReduceMax
2022-07-14 11:44:38.761090165 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {} does not match actual shape of {1,1,1} for output 835_ReduceMin
2022-07-14 11:44:38.775354916 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {} does not match actual shape of {1,1,1} for output 798_ReduceMax
2022-07-14 11:44:38.775377910 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {} does not match actual shape of {1,1,1} for output 798_ReduceMin
2022-07-14 11:44:38.790627858 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {1} does not match actual shape of {0} for output 654_ReduceMax
2022-07-14 11:44:38.790648201 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {1} does not match actual shape of {0} for output 654_ReduceMin
2022-07-14 11:44:38.792218682 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {} does not match actual shape of {1,1,1} for output 799_ReduceMax
2022-07-14 11:44:38.792249767 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {} does not match actual shape of {1,1,1} for output 799_ReduceMin
2022-07-14 11:44:38.792280872 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {} does not match actual shape of {1,1,1} for output 749_ReduceMax
2022-07-14 11:44:38.792296948 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {} does not match actual shape of {1,1,1} for output 749_ReduceMin
2022-07-14 11:44:38.801004963 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {1} does not match actual shape of {0} for output 655_ReduceMax
2022-07-14 11:44:38.801032955 [W:onnxruntime:, execution_frame.cc:806 VerifyOutputSizes] Expected shape from model of {1} does not match actual shape of {0} for output 655_ReduceMin
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Input In [9], in <cell line: 1>()
----> 1 quantize_model(model_fp32, model_quant, wfdr)

Input In [7], in quantize_model(input_model, output_model, dr)
     84 def quantize_model(input_model, output_model, dr):
---> 85     quantize_static(input_model, output_model, dr, optimize_model=False)
     86     print("Calibrated and quantized model saved")

File ~/openvino/penv/lib/python3.8/site-packages/onnxruntime/quantization/quantize.py:291, in quantize_static(model_input, model_output, calibration_data_reader, quant_format, op_types_to_quantize, per_channel, reduce_range, activation_type, weight_type, nodes_to_quantize, nodes_to_exclude, optimize_model, use_external_data_format, calibrate_method, extra_options)
    276     quantizer = QDQQuantizer(
    277         model,
    278         per_channel,
   (...)
    287         op_types_to_quantize,
    288         extra_options)
    290 quantizer.quantize_model()
--> 291 quantizer.model.save_model_to_file(model_output, use_external_data_format)

File ~/openvino/penv/lib/python3.8/site-packages/onnxruntime/quantization/onnx_model.py:249, in ONNXModel.save_model_to_file(self, output_path, use_external_data_format)
    245 def save_model_to_file(self, output_path, use_external_data_format=False):
    246     '''
    247     Save model to external data, which is needed for model size > 2GB
    248     '''
--> 249     self.topological_sort()
    250     if use_external_data_format:
    251         onnx.external_data_helper.convert_model_to_external_data(self.model,
    252                                                                  all_tensors_to_one_file=True,
    253                                                                  location=Path(output_path).name + ".data")

File ~/openvino/penv/lib/python3.8/site-packages/onnxruntime/quantization/onnx_model.py:356, in ONNXModel.topological_sort(self)
    353                     end = end + 1
    354     start = start + 1
--> 356 assert(end == len(self.graph().node)), "Graph is not a DAG"
    357 self.graph().ClearField('node')
    358 self.graph().node.extend(sorted_nodes)

AssertionError: Graph is not a DAG
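
The assertion in the traceback checks that the topological sort reached every node in the graph. The sketch below is a simplified Kahn-style sort, not the onnxruntime implementation; topological_order and the tuple layout for nodes are assumptions made for illustration. In this sketch the same assertion fires not only for a true cycle but also when a node reads a tensor that no node, graph input, or initializer provides.

```python
from collections import deque


def topological_order(nodes, available):
    """Kahn-style sort over (name, inputs, outputs) tuples.

    available: tensor names that exist up front (graph inputs, initializers).
    Returns node names in a runnable order; asserts if some node can never run.
    """
    produced = set(available)
    # Tensors each node is still waiting for (empty input names are skipped).
    pending = {name: {i for i in inputs if i} - produced for name, inputs, _ in nodes}
    outputs = {name: outs for name, _, outs in nodes}

    # Map each awaited tensor to the nodes that consume it.
    consumers = {}
    for name, deps in pending.items():
        for dep in deps:
            consumers.setdefault(dep, []).append(name)

    ready = deque(name for name, deps in pending.items() if not deps)
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for out in outputs[name]:
            for consumer in consumers.get(out, []):
                pending[consumer].discard(out)
                if not pending[consumer]:
                    ready.append(consumer)

    assert len(order) == len(nodes), "Graph is not a DAG"
    return order


good = [("conv", ["x", "w"], ["y"]), ("relu", ["y"], ["z"])]
print(topological_order(good, available={"x", "w"}))

# "relu" waits for a tensor that nothing produces, so it is never scheduled.
dangling = [("conv", ["x", "w"], ["y"]), ("relu", ["missing"], ["z"])]
try:
    topological_order(dangling, available={"x", "w"})
except AssertionError as err:
    print(err)
```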

yufenglee commented 2 years ago

@luisfmnunes, the warning is fixed by PR #11647 in master. You can try our 1.12 release candidate: https://test.pypi.org/project/ort-nightly/1.12.0.dev20220707003/.

For the 'Graph is not a DAG' issue, could you please share the model and some sample data to reproduce it, if possible?

luisfmnunes commented 2 years ago

Thank you for your response, @yufenglee. The model and some sample data are available in this GoogleDrive link, because the files exceed GitHub's file size limit. I'll also try it out with the given RC.

I forgot to provide the preprocessing functions; they follow (they require numpy and opencv-python):

import os

import cv2
import numpy as np

def preprocessing(im, height=640, width=640, reshape=False):

    im = np.float32(im)
    im -= (104, 117, 123)
    det_scale = 1

    if reshape:
        im_ratio = float(im.shape[0]) / im.shape[1]
        model_ratio = 1
        if im_ratio>model_ratio:
            new_height = height
            new_width = int(new_height / im_ratio)
        else:
            new_width = width
            new_height = int(new_width * im_ratio)
        det_scale = float(new_height) / im.shape[0]
        resized_img = cv2.resize(im, (new_width, new_height))

        det_im = np.zeros((height,width,3),dtype=np.float32)
        det_im [:new_height,:new_width,:] = resized_img

        im = det_im

    im = im.transpose(2,0,1)
    im = im[np.newaxis, ...]

    return im, det_scale

def preprocessing_folder(images_folder, height, width, size_limit=0):

    image_names = [im for im in os.listdir(images_folder) if os.path.isfile(os.path.join(images_folder,im))]
    if size_limit > 0 and len(image_names) >= size_limit:
        batch_filenames = image_names[:size_limit]
    else:
        batch_filenames = image_names

    unconcatenated_batch_data = []

    for image_name in batch_filenames:
        im = cv2.imread(os.path.join(images_folder,image_name))
        im, _ = preprocessing(im)
#         print(im.shape)
        unconcatenated_batch_data.append(im)
    return unconcatenated_batch_data
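
To make the resize arithmetic in the reshape branch easier to follow, here is a small, dependency-free sketch of just the size computation. letterbox_size is a hypothetical helper that mirrors the branch above, including its fixed model_ratio of 1, which is only exact for a square target. Note that preprocessing_folder as posted calls preprocessing(im) with the default reshape=False, so this branch is not exercised and each image keeps its native size.

```python
def letterbox_size(im_height, im_width, height=640, width=640):
    """Mirror the size arithmetic of the reshape branch (square target assumed)."""
    im_ratio = float(im_height) / im_width
    model_ratio = 1  # as in the original; only exact when height == width
    if im_ratio > model_ratio:
        # Taller than the target: fit the height, shrink the width.
        new_height = height
        new_width = int(new_height / im_ratio)
    else:
        # Wider than (or as wide as) the target: fit the width, shrink the height.
        new_width = width
        new_height = int(new_width * im_ratio)
    det_scale = float(new_height) / im_height
    return new_height, new_width, det_scale


print(letterbox_size(480, 640))   # landscape image
print(letterbox_size(1080, 720))  # portrait image
```

The resized image is then pasted into the top-left corner of a zero-filled height x width canvas, so every calibration sample would share one shape if reshape=True were passed through.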

microsoft / onnxruntime

Static Quantization of Model with Dynamic Shaped Input #12178