microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

Weight's raw_data is empty so it leads to quantize weight failure #9635

Open michaelnguyen11 opened 2 years ago

michaelnguyen11 commented 2 years ago

Describe the bug We converted a PyTorch model to ONNX, and the ONNX model runs inference correctly with ONNX Runtime. We now want to quantize it to run on a mobile device, so I used static quantization. However, quantization fails with the following error:

  File "quantize_cpu.py", line 155, in main
    quantize_static(input_model_path,
  File "/home/test/anaconda3/envs/test/lib/python3.8/site-packages/onnxruntime/quantization/quantize.py", line 234, in quantize_static
    quantizer.quantize_model()
  File "/home/test/anaconda3/envs/test/lib/python3.8/site-packages/onnxruntime/quantization/onnx_quantizer.py", line 284, in quantize_model
    op_quantizer.quantize()
  File "/home/test/anaconda3/envs/test/lib/python3.8/site-packages/onnxruntime/quantization/operators/concat.py", line 15, in quantize
    (q_input_names, zero_point_names, scale_names, nodes) = self.quantizer.quantize_inputs(node, [*range(0, len(node.input))])
  File "/home/test/anaconda3/envs/test/lib/python3.8/site-packages/onnxruntime/quantization/onnx_quantizer.py", line 643, in quantize_inputs
    q_weight_name, zp_name, scale_name = self.quantize_weight(
  File "/home/test/anaconda3/envs/test/lib/python3.8/site-packages/onnxruntime/quantization/onnx_quantizer.py", line 709, in quantize_weight
    _, _, zero_point, scale, q_weight_data = quantize_data(weight_data.flatten().tolist(),
  File "/home/test/anaconda3/envs/test/lib/python3.8/site-packages/onnxruntime/quantization/quant_utils.py", line 166, in quantize_data
    rmin = min(data)
ValueError: min() arg is an empty sequence

While debugging, I found that the raw_data of that weight is empty, so the data passed to the quantize_data function is empty and min() raises the error.
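The failure can be reproduced without the model, because Python's built-in `min()` raises `ValueError` on an empty sequence. The sketch below is a simplified, hypothetical stand-in for the range computation in `quantize_data`. The name `compute_uint8_params` and the neutral return value for empty input are assumptions for illustration, not ONNX Runtime's actual code.

```python
def compute_uint8_params(data):
    """Return (scale, zero_point) for asymmetric uint8 quantization.

    Hypothetical helper for illustration only; it is not the ONNX Runtime
    implementation. Empty input returns a neutral (1.0, 0) pair instead of
    letting min()/max() raise ValueError.
    """
    if len(data) == 0:
        # Assumption: a zero-element tensor has nothing to quantize.
        return 1.0, 0

    # Extend the range to include 0 so that 0.0 is exactly representable.
    rmin = min(min(data), 0.0)
    rmax = max(max(data), 0.0)
    if rmax == rmin:
        # All values are zero; avoid dividing by zero below.
        return 1.0, 0

    scale = (rmax - rmin) / 255.0
    zero_point = round(-rmin / scale)
    return scale, zero_point


# The unguarded call from the traceback fails like this:
try:
    min([])
except ValueError as exc:
    print(exc)

# The guarded helper returns the neutral pair instead of raising.
print(compute_uint8_params([]))
```

Whether a neutral scale and zero point is the right answer for an empty weight is a design choice; skipping the tensor entirely would be another reasonable option.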

Printing the ONNX model shows that the raw_data of this weight is empty:

  initializer {
    dims: 1
    dims: 1
    dims: 1
    dims: 30
    dims: 0
    data_type: 1
    name: "482"
    raw_data: ""
  }
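A tensor with a 0 in its dims holds no elements, which is why raw_data is empty here. A minimal sketch for listing every such initializer in a model follows. The helper names `find_empty_initializers` and `report_empty_initializers` are hypothetical, and the code only reads `graph.initializer`, `name`, and `dims`.

```python
from math import prod


def find_empty_initializers(model):
    """Return (name, dims) for every initializer with zero elements.

    `model` is expected to look like an onnx.ModelProto: only
    model.graph.initializer, and each tensor's .name and .dims, are read.
    """
    empty = []
    for init in model.graph.initializer:
        dims = list(init.dims)
        # A tensor is empty when any dimension is 0; prod([]) == 1, so
        # scalars (no dims) are correctly treated as non-empty.
        if prod(dims) == 0:
            empty.append((init.name, dims))
    return empty


def report_empty_initializers(model_path):
    """Load an ONNX file and print its zero-element initializers."""
    import onnx  # imported lazily so the helper above has no hard dependency

    for name, dims in find_empty_initializers(onnx.load(model_path)):
        print(f"initializer {name!r} is empty, dims={dims}")
```

Calling `report_empty_initializers("realtimeStereo.onnx")` before quantization would show whether other weights have the same problem as "482".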

Urgency Yes, this model is part of our Q4 2021 release, so we need a fix as soon as possible.

System information

To Reproduce My quantization code:

import os
import numpy as np
import argparse
import cv2

import onnx
import onnxruntime
from onnxruntime.quantization import quantize_static, CalibrationDataReader, QuantFormat, QuantType

class RealtimeStereoDataReader(CalibrationDataReader):
    def __init__(self, calibration_image_folder, augmented_model_path):
        self.image_folder = calibration_image_folder
        self.augmented_model_path = augmented_model_path
        self.preprocess_flag = True
        self.enum_data_dicts = None

        self.session = onnxruntime.InferenceSession(self.augmented_model_path, None)

        self.inname = [input.name for input in self.session.get_inputs()]
        self.outname = [output.name for output in self.session.get_outputs()]

        print("inname {}, outname {}".format(self.inname, self.outname))

        # get model input info
        self.input_shape = self.session.get_inputs()[0].shape
        self.channels = self.input_shape[1]  # NCHW layout: index 1 is the channel axis
        self.input_height = self.input_shape[2]
        self.input_width = self.input_shape[3]
        print("input_shape {}".format(self.input_shape))

        # get model output info
        self.output_shape = self.session.get_outputs()[0].shape
        print("output_shape {}".format(self.output_shape))

    def get_next(self):
        if self.preprocess_flag:
            self.preprocess_flag = False
            data = self.loadKittiDataset(self.image_folder)
            self.enum_data_dicts = iter(data)

        return next(self.enum_data_dicts, None)

    def loadKittiDataset(self, images_folder):
        left_fold  = 'image_2/'
        right_fold = 'image_3/'

        image = [img for img in os.listdir(images_folder + left_fold) if img.find('_10') > -1]

        left_path_list  = [images_folder + left_fold + img for img in image]
        right_path_list = [images_folder + right_fold + img for img in image]

        data = []

        for idx in range(len(left_path_list)):
            left_input = self.preprocess(left_path_list[idx])
            right_input = self.preprocess(right_path_list[idx])
            data.append({self.inname[0] : left_input, self.inname[1] : right_input})

        return data

    def preprocess(self, image_path):
        image = cv2.imread(image_path)
        img = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
        img_input = cv2.resize(img, (self.input_width,self.input_height)).astype(np.float32)

        # Scale pixels to [0, 1], then normalize with the ImageNet mean and standard deviation
        mean=[0.485, 0.456, 0.406]
        std=[0.229, 0.224, 0.225]

        img_input = ((img_input/ 255.0 - mean) / std)
        img_input = img_input.transpose(2, 0, 1)
        img_input = img_input[np.newaxis,:,:,:]

        return img_input.astype(np.float32)

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--input_model", required=True, help="input model")
    parser.add_argument("--output_model", required=True, help="output model")
    parser.add_argument("--calibrate_dataset", default="./test_images", help="calibration data set")
    parser.add_argument("--quant_format",
                        default=QuantFormat.QOperator,
                        type=QuantFormat.from_string,
                        choices=list(QuantFormat))
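    # type=bool would turn any non-empty string (even "False") into True, so parse the text explicitly
    parser.add_argument("--per_channel", default=True, type=lambda s: s.lower() in ("1", "true", "yes"))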
    args = parser.parse_args()

    input_model_path = args.input_model
    output_model_path = args.output_model
    calibration_dataset_path = args.calibrate_dataset

    # onnx_model = onnx.load(input_model_path)
    # print('all layers of onnx model :')
    # print(onnx_model)
    # print('=======================')

    # print('calibration_dataset_path : {}'.format(calibration_dataset_path))
    dr = RealtimeStereoDataReader(calibration_dataset_path, input_model_path)

    quantize_static(input_model_path,
                    output_model_path,
                    dr,
                    quant_format=args.quant_format,
                    per_channel=args.per_channel,
                    weight_type=QuantType.QUInt8)

    print('Calibrated and quantized model saved.')

if __name__ == '__main__':
    main()

Run: python3 quantize_cpu.py --input_model realtimeStereo.onnx --output_model realtimeStereo_quant.onnx --calibrate_dataset ./test_images

Attached test images and ONNX model: test_images.zip

Expected behavior Conversion should complete as normal.

yufenglee commented 2 years ago

@michaelnguyen11, it is because the tensor '482' is an empty tensor. Its size is 0. Why do you Concat an empty tensor though? It is useless.

michaelnguyen11 commented 2 years ago

Hi @yufenglee ,

Thanks for the information. I converted the model from PyTorch to ONNX and don't know why an empty tensor exists in the ONNX model.

I'll check the conversion again and remove the unused tensor.
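One possible way to drop such a tensor after export is sketched below. It assumes the empty tensor is empty along the Concat axis, so removing it leaves the Concat result unchanged; that has not been verified for this model. The function names and the file names passed to `clean_model` are hypothetical.

```python
def drop_empty_concat_inputs(graph):
    """Remove zero-element initializers from the inputs of Concat nodes.

    Assumption: each empty tensor is empty along the Concat axis, so dropping
    it leaves the result unchanged. `graph` is expected to look like an
    onnx.GraphProto (.node, .initializer).
    Returns the names of the inputs that were dropped.
    """
    empty_names = {
        init.name for init in graph.initializer if 0 in list(init.dims)
    }
    dropped = []
    for node in graph.node:
        if node.op_type != "Concat":
            continue
        kept = [name for name in node.input if name not in empty_names]
        # Leave the node untouched if nothing changes or if every input
        # would disappear (that case needs manual inspection).
        if len(kept) == len(node.input) or not kept:
            continue
        dropped.extend(name for name in node.input if name in empty_names)
        del node.input[:]
        node.input.extend(kept)
    return dropped


def clean_model(in_path, out_path):
    """Load a model, drop empty Concat inputs, validate, and save it."""
    import onnx  # imported lazily; only needed when working on real files

    model = onnx.load(in_path)
    print("dropped inputs:", drop_empty_concat_inputs(model.graph))
    onnx.checker.check_model(model)
    onnx.save(model, out_path)
```

The dropped initializer itself stays in the graph after this step; it is simply no longer referenced by the Concat node.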

michaelnguyen11 commented 2 years ago

Hi @yufenglee,

I applied pull request https://github.com/microsoft/onnxruntime/pull/9640 and it works: I can now quantize the model.

However, there are two problems:

1/ The output of the quantized model is worse than that of the original model.

2/ I get a bunch of warnings like the following:

2021-11-02 11:43:38.539284273 [W:onnxruntime:, graph.cc:3391 CleanUnusedInitializers] Removing initializer '2249_quantized_scale'. It is not used by any node and should be removed from the model.
2021-11-02 11:43:38.539307329 [W:onnxruntime:, graph.cc:3391 CleanUnusedInitializers] Removing initializer '2246_quantized_zero_point'. It is not used by any node and should be removed from the model.
2021-11-02 11:43:38.539313971 [W:onnxruntime:, graph.cc:3391 CleanUnusedInitializers] Removing initializer '2246_quantized_scale'. It is not used by any node and should be removed from the model.
2021-11-02 11:43:38.539319817 [W:onnxruntime:, graph.cc:3391 CleanUnusedInitializers] Removing initializer '2243_quantized_zero_point'. It is not used by any node and should be removed from the model.
2021-11-02 11:43:38.539325362 [W:onnxruntime:, graph.cc:3391 CleanUnusedInitializers] Removing initializer '2243_quantized_scale'. It is not used by any node and should be removed from the model.
2021-11-02 11:43:38.539331112 [W:onnxruntime:, graph.cc:3391 CleanUnusedInitializers] Removing initializer '2240_quantized_scale'. It is not used by any node and should be removed from the model.
2021-11-02 11:43:38.539338720 [W:onnxruntime:, graph.cc:3391 CleanUnusedInitializers] Removing initializer '2147_quantized_zero_point'. It is not used by any node and should be removed from the model.
2021-11-02 11:43:38.539346504 [W:onnxruntime:, graph.cc:3391 CleanUnusedInitializers] Removing initializer '2144_quantized_zero_point'. It is not used by any node and should be removed from the model.
2021-11-02 11:43:38.539352945 [W:onnxruntime:, graph.cc:3391 CleanUnusedInitializers] Removing initializer '2144_quantized_scale'. It is not used by any node and should be removed from the model.

Could you please help me remove these warnings? I tried the solution in https://github.com/microsoft/onnxruntime/issues/1899, but it crashed with a core dump. The code I used:

import onnx
import onnxoptimizer

onnxfile = 'realtimeStereo_quant.onnx'
onnx_model = onnx.load(onnxfile)
passes = ["extract_constant_to_initializer", "eliminate_unused_initializer"]
optimized_model = onnxoptimizer.optimize(onnx_model, passes)

onnx.save(optimized_model, onnxfile)
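As the log text itself says, ONNX Runtime already removes these initializers when it loads the model, so the warnings only flag leftovers in the file. A minimal sketch for pruning them with plain `onnx`, without `onnxoptimizer`, might look like the following. The function names are hypothetical, and the sketch assumes the model has no control-flow subgraphs.

```python
def remove_unused_initializers(graph):
    """Delete initializers that no node consumes; return their names.

    `graph` is expected to look like an onnx.GraphProto. Only the top-level
    graph is inspected: initializers referenced solely from subgraphs
    (If/Loop bodies) would be misreported as unused, so this sketch assumes
    the model has no control-flow subgraphs.
    """
    used = set()
    for node in graph.node:
        used.update(node.input)
    # Graph outputs may also refer to an initializer directly.
    used.update(out.name for out in graph.output)

    unused = [init for init in graph.initializer if init.name not in used]
    unused_names = [init.name for init in unused]
    for init in unused:
        graph.initializer.remove(init)

    # Older IR versions also list initializers as graph inputs; drop those
    # entries too so they do not turn into required inputs.
    stale_inputs = [vi for vi in graph.input if vi.name in set(unused_names)]
    for value_info in stale_inputs:
        graph.input.remove(value_info)
    return unused_names


def prune_model_file(path):
    """Prune unused initializers from an ONNX file in place."""
    import onnx  # imported lazily; only needed when working on real files

    model = onnx.load(path)
    print("removed:", remove_unused_initializers(model.graph))
    onnx.save(model, path)
```

Running `prune_model_file("realtimeStereo_quant.onnx")` would rewrite the quantized model in place, so keeping a backup copy first is advisable.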

stale[bot] commented 2 years ago

This issue has been automatically marked as stale due to inactivity and will be closed in 7 days if no further activity occurs. If further support is needed, please provide an update and/or more details.