[CenterFusion] Model Full Integer Quantization problem

Issue Type

Others

onnx2tf version number

1.7.6

onnx version number

1.13.1

tensorflow version number

2.12.0rc0

Download URL for ONNX

https://drive.google.com/file/d/1yKCN8_2ayeBLOhd6bgeL55pu_HXClO1L/view?usp=share_link

Parameter Replacement JSON

Sorry! I don't have a Replacement JSON file and I don't know how it works.

Description

Purpose: For a laboratory project, we need to convert this model to a full integer quantized format so that it can run on an embedded system. If this project fails, our laboratory will lose money. Could you help us, please? Thank you!

When we convert the model to the full integer quantized format, the following error occurs:

Float32 tflite output complete!
Float16 tflite output complete!
Dynamic Range Quantization tflite output complete!
Traceback (most recent call last):
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/onnx2tf/onnx2tf.py", line 1036, in convert
tflite_model = converter.convert()
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 1897, in convert
return super(TFLiteConverterV2, self).convert()
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 962, in wrapper
return self._convert_and_export_metrics(convert_func, *args, **kwargs)
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 940, in _convert_and_export_metrics
result = convert_func(self, *args, **kwargs)
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 1546, in convert
return super(TFLiteFrozenGraphConverterV2,
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 1172, in convert
return self._optimize_tflite_model(
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/convert_phase.py", line 215, in wrapper
raise error from None  # Re-throws the exception.
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/convert_phase.py", line 205, in wrapper
return func(*args, **kwargs)
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 899, in _optimize_tflite_model
model = self._quantize(model, q_in_type, q_out_type, q_activations_type,
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 638, in _quantize
calibrated = calibrate_quantize.calibrate(
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/convert_phase.py", line 215, in wrapper
raise error from None  # Re-throws the exception.
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/convert_phase.py", line 205, in wrapper
return func(*args, **kwargs)
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/optimize/calibrator.py", line 226, in calibrate
self._feed_tensors(dataset_gen, resize_input=True)
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/optimize/calibrator.py", line 129, in _feed_tensors
self._calibrator.Prepare([list(s.shape) for s in input_array])
RuntimeError: tensorflow/lite/kernels/conv.cc:351 input_channel % filter_input_channel != 0 (6 != 0)Node number 1 (CONV_2D) failed to prepare.
WARNING: Full INT8 Quantization tflite output failed.
Traceback (most recent call last):
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/onnx2tf/onnx2tf.py", line 1111, in convert
tflite_model = converter.convert()
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 1897, in convert
return super(TFLiteConverterV2, self).convert()
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 962, in wrapper
return self._convert_and_export_metrics(convert_func, *args, **kwargs)
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 940, in _convert_and_export_metrics
result = convert_func(self, *args, **kwargs)
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 1546, in convert
return super(TFLiteFrozenGraphConverterV2,
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 1172, in convert
return self._optimize_tflite_model(
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/convert_phase.py", line 215, in wrapper
raise error from None  # Re-throws the exception.
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/convert_phase.py", line 205, in wrapper
return func(*args, **kwargs)
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 899, in _optimize_tflite_model
model = self._quantize(model, q_in_type, q_out_type, q_activations_type,
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 638, in _quantize
calibrated = calibrate_quantize.calibrate(
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/convert_phase.py", line 215, in wrapper
raise error from None  # Re-throws the exception.
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/convert_phase.py", line 205, in wrapper
return func(*args, **kwargs)
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/optimize/calibrator.py", line 226, in calibrate
self._feed_tensors(dataset_gen, resize_input=True)
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/optimize/calibrator.py", line 129, in _feed_tensors
self._calibrator.Prepare([list(s.shape) for s in input_array])
RuntimeError: tensorflow/lite/kernels/conv.cc:351 input_channel % filter_input_channel != 0 (6 != 0)Node number 1 (CONV_2D) failed to prepare.
WARNING: INT8 Quantization with int16 activations tflite output failed.
Traceback (most recent call last):
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/onnx2tf/onnx2tf.py", line 1145, in convert
tflite_model = converter.convert()
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 1897, in convert
return super(TFLiteConverterV2, self).convert()
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 962, in wrapper
return self._convert_and_export_metrics(convert_func, *args, **kwargs)
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 940, in _convert_and_export_metrics
result = convert_func(self, *args, **kwargs)
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 1546, in convert
return super(TFLiteFrozenGraphConverterV2,
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 1172, in convert
return self._optimize_tflite_model(
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/convert_phase.py", line 215, in wrapper
raise error from None  # Re-throws the exception.
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/convert_phase.py", line 205, in wrapper
return func(*args, **kwargs)
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 899, in _optimize_tflite_model
model = self._quantize(model, q_in_type, q_out_type, q_activations_type,
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/lite.py", line 638, in _quantize
calibrated = calibrate_quantize.calibrate(
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/convert_phase.py", line 215, in wrapper
raise error from None  # Re-throws the exception.
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/convert_phase.py", line 205, in wrapper
return func(*args, **kwargs)
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/optimize/calibrator.py", line 226, in calibrate
self._feed_tensors(dataset_gen, resize_input=True)
File "/home/paul/anaconda3/envs/Center_series/lib/python3.8/site-packages/tensorflow/lite/python/optimize/calibrator.py", line 129, in _feed_tensors
self._calibrator.Prepare([list(s.shape) for s in input_array])
RuntimeError: tensorflow/lite/kernels/conv.cc:351 input_channel % filter_input_channel != 0 (6 != 0)Node number 1 (CONV_2D) failed to prepare.
WARNING: Full INT8 Quantization with int16 activations tflite output failed.

How: I tried converting the model to h5 and then to a tflite model, but it still failed. I also tried simplifying the model first and then converting it to the tflite full integer quantized format, but that failed as well.
Why: If this project fails, our laboratory will lose money. Please help us. Thank you!
Resource: https://github.com/HaohaoNJU/CenterFusion/blob/main/models/cf_fus.onnx
Your GitHub repository has helped me a lot: I have already converted many models from ONNX to TFLite and even quantized them. Thank you for your contribution! This program is awesome. Could you help us, please? If I have caused any offense, I apologize for my improper expression and my poor English.

First, understanding this problem requires two pieces of background: how the TFLite Converter is used, and how onnx2tf works.

Your model appears to have a 3-channel input and a 64-channel non-image input, as shown in the image.

If the user does not supply calibration data, the tool attempts to quantize using the default data set I have prepared. The default data set is a set of 3-channel MS-COCO images.

https://github.com/PINTO0309/onnx2tf#cli-parameter

  -qcind INPUT_NAME NUMPY_FILE_PATH MEAN STD, \
    --quant_calib_input_op_name_np_data_path INPUT_NAME NUMPY_FILE_PATH MEAN STD
    INPUT Name of OP and path of calibration data file (Numpy) for quantization and mean and std.
    The specification can be omitted only when the input OP is a single 4D tensor image data.
    If omitted, it is automatically calibrated using 20 normalized MS-COCO images.
    The type of the input OP must be Float32.
    Data for calibration must be pre-normalized to a range of 0 to 1.
    -qcind {input_op_name} {numpy_file_path} {mean} {std}
    Numpy file paths must be specified the same number of times as the number of input OPs.
    Normalize the value of the input OP based on the tensor specified in mean and std.
    (input_value - mean) / std
    Tensors in Numpy file format must be in dimension order after conversion to TF.
    Note that this is intended for deployment on low-resource devices,
    so the batch size is limited to 1 only.

    e.g.
    The example below shows a case where there are three input OPs.
    Assume input0 is 128x128 RGB image data.
    In addition, input0 should be a value that has been divided by 255
    in the preprocessing and normalized to a range between 0 and 1.
    input1 and input2 assume the input of something that is not an image.
    Because input1 and input2 assume something that is not an image,
    the divisor is not 255 when normalizing from 0 to 1.
    "n" is the number of calibration data.

    ONNX INPUT shapes:
      input0: [n,3,128,128]
        mean: [1,3,1,1] -> [[[[0.485]],[[0.456]],[[0.406]]]]
        std : [1,3,1,1] -> [[[[0.229]],[[0.224]],[[0.225]]]]
      input1: [n,64,64]
        mean: [1,64] -> [[0.1, ..., 0.64]]
        std : [1,64] -> [[0.05, ..., 0.08]]
      input2: [n,5]
        mean: [1] -> [0.3]
        std : [1] -> [0.07]

    TensorFlow INPUT shapes (Numpy file ndarray shapes):
      input0: [n,128,128,3]
        mean: [1,1,1,3] -> [[[[0.485, 0.456, 0.406]]]]
        std : [1,1,1,3] -> [[[[0.229, 0.224, 0.225]]]]
      input1: [n,64,64]
        mean: [1,64] -> [[0.1, ..., 0.64]]
        std : [1,64] -> [[0.05, ..., 0.08]]
      input2: [n,5]
        mean: [1] -> [0.3]
        std : [1] -> [0.07]

    -qcind "input0" "../input0.npy" [[[[0.485, 0.456, 0.406]]]] [[[[0.229, 0.224, 0.225]]]]
    -qcind "input1" "./input1.npy" [[0.1, ..., 0.64]] [[0.05, ..., 0.08]]
    -qcind "input2" "input2.npy" [0.3] [0.07]
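The README excerpt above says the Numpy files must already be in the dimension order used after conversion to TF. As a minimal sketch, assuming the calibration samples were collected in ONNX order [n,C,H,W], the reordering could look like this. The helper name `nchw_to_nhwc` and the random stand-in data are hypothetical and only illustrate the shapes from the README example.

```python
import numpy as np

def nchw_to_nhwc(data: np.ndarray) -> np.ndarray:
    """Reorder a 4D array from ONNX layout [n,C,H,W] to TF layout [n,H,W,C]."""
    if data.ndim != 4:
        raise ValueError(f"expected a 4D array, got shape {data.shape}")
    return np.ascontiguousarray(data.transpose(0, 2, 3, 1))

# Hypothetical stand-in for 10 calibration samples of input0 in ONNX order
rng = np.random.default_rng(0)
onnx_order = rng.random((10, 3, 128, 128), dtype=np.float32)
tf_order = nchw_to_nhwc(onnx_order)
print(tf_order.shape)  # (10, 128, 128, 3)

# mean/std are reordered the same way: [1,3,1,1] -> [1,1,1,3]
mean_onnx = np.asarray([[[[0.485]], [[0.456]], [[0.406]]]], dtype=np.float32)
mean_tf = nchw_to_nhwc(mean_onnx)
print(mean_tf.shape)  # (1, 1, 1, 3)
```

The reordered array is what would then be written with np.save and passed as NUMPY_FILE_PATH. Inputs that are not 4D, such as input1 and input2 in the README example, keep the same shape in both listings and need no transpose.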

Debugging allows you to see the shape of the data set for calibration. It is shown in the figure below.

This means that if your model takes special input data other than images, you must pass the calibration data set to the TFLite Converter yourself. The 64-channel data is too specialized: onnx2tf cannot determine from the structure of the model alone what kind of calibration data is appropriate. Therefore, you need to generate your own Numpy data set and pass it to onnx2tf, as described in the README excerpt transcribed above.

Also, to warn you about a potential problem that may come up a little later in this discussion: the TFLite Converter may not be able to complete quantization successfully for models with more than two inputs. This is not a problem with onnx2tf, but rather with the specification of the TFLite Converter itself.

The error message for quantization calibration has been improved. https://github.com/PINTO0309/onnx2tf/releases/tag/1.7.14
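For a non-image input, a rough sketch of building such a Numpy data set might look like the following. Everything specific here is an assumption: the per-sample shape (28, 50, 64), the random stand-in samples, the helper name `build_calibration_array`, and the use of a single global min-max scaling to reach the 0 to 1 range that the README asks for. The right scaling depends on what the 64-channel data actually represents.

```python
import numpy as np

def build_calibration_array(samples):
    """Stack per-sample arrays into [n, ...] Float32 data scaled to 0.0 - 1.0."""
    batch = np.stack([np.asarray(s, dtype=np.float32) for s in samples], axis=0)
    lo, hi = batch.min(), batch.max()
    if hi == lo:
        raise ValueError("all calibration values are identical; cannot scale")
    # Assumption: one global min-max scaling; replace with the scaling
    # that matches how the real data is preprocessed for the model.
    return (batch - lo) / (hi - lo)

# Hypothetical stand-in for 10 real samples of a 64-channel feature input
rng = np.random.default_rng(0)
samples = [rng.normal(size=(28, 50, 64)) for _ in range(10)]
calib = build_calibration_array(samples)
print(calib.shape, calib.dtype)

# The result would then be saved with np.save and passed via -qcind,
# one file per input OP.
```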

Improved error judgment regarding calibration data for INT8 quantization.
Models with multiple input OPs and non-rgb-image input OPs force automatic calibration to be aborted.

e.g. cf_fus.onnx.zip

if model_input.dtype != tf.float32 \
  or len(model_input.shape) != 4 \
  or model_input.shape[-1] != 3:
  print(
      f'{Color.RED}ERROR:{Color.RESET} ' +
      f'For models that have multiple input OPs and need to perform INT8 quantization calibration '+
      f'using non-rgb-image input tensors, specify the calibration data with '+
      f'--quant_calib_input_op_name_np_data_path. '+
      f'model_input[n].shape: {model_input.shape}'
  )
  sys.exit(1)
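To see which inputs of a model would trip this check, the same three conditions can be tried on their own. This is only an illustrative sketch, not onnx2tf code: it uses NumPy dtypes instead of tf.float32 so that it runs without TensorFlow, and the input names and shapes in the dictionary are made up.

```python
import numpy as np

def is_rgb_image_input(dtype, shape):
    """Mirror of the three conditions above: Float32, 4D, 3 channels last."""
    return (
        np.dtype(dtype) == np.dtype(np.float32)
        and len(shape) == 4
        and shape[-1] == 3
    )

def inputs_needing_calib_data(model_inputs):
    """Names of inputs that cannot use the default image calibration data."""
    return [
        name
        for name, (dtype, shape) in model_inputs.items()
        if not is_rgb_image_input(dtype, shape)
    ]

# Hypothetical two-input model: one image-like input, one 64-channel input
model_inputs = {
    "image": (np.float32, (1, 112, 200, 3)),
    "features": (np.float32, (1, 112, 200, 64)),
}
print(inputs_needing_calib_data(model_inputs))  # ['features']
```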


@PINTO0309 Thank you very much! I will try to generate my own npy file for quantization, although I don't yet know how to generate the input tensor, mean, and std.

I can only describe a general method for creating calibration data for quantization.

Collect 10 to 100 pieces of data used for training.

Convert data to .npy.

For image data

import cv2
import glob
import numpy as np

# Not used during data generation ################################
# You will need to do the calculations yourself using the test data
MEAN = np.asarray([[[[0.485, 0.456, 0.406]]]], dtype=np.float32) # [1,1,1,3]
STD = np.asarray([[[[0.229, 0.224, 0.225]]]], dtype=np.float32) # [1,1,1,3]
# Not used during data generation ################################

files = sorted(glob.glob("data/*.png"))
img_datas = []
for idx, file in enumerate(files):
  bgr_img = cv2.imread(file)
  rgb_img = cv2.cvtColor(bgr_img, cv2.COLOR_BGR2RGB)
  resized_img = cv2.resize(rgb_img, dsize=(200,112)) # dsize is (width, height)
  extend_batch_size_img = resized_img[np.newaxis, :]
  # Calibration data must be pre-normalized to 0.0 - 1.0 and stored as Float32
  normalized_img = (extend_batch_size_img / 255.0).astype(np.float32)
  print(
      f'{str(idx+1).zfill(2)}. normalized_img.shape: {normalized_img.shape}'
  ) # [1,112,200,3]
  # Append the normalized array, not the raw 0 - 255 image
  img_datas.append(normalized_img)
calib_datas = np.vstack(img_datas)
print(f'calib_datas.shape: {calib_datas.shape}') # [10,112,200,3]
np.save(file='data/calibdata.npy', arr=calib_datas)

loaded_data = np.load('data/calibdata.npy')
print(f'loaded_data.shape: {loaded_data.shape}') # [10,112,200,3]

"""
-qcind INPUT_NAME NUMPY_FILE_PATH MEAN STD
int8_calib_datas = (loaded_data - MEAN) / STD # approx. -2.1 - 2.6 with the MEAN/STD above

e.g.
-qcind pc_dep 'data/calibdata.npy' [[[[0.485, 0.456, 0.406]]]] [[[[0.229, 0.224, 0.225]]]]
"""

Simply stack the np.ndarray samples along the batch axis so that the result matches the shape of the model's input OP. The procedure is the same whether the data is an image or not.
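For the MEAN and STD arguments, one possible approach is to compute per-channel statistics from the stacked calibration array itself. The sketch below assumes channel-last data that is already scaled to 0 to 1. The helper name `channel_mean_std` and the random stand-in array are hypothetical, and whether the statistics should come from this data or from the values used in training is a decision for the model owner.

```python
import numpy as np

def channel_mean_std(calib):
    """Per-channel mean and std over every axis except the last (channel) axis."""
    axes = tuple(range(calib.ndim - 1))
    mean = calib.mean(axis=axes, keepdims=True)
    std = calib.std(axis=axes, keepdims=True)
    return mean.astype(np.float32), std.astype(np.float32)

# Hypothetical stand-in for the array stored in data/calibdata.npy
rng = np.random.default_rng(0)
calib = rng.random((10, 112, 200, 3), dtype=np.float32)

mean, std = channel_mean_std(calib)
print(mean.shape, std.shape)  # (1, 1, 1, 3) (1, 1, 1, 3)

# Nested-list text in the same form as the MEAN / STD arguments of -qcind
print(mean.tolist())
print(std.tolist())

# Same formula as the README: (input_value - mean) / std
normalized = (calib - mean) / std
```

With keepdims=True the results keep the [1,1,1,3] shape used in the example above, so they broadcast against the [n,112,200,3] data without reshaping.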


Thank you for your detailed explanations! We have converted the model to the fully quantized INT8 format. https://drive.google.com/file/d/1AG0azm668iqQTEDk9_Cip4tR8s9Swuga/view?usp=share_link

I can only judge the structure of the model, but it looks kind of good.

I am not sure how much the accuracy has degraded, but if it has deteriorated noticeably, please refer to the tutorial on INT8 quantization that I added to the README in the last couple of days.

https://github.com/PINTO0309/onnx2tf#6-if-the-accuracy-of-the-int8-quantized-model-degrades-significantly
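As a rough way to put a number on the degradation, the Float32 output and the dequantized INT8 output for the same input can be compared. The sketch below is NumPy-only and uses simulated data: the output shape, the scale and zero_point values, and the helper names are all assumptions. With a real model, the INT8 tensor and its quantization parameters would come from the TFLite interpreter instead. It relies on the affine mapping real = scale * (q - zero_point).

```python
import numpy as np

def dequantize(q, scale, zero_point):
    """Affine dequantization: real = scale * (q - zero_point)."""
    return scale * (q.astype(np.float32) - zero_point)

def quantize(x, scale, zero_point):
    """Simulated INT8 quantization, used here only to create test data."""
    q = np.round(x / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def compare(float_out, int8_out, scale, zero_point):
    """Error metrics between a Float32 output and a dequantized INT8 output."""
    deq = dequantize(int8_out, scale, zero_point)
    err = np.abs(float_out - deq)
    cos = np.dot(float_out.ravel(), deq.ravel()) / (
        np.linalg.norm(float_out) * np.linalg.norm(deq)
    )
    return {
        "max_abs_err": float(err.max()),
        "mean_abs_err": float(err.mean()),
        "cosine": float(cos),
    }

# Hypothetical Float32 output and made-up quantization parameters
rng = np.random.default_rng(0)
float_out = rng.uniform(-1.0, 1.0, size=(1, 112, 200, 3)).astype(np.float32)
scale, zero_point = 2.0 / 255.0, 0
int8_out = quantize(float_out, scale, zero_point)

report = compare(float_out, int8_out, scale, zero_point)
print(report)
```

A cosine similarity close to 1 and a maximum error near half the scale suggest that only rounding error is present. Much larger errors would point toward the calibration data or the MEAN/STD values.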

I will close this issue as the problem seems to be resolved.

PINTO0309 / onnx2tf