Can it support mixed precision quantization: one part with full int8 quantized and the rest with int16_act quantized

PINTO0309 / onnx2tf

Self-Created Tools to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). The purpose of this tool is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request.

MIT License

Can it support mixed precision quantization: one part with full int8 quantized and the rest with int16_act quantized #678

Closed. hayyaw closed this issue 3 months ago

hayyaw commented 3 months ago

Issue Type

Feature Request

OS

Linux

onnx2tf version number

1.25.7

onnx version number

1.16.1

onnxruntime version number

1.18.1

onnxsim (onnx_simplifier) version number

0.4.33

tensorflow version number

2.17.0

Download URL for ONNX

ask for high accuracy and high performance.

Parameter Replacement JSON

no

Description

Purpose: support mixed-precision quantization. For networks that need both high accuracy and high performance, can mixed-precision quantization be implemented, with one part of the model fully int8-quantized and the rest quantized with int16 activations?
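To make the request concrete, here is a minimal sketch in plain Python of what a per-part precision plan would mean numerically. It does not use onnx2tf or any TFLite API. The part names `backbone` and `head`, the sample activation values, and the helper functions are all hypothetical, and simple symmetric per-tensor quantization is assumed purely for illustration.

```python
def quantize_symmetric(values, num_bits):
    """Quantize floats to signed integers using one symmetric scale."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for int8, 32767 for int16
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / qmax
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map integer codes back to approximate float values."""
    return [q * scale for q in quantized]


def max_abs_error(values, num_bits):
    """Largest round-trip error for the given bit width."""
    quantized, scale = quantize_symmetric(values, num_bits)
    restored = dequantize(quantized, scale)
    return max(abs(a - b) for a, b in zip(values, restored))


# Hypothetical activation samples and a hypothetical per-part precision plan:
# the performance-critical part uses 8-bit activations, the
# accuracy-critical part uses 16-bit activations.
activations = [0.013, -0.42, 0.9991, -1.0, 0.3333, 0.00007]
plan = {"backbone": 8, "head": 16}

errors = {name: max_abs_error(activations, bits) for name, bits in plan.items()}

for name, err in errors.items():
    print(f"{name}: int{plan[name]} activations, max abs error = {err:.2e}")
```

The 16-bit part has a much smaller round-trip error for the same values, which is the accuracy benefit the request is after, while the 8-bit part keeps the cheaper representation where accuracy is less sensitive.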

github-actions[bot] commented 3 months ago

If there is no activity within the next two days, this issue will be closed automatically.