zccyman opened this issue 3 months ago
@zccyman, you can set the calibration method to CalibrationMethod.Distribution
That's right, but I think I've found a bug with MaxPool during FP8 quantization. Here is the error I get:
benchmarking quant model...
Traceback (most recent call last):
File "/home/developer/workspace/code/onnxruntime/examples/only_one_conv/run.py", line 110, in <module>
main()
File "/home/developer/workspace/code/onnxruntime/examples/only_one_conv/run.py", line 106, in main
benchmark(output_model_path)
File "/home/developer/workspace/code/onnxruntime/examples/only_one_conv/run.py", line 15, in benchmark
session = onnxruntime.InferenceSession(model_path)
File "/root/miniconda3/envs/ort/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
self._create_inference_session(providers, provider_options, disabled_optimizers)
File "/root/miniconda3/envs/ort/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 483, in _create_inference_session
sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidGraph: [ONNXRuntimeError] : 10 : INVALID_GRAPH : This is an invalid model. Type Error: Type 'tensor(float8e4m3fn)' of input parameter (input_QuantizeLinear_Output) of operator (MaxPool) in node (/MaxPool) is invalid.
Description:
I set both the weight and activation types to QuantType.QFLOAT8E4M3FN when calling quantize_static, but I get the errors shown above.
So I think FP8 calibration and quantization are not supported so far?