Open bharadwajymg opened 9 months ago
Hi, we are trying to quantize our ONNX models to INT8 to run on CPU, following https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html#quantization-on-gpu

We are using dynamic quantization and relying on the AVX2 and AVX512 extensions. When we tested our models with ONNX Runtime directly we saw an improvement, so we are cross-checking: does this backend support INT8 quantized models when we define the backend directly in config.pbtxt?
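For reference, this is roughly the dynamic quantization path we followed, as a minimal sketch using ONNX Runtime's `quantize_dynamic` API (the model file names here are placeholders):

```python
from onnxruntime.quantization import quantize_dynamic, QuantType

# Convert FP32 weights to INT8; activations are quantized
# dynamically at runtime, so no calibration data is needed.
quantize_dynamic(
    model_input="model_fp32.onnx",   # placeholder path
    model_output="model_int8.onnx",  # placeholder path
    weight_type=QuantType.QInt8,
)
```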
Yes, the onnxruntime backend supports INT8 on CPU.
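For example, a minimal config.pbtxt that pins the quantized model to the onnxruntime backend might look like the sketch below (the model name, tensor names, and dimensions are placeholders; adjust them to match your model):

```
name: "my_int8_model"
backend: "onnxruntime"
max_batch_size: 8
input [
  {
    name: "input"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 1000 ]
  }
]
```

Note that the inputs and outputs of a dynamically quantized model typically stay FP32; the INT8 kernels are used internally, so whether AVX2/AVX512 VNNI paths are taken depends on the CPU the server runs on.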