microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai

openvino with int8 #20072


oldma3095 commented 3 months ago

Describe the issue

[screenshot attached]

How do I use the int8 model with the OpenVINO execution provider? Is it something like CPU_FP8 or GPU_FP8?

To reproduce

Run an int8-quantized model with the OpenVINO execution provider.

Urgency

No response

Platform

Linux

OS Version

Ubuntu 20 / Ubuntu 22

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.17.1

ONNX Runtime API

C

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

sfatimar commented 3 months ago

Just use CPU_FP32 or GPU_FP32; the execution provider will default to the precision of the model and its inputs, which here is int8.
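
For reference, a minimal sketch of that suggestion using the C API the reporter is on (ORT 1.17, legacy `OrtOpenVINOProviderOptions` struct). The model path is a placeholder, and error handling is omitted; each API call actually returns an `OrtStatus*` that real code should check:

```c
#include <onnxruntime_c_api.h>

int main(void) {
  const OrtApi* ort = OrtGetApiBase()->GetApi(ORT_API_VERSION);

  OrtEnv* env = NULL;
  ort->CreateEnv(ORT_LOGGING_LEVEL_WARNING, "ov_int8", &env);

  OrtSessionOptions* so = NULL;
  ort->CreateSessionOptions(&so);

  // Zero-initialize so the optional fields keep their defaults.
  OrtOpenVINOProviderOptions ov = {0};
  // Per the comment above: pick the device, not the model precision.
  // The int8 model stays int8; there is no CPU_INT8 / CPU_FP8 value.
  ov.device_type = "CPU_FP32";

  ort->SessionOptionsAppendExecutionProvider_OpenVINO(so, &ov);

  OrtSession* session = NULL;
  // "model_int8.onnx" is a placeholder for your quantized model.
  ort->CreateSession(env, "model_int8.onnx", so, &session);

  /* ... run inference ... */

  ort->ReleaseSession(session);
  ort->ReleaseSessionOptions(so);
  ort->ReleaseEnv(env);
  return 0;
}
```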

sfatimar commented 3 months ago

We are also deprecating CPU_FP32 in the next release; going forward there will be only CPU and GPU device types, with the inference precision passed as a separate provider option.
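
If that lands, session setup would presumably move to the generic provider-options entry point, roughly as sketched below. The `"device_type"` / `"precision"` key names and the `"OpenVINO"` provider name are assumptions inferred from this comment, not confirmed release documentation:

```c
// Assumed forward-compatible style: device and precision as separate
// string options via the generic C-API call (so is the OrtSessionOptions*
// from the sketch above).
const char* keys[]   = {"device_type", "precision"};
const char* values[] = {"CPU", "FP32"};
ort->SessionOptionsAppendExecutionProvider(so, "OpenVINO", keys, values, 2);
```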