microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
https://onnxruntime.ai
MIT License

ORT returns incorrect result for UINT8 Matmul on specific CPU #19109

Open. arui-yyz opened this issue 9 months ago

arui-yyz commented 9 months ago

Describe the issue

The ONNX Runtime CPU execution provider returns an incorrect result for a UINT8-quantized model (containing a single MatMul, shape (1,4) @ shape (4,1)) in the following environment: onnx==1.14, onnxruntime==1.16, protobuf==4.24.4.

Passing on CPU: AMD Ryzen 9 7900X 12-Core Processor; correct output is 0.22868575.
Failing on CPU: AMD Ryzen Threadripper 2950X 16-Core Processor; incorrect output is -0.44277453.
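To check which of the two results is numerically correct, independently of ONNX Runtime's optimized CPU kernels, the model can also be run through the pure-Python reference evaluator that ships with onnx >= 1.13. A minimal sketch, assuming the .onnx filename inside the attached archive; the input matches the repro script below:

import numpy as np
from onnx.reference import ReferenceEvaluator

# Evaluate the model with onnx's pure-Python reference implementation,
# which bypasses the MLAS kernels used by ORT's CPU execution provider.
ref = ReferenceEvaluator("mm_no_bias_uint8.onnx")  # filename inside the attached archive is assumed
input_data = np.array([[0.6541, 0.4707, 0.2821, 0.5569]], dtype=np.float32)
print("reference output:", ref.run(None, {ref.input_names[0]: input_data}))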

To reproduce

Onnx file: mm_no_bias_uint8.tar.gz

Script to repro:

import argparse

import numpy as np
import onnxruntime as ort

# get the input model path from the command-line arguments
parser = argparse.ArgumentParser()
parser.add_argument('--input_model_path', type=str, help='path to the input model')
args = parser.parse_args()

# load the model into an ONNX Runtime session on the default CPU execution provider
input_model_path = args.input_model_path
ort_session = ort.InferenceSession(input_model_path, providers=["CPUExecutionProvider"])

# run a single inference on a fixed float32 input
input_name = ort_session.get_inputs()[0].name
input_data = np.array([[0.6541, 0.4707, 0.2821, 0.5569]], dtype=np.float32)
quantized_output = ort_session.run(None, {input_name: input_data})
print("output: ", quantized_output)
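After extracting the attached archive, the script (saved as, e.g., repro.py; the .onnx filename is assumed) can be invoked as:

python repro.py --input_model_path mm_no_bias_uint8.onnx

Since the wrong result only appears on one CPU family, it may also help to check whether ORT's graph optimizations (which can fuse the quantized MatMul pattern into a different kernel) change the output. A minimal diagnostic sketch, not part of the original report:

import numpy as np
import onnxruntime as ort

# Re-run the same model with all graph optimizations disabled; if the
# result changes, a fused/optimized quantized kernel is the likely culprit.
sess_options = ort.SessionOptions()
sess_options.graph_optimization_level = ort.GraphOptimizationLevel.ORT_DISABLE_ALL
session = ort.InferenceSession("mm_no_bias_uint8.onnx", sess_options,  # assumed filename
                               providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
input_data = np.array([[0.6541, 0.4707, 0.2821, 0.5569]], dtype=np.float32)
print("output (optimizations off):", session.run(None, {input_name: input_data}))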

Urgency

Customer release is blocked by this issue.

Platform

Linux

OS Version

Ubuntu 20.04.6 LTS (Focal Fossa)

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.16

ONNX Runtime API

Python

Architecture

X86

Execution Provider

Default CPU

Execution Provider Library Version

No response

github-actions[bot] commented 7 months ago

This issue has been automatically marked as stale due to inactivity and will be closed in 30 days if no further activity occurs. If further support is needed, please provide an update and/or more details.