intel / onnxruntime

ONNX Runtime: cross-platform, high performance scoring engine for ML models
MIT License

fix standalone dq conversion #387

Closed saurabhkale17 closed 2 months ago

saurabhkale17 commented 2 months ago

Description

This PR retains the standalone Dequantize layer when its output feeds an input of the 'Add' (supported) operator.

Motivation and Context

The standalone Dequantize layer after the Gather op was stripped out and replaced by an Identity node. The standalone Dequantize layer takes uint8 input and produces float32 output, and this float32 output is an input to the Add op.

The Add op expects both inputs to have the same data type. When an Identity op is inserted in place of the Dequantize layer, the required data type conversion no longer happens, leaving the PSQ1 model non-functional.

By retaining the Dequantize layer, we preserve the data type conversion, so both inputs to the Add operator are float32, which resolves the issue.
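The dtype mismatch above can be sketched in plain numpy. This is an illustrative model of DequantizeLinear's semantics, not the EP code: the `dequantize_linear` helper and the sample values are hypothetical, but they show why the Add downstream of Gather needs the Dequantize node kept (uint8 in, float32 out), and why replacing it with an Identity would hand Add a uint8 tensor next to a float32 one.

```python
import numpy as np

def dequantize_linear(x_u8, scale, zero_point):
    # ONNX DequantizeLinear semantics: y = (x - zero_point) * scale,
    # producing float32 regardless of the integer input type.
    return (x_u8.astype(np.int32) - np.int32(zero_point)).astype(np.float32) * np.float32(scale)

# uint8 tensor, standing in for the output of the Gather op
gathered = np.array([3, 130, 255], dtype=np.uint8)

# With the Dequantize layer retained: uint8 -> float32
deq = dequantize_linear(gathered, scale=0.1, zero_point=128)

# The other Add input is already float32, so the Add is well-typed
other = np.ones(3, dtype=np.float32)
result = deq + other

# If Dequantize were replaced by Identity, `gathered` would stay uint8
# and Add would see mismatched input types (uint8 vs float32).
```

Here both Add inputs end up float32, matching the behavior the PR restores.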

Open issues fixed with the PR: https://jira.devtools.intel.com/browse/EISW-128302 https://jira.devtools.intel.com/browse/EISW-127924

sfatimar commented 2 months ago

LGTM