openxla / stablehlo

Backward compatible ML compute opset inspired by HLO/MHLO

Interpreter support for quantized type #2388

Closed · sdasgup3 closed this 3 months ago

sdasgup3 commented 3 months ago

fixes https://github.com/openxla/stablehlo/issues/2373

The PR is rebased on top of https://github.com/openxla/stablehlo/pull/2383 and cherry-picks changes from https://github.com/openxla/stablehlo/pull/2384.

Directions for the reviewer

Please review the commit https://github.com/openxla/stablehlo/pull/2388/commits/4d7dc1ae715ba0f0fb3671404441df80902dadcd excluding the following files

sdasgup3 commented 3 months ago

> I think this approach is fine, but I thought the point of the reference interpreter was to show how the various operations work. Would it be much more onerous to implement support for uniform_quantize and uniform_dequantize directly?

That is a very good point! Even if we supported the uniform_{de}quantize operations directly, we would still need interpreter support for the other operations that accept quantized types, such as add on quantized tensors.
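For reference, implementing uniform_quantize and uniform_dequantize directly would, per element, come down to roughly the math below. This is a minimal C++ sketch, not the interpreter's actual API: the helper names and signatures are hypothetical, and the spec's round-nearest-even rounding is approximated here with std::lround.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// quantize (sketch): q = clamp(round(x / scale) + zero_point, qmin, qmax)
// Note: the StableHLO spec uses round-nearest-even; std::lround is an
// approximation for illustration only.
int8_t uniformQuantize(float x, float scale, int32_t zeroPoint,
                       int32_t qmin = -128, int32_t qmax = 127) {
  int32_t q = static_cast<int32_t>(std::lround(x / scale)) + zeroPoint;
  return static_cast<int8_t>(std::clamp(q, qmin, qmax));
}

// dequantize (sketch): x = (q - zero_point) * scale
float uniformDequantize(int8_t q, float scale, int32_t zeroPoint) {
  return (static_cast<int32_t>(q) - zeroPoint) * scale;
}
```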

The idea here is to handle all quantized operations uniformly. That means either:

(A) Support the execution semantics of all quantized operations (uniform_quantize, uniform_dequantize, and any other operation that supports quantized types) natively in the interpreter, OR

(B) Apply the pass uniformly to lower any quantized operation.

(B) is the fastest path towards evaluating quantized programs.
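To illustrate what (B) lowers a quantized op to, here is a hypothetical C++ sketch of the arithmetic a single int8 element of a quantized add reduces to after lowering: dequantize both operands into the expressed type (f32), add, then requantize into the result's scale and zero point. The actual pass rewrites the StableHLO IR rather than calling element-wise helpers like this; the function below is purely illustrative.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Sketch of one element of a quantized add after lowering to ordinary math:
// dequantize -> add in f32 -> requantize. Names and signature are hypothetical.
int8_t quantizedAddElement(int8_t lhs, float lhsScale, int32_t lhsZp,
                           int8_t rhs, float rhsScale, int32_t rhsZp,
                           float resScale, int32_t resZp) {
  float sum = (lhs - lhsZp) * lhsScale + (rhs - rhsZp) * rhsScale;
  int32_t q = static_cast<int32_t>(std::lround(sum / resScale)) + resZp;
  return static_cast<int8_t>(std::clamp<int32_t>(q, -128, 127));
}
```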