openxla / stablehlo

Backward compatible ML compute opset inspired by HLO/MHLO
Apache License 2.0

Decide on mixed precision #369

Open sdasgup3 opened 2 years ago

sdasgup3 commented 2 years ago

Request description

Found a few disparities in inferring the return element type of operations involving reduction.

Let's first look at the constraints we have on the reducer function's type (from https://github.com/openxla/stablehlo/blob/8ec92006166d0602d0dca32e0267d169c2078e0d/stablehlo/dialect/TypeInference.cpp#L273 or https://github.com/tensorflow/tensorflow/blob/b362eaeb53cc21aa38f3eb1551500ced4fca0c97/tensorflow/compiler/xla/service/shape_inference.cc#L62; the former is inspired by the latter).

Consider typical reduce-* op syntax:

op(I(i), V(j)):
  block(BI(i), BV(j)):
      ... some computation ...
  return(R(i))

I(i)  : i-th input of op
V(j)  : j-th init-value of op
BI(i) : i-th input of reducer-function
BV(j) : j-th init-value of reducer-function
R(i)  : i-th return-type

The constraints that we verify are:

  1. Check that BI(i) and R(i) have the same shape and element type.
  2. Check that BV(j) and R(i) have the same shape and element type, ignoring fp precision.
  3. Check that V(j) and R(i) have the same shape and element type, ignoring fp precision.
  4. Check that I(i) and BV(j) have the same element type, ignoring fp precision.

Informally:
All elements in P = {BI(i), R(i)} have the same element type. All elements in Q = {I(i), V(j), BV(j)} have the same element type modulo fp precision. For a in P and b in Q, a and b have the same element type modulo fp precision.
Now the return type of the following ops is inferred as follows:

op                  Return element type inference
reduce              Same as element type of R(i) (hlo link, mhlo link)
reduce_window       Same as element type of V(i) (hlo link, mhlo link)
select_and_scatter  Same as element type of I(i) (hlo link, mhlo link)

From constraints 1-4, it seems that for all of these ops we can say the return element type is "the element type of I(i), ignoring fp precision".
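For example (a hypothetical mixed-precision case, reusing same_ignoring_fp_precision from the sketch above): with I = f32 and V = BV = R = f64, the three rules infer different concrete types, but all of them agree with I's element type modulo fp precision.

elem_I, elem_V, elem_R = "f32", "f64", "f64"   # hypothetical mixed-precision op

inferred = {
    "reduce": elem_R,              # same as element type of R(i) -> f64
    "reduce_window": elem_V,       # same as element type of V(i) -> f64
    "select_and_scatter": elem_I,  # same as element type of I(i) -> f32
}

# All three inferred types equal I's element type modulo fp precision.
assert all(same_ignoring_fp_precision(t, elem_I) for t in inferred.values())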

Should our spec use the same behavior as HLO w.r.t. ignoring precision?

sdasgup3 commented 2 years ago

Relevant discussion: https://github.com/openxla/stablehlo/pull/353#discussion_r1000066688

sdasgup3 commented 2 years ago

Here is the current resolution as per the above chat thread (in Eugene's words).

Since this is a cross-cutting concern, i.e. it applies to a bunch of ops - not just reduce and not limited to the ops that https://github.com/openxla/stablehlo/issues/369 is talking about - I propose that we don't address it in this PR, but instead write the spec as if mixed precision is not a thing and then address all the ops at once.

I will work on this next week.

burmako commented 2 years ago

Found a neat list of HLO opcodes which allow mixed precision: https://github.com/tensorflow/tensorflow/blob/1d69ba72834b963b72075a82c10959f6bb74e473/tensorflow/compiler/xla/service/hlo_verifier.cc#L1681-L1714.

burmako commented 1 year ago

There have been recent conversations about StableHLO having different behavior from HLO as far as mixed precision is concerned (StableHLO doesn't support it, HLO does). Furthermore, this has come up in the context of quantization: https://github.com/openxla/stablehlo/pull/1477/files#r1199398839. Moving this to Frontend Contract and raising priority.