openxla / stablehlo

Backward compatible ML compute opset inspired by HLO/MHLO
Apache License 2.0

Spec for `reduce_precision` is inconsistent #1375

Open reedwm opened 1 year ago

reedwm commented 1 year ago

Request description

The spec for reduce_precision states:

Semantics

Performs element-wise conversion of operand to another floating-point type that uses exponent_bits and mantissa_bits and back to the original floating-point type and produces an output tensor.

More formally:

  • The mantissa bits of the original value are updated to round the original value to the nearest value representable with mantissa_bits using roundToIntegralTiesToEven semantics.
  • Then, if mantissa_bits are smaller than the number of mantissa bits of the original value, the mantissa bits are truncated to mantissa_bits.
  • Then, if the exponent bits of the intermediate result don't fit into the range provided by exponent_bits, the intermediate result overflows to infinity using the original sign or underflows to zero using the original sign.

The initial description of what occurs (converting to another floating-point type and then converting back) is not equivalent to what is described in the "More formally" section (first the value is rounded based on reducing the mantissa bits, then overflow/underflow is checked based on the exponent bits). This is because rounding based on mantissa_bits can introduce an underflow to zero that a genuine round-trip conversion would avoid when exponent_bits exceeds the exponent width of the input type.
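
To make the two readings easier to compare, here is a minimal Python sketch of the "More formally" steps applied to a raw IEEE-like bit pattern. The helper and its parameter names are hypothetical; this is only a literal reading of the spec text, not the reference implementation:

```python
def reduce_precision_formal(bits, src_exp_bits, src_man_bits,
                            exponent_bits, mantissa_bits):
    """Literal reading of the "More formally" steps on the raw bit pattern
    of an IEEE-like float. Discussion sketch only; NaN/Inf inputs and
    subnormal handling in step 3 are left out for simplicity."""
    sign = bits & (1 << (src_exp_bits + src_man_bits))

    # Steps 1-2: round to the nearest value representable with
    # `mantissa_bits` (ties to even), then drop the truncated bits.
    if mantissa_bits < src_man_bits:
        drop = src_man_bits - mantissa_bits
        last_kept = 1 << drop
        rounding_bias = (last_kept >> 1) - 1 + ((bits >> drop) & 1)
        bits = (bits + rounding_bias) & ~(last_kept - 1)

    # Step 3: overflow/underflow against the reduced exponent range.
    # (The exact underflow threshold is itself open to interpretation;
    # it does not matter for the example in this issue.)
    src_bias = (1 << (src_exp_bits - 1)) - 1
    reduced_bias = (1 << (exponent_bits - 1)) - 1
    exponent = (bits >> src_man_bits) & ((1 << src_exp_bits) - 1)
    if exponent != 0:  # zero/subnormal inputs are left untouched here
        if exponent - src_bias > reduced_bias:
            bits = sign | (((1 << src_exp_bits) - 1) << src_man_bits)  # +/- Inf
        elif exponent - src_bias < -reduced_bias:
            bits = sign                                                # +/- 0
    return bits
```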

For example, given the f8e5m2 input 2**-16 (represented with bits 0 00000 01), if reduce_precision is called with exponent_bits=6, mantissa_bits=1, the output should still be 2**-16, since a hypothetical IEEE-like f8e6m1 type has a high enough dynamic range to represent that number. However, the "More formally" section would first round based on the number of mantissa bits, which rounds 2**-16 to 0 (ties to even). f8e5m2 is used here because the issue is easier to see with a narrow type, but it applies to all floating-point types.
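
Applying the hypothetical sketch above to this example reproduces the mismatch:

```python
# The f8e5m2 value 2**-16 has the bit pattern 0b0_00000_01
# (5 exponent bits, 2 mantissa bits).
out = reduce_precision_formal(0b0_00000_01, src_exp_bits=5, src_man_bits=2,
                              exponent_bits=6, mantissa_bits=1)
assert out == 0  # the mantissa rounding step already flushed the value to zero

# By contrast, a true round trip through a hypothetical IEEE-like f8e6m1
# (6 exponent bits, bias 31) represents 2**-16 as a normal number with a
# zero mantissa, so converting there and back to f8e5m2 returns 2**-16.
```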

CC @ghpvnist. Since the semantics of convert state that "the behavior is TBD" for float-to-float conversions, I would say the same here, because the semantics of reduce_precision are identical to that float-to-float case (at least when the destination float type has an Inf representation). Alternatively, simply remove the "More formally" part.

sdasgup3 commented 1 year ago

Thanks @reedwm for the issue. I am planning to fix this sometime in Q2'23. Per an offline follow-up, the resolution may be delayed a bit. Update: un-assigning for now.