Closed dtcxzyw closed 3 months ago
Tests with non-constant input:
```llvm
define i1 @src1(float nofpclass(zero nan pinf pnorm psub) %a) {
  %x = call float @llvm.sqrt.f32(float %a)
  %y = bitcast float %x to i32
  %cmp = icmp slt i32 %y, 0
  ret i1 %cmp
}

define i1 @tgt1(float %a) {
  ret i1 true
}

define i1 @src2(float nofpclass(zero nan pinf pnorm psub) %a) {
  %x = call float @llvm.sqrt.f32(float %a)
  %y = bitcast float %x to i32
  %cmp = icmp slt i32 %y, 0
  ret i1 %cmp
}

define i1 @tgt2(float %a) {
  ret i1 false
}
```
I think Alive2 is correct per the latest consensus on NaN semantics. When a value is NaN, its sign bit is non-deterministic, which allows refinement to either a positive or a negative NaN. There was a long discussion here: https://discourse.llvm.org/t/stronger-floating-point-nan-guarantees/72165, which ended up in this PR: https://github.com/llvm/llvm-project/pull/66579/files
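To make the non-deterministic sign concrete, here is a small Python sketch (illustrative only, not tied to any LLVM API) showing that both the positive and the negative quiet-NaN bit patterns decode to NaN, so a signed comparison on the bitcast integer, like the `icmp slt i32 %y, 0` above, can go either way:

```python
import math
import struct

def f32_from_bits(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE-754 single (like 'bitcast i32 to float')."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

pos_qnan = f32_from_bits(0x7FC00000)  # quiet NaN, sign bit clear
neg_qnan = f32_from_bits(0xFFC00000)  # quiet NaN, sign bit set

# Both patterns are NaN; only the sign bit differs.
assert math.isnan(pos_qnan) and math.isnan(neg_qnan)

def sign_test(bits: int) -> bool:
    """The sign-bit check, phrased as a signed 32-bit comparison against 0."""
    return struct.unpack("<i", struct.pack("<I", bits))[0] < 0

print(sign_test(0x7FC00000), sign_test(0xFFC00000))  # False True
```

Since either bit pattern is a legal result for a NaN-producing operation, neither outcome of `sign_test` can be assumed.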
cc @jyknight @RalfJung
cc @arsenm
> But we cannot make assumptions about the sign bit of return values.
Indeed. Specifically, a programmer cannot make such assumptions. The compiler is allowed to make the sign go either way. (This explains why the programmer cannot make any assumptions.) And that's what Alive2 shows.
This is one of these cases where it is crucial to think of compilation in terms of refinement (the final program has fewer behaviors than the original program), not equivalence.
In particular, there is no guarantee (and no intent to guarantee) that the sign bit matches "whatever the hardware does". Such a guarantee would be very tricky to realize, since it would mean that const-folding of float operations has to know exactly which NaN is produced by the hardware we are currently compiling for.
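The refinement view can be sketched as a set-inclusion check: a transformation is correct if every behavior of the target is also a behavior of the source. This is a toy model (the function name and structure are mine, not Alive2's):

```python
import struct

def behaviors_of_sign_test_after_sqrt_of_negative() -> set:
    """Source program: %x = sqrt(negative input), %y = bitcast to i32,
    %cmp = icmp slt %y, 0. sqrt of a negative (non -0) input yields a NaN
    whose sign bit is non-deterministic, so the sign test can observe
    either outcome."""
    outcomes = set()
    for bits in (0x7FC00000, 0xFFC00000):  # quiet NaN with either sign bit
        as_i32 = struct.unpack("<i", struct.pack("<I", bits))[0]
        outcomes.add(as_i32 < 0)
    return outcomes

src = behaviors_of_sign_test_after_sqrt_of_negative()  # {True, False}

# Both 'ret i1 true' and 'ret i1 false' pick one behavior out of the
# source's set, so each is a valid refinement of the source, even though
# neither is equivalent to it (or to the other).
assert {True} <= src and {False} <= src
```

This is why `@tgt1` returning `true` and `@tgt2` returning `false` can both verify against the same source function.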
> Indeed. Specifically, a programmer cannot make such assumptions. The compiler is allowed to make the sign go either way. (This explains why the programmer cannot make any assumptions.) And that's what alive shows.
>
> This is one of these cases where it is crucial to think of compilation in terms of refinement (the final program has fewer behaviors than the original program), not equivalence.
If we allow LLVM to assume that the return value of `llvm.sqrt` is never negative (except for -0), we cannot refine a libcall to `sqrt` into an intrinsic call to `llvm.sqrt`. See the discussion in https://github.com/llvm/llvm-project/issues/92217#issuecomment-2111743008. @arsenm points out that we should follow the IEEE standard, which doesn't allow interpreting the sign bit of a NaN.
BTW, I don't know why Alive2 complains about the following refinement: https://alive2.llvm.org/ce/z/hHiosd
```llvm
define i1 @src1(float nofpclass(zero norm inf sub) %x) {
  %y = bitcast float %x to i32
  %cmp = icmp slt i32 %y, 0
  ret i1 %cmp
}

define i1 @tgt1(float %x) {
  ret i1 true
}
```
```
----------------------------------------
define i1 @src1(float nofpclass(1020) %x) {
#0:
  %y = bitcast float nofpclass(1020) %x to i32
  %cmp = icmp slt i32 %y, 0
  ret i1 %cmp
}
=>
define i1 @tgt1(float %x) {
#0:
  ret i1 1
}
Transformation doesn't verify!

ERROR: Value mismatch

Example:
float nofpclass(1020) %x = #x7f800200 (SNaN)

Source:
i32 %y = #x7f800200 (2139095552)
i1 %cmp = #x0 (0)

Target:
Source value: #x0 (0)
Target value: #x1 (1)
```
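For reference, Alive2's `nofpclass(1020)` is the bitmask form of `nofpclass(zero norm inf sub)`. A small decoder, using the `FPClassTest` bit values as documented in the LLVM LangRef (assumed here; double-check against your LLVM version):

```python
# FPClassTest bits per the LLVM LangRef description of the nofpclass attribute.
FP_CLASSES = {
    0x0001: "snan",  0x0002: "qnan",
    0x0004: "ninf",  0x0008: "nnorm", 0x0010: "nsub",  0x0020: "nzero",
    0x0040: "pzero", 0x0080: "psub",  0x0100: "pnorm", 0x0200: "pinf",
}

def decode_nofpclass(mask: int) -> list:
    """List the floating-point classes excluded by a nofpclass bitmask."""
    return [name for bit, name in sorted(FP_CLASSES.items()) if mask & bit]

# 1020 = 0x3FC excludes every class except the two NaN kinds, so the only
# values %x can take are NaNs, including positive ones like 0x7f800200.
# That is why refining the sign test to 'ret i1 true' is rejected.
print(decode_nofpclass(1020))
```

In other words, the counterexample is not spurious: the attribute forces `%x` to be a NaN, a NaN's sign bit may be clear, and `bitcast` preserves bits, so `%cmp` can legitimately be `false`.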
> In particular, there is no guarantee (and no intent to guarantee) that the sign bit matches "whatever the hardware does". Such a guarantee would be very tricky to realize, since it would mean that const-folding of float operations has to know exactly which NaN is produced by the hardware we are currently compiling for.
IIUC this constant folding is incorrect without `nnan`.
> If we allow LLVM to assume that the return value of `llvm.sqrt` is never negative (except for -0),
We don't allow LLVM to assume that. We allow LLVM to decide over the sign of any NaN produced by basic float operations (of which sqrt is one). That's not the same thing.
Sometimes LLVM makes that decision via const folding, sometimes it doesn't care and leaves the decision to hardware.
> > If we allow LLVM to assume that the return value of `llvm.sqrt` is never negative (except for -0),
>
> We don't allow LLVM to assume that. We allow LLVM to decide over the sign of any NaN produced by basic float operations (of which sqrt is one). That's not the same thing.
>
> Sometimes LLVM makes that decision via const folding, sometimes it doesn't care and leaves the decision to hardware.
I agree.
But the fundamental problem is how we interpret the semantics of a libcall to `sqrt`. Specifically, we use the target-specific intrinsic `_mm256_sqrt_ps` in https://github.com/llvm/llvm-project/issues/92217. If we allow this refinement, we cannot convert `_mm256_sqrt_ps` into `llvm.sqrt`.
AFAIK the vendor intrinsics for some other operations are also lowered to the corresponding generic LLVM operations, like basic per-lane arithmetic. So this is not specific to sqrt; the question is how closely the vendor intrinsics have to match the underlying hardware. Making them match 100% is not great for anyone who wants optimizations. After all, there is always the option of inline asm if you need an exact match with hardware semantics and are willing to pay the perf cost.
Vendor intrinsics will already not honor the FP environment; this is in the same broad class of questions. (And that's largely a clang question, not an alive/LLVM question, so this is the wrong issue tracker for that.)
Alive2: https://alive2.llvm.org/ce/z/g8marn
`sqrt` with a negative argument (except for -0) returns NaN. But we cannot make assumptions about the sign bit of return values.
See also https://github.com/llvm/llvm-project/issues/92217
cc @regehr @zhengyang92