Open llvmbot opened 6 years ago
mentioned in issue llvm/llvm-bugzilla-archive#36879
I haven't looked at LTO, but please do post any analysis that you've done. Preferably, open a new bug report for that, so we can track it separately from the remaining known issue here.
I opened new issue: llvm/llvm-bugzilla-archive#36879 I will post my observations in this tracker.
I found that it only works for non-LTO builds. For LTO build issue still exist. ... I debugged this issue and found the root cause. I can share the details if someone is interested, however I don’t know how it can be fixed immediately.
I haven't looked at LTO, but please do post any analysis that you've done. Preferably, open a new bug report for that, so we can track it separately from the remaining known issue here.
I found that it only works for non-LTO builds. For LTO build issue still exist.
$ cat a.c
int main() { return exp(1.0); }
1) Non-LTO build (notice call to '__exp_finite'): $ clang -O0 -ffast-math a.c –lm $ objdump -d a.out | grep "call.*exp" 400567: e8 c4 fe ff ff callq 400430 __exp_finite@plt
2) LTO build (notice call to 'exp'): $ clang -O0 -flto -ffast-math a.c -lm $ objdump -d a.out | grep "call.*exp" 400547: e8 b4 fe ff ff callq 400400 exp@plt
We can see that if we add ‘-flto’ we stop generating the call to finite version of ‘exp’ function, although we could, because ‘-ffast-math’ is provided.
I debugged this issue and found the root cause. I can share the details if someone is interested, however I don’t know how it can be fixed immediately.
For the record, Chris replied via email that we are getting the expected *_finite output now. But let's leave this open to track the clang bug that's still out there.
Chris - can you verify that you get the expected end result in your real code after: https://reviews.llvm.org/rL322087 ?
There's still a clang bug here from what I can see:
// This is not the exp() you were looking for... extern double exp (double) asm ("" "explode") attribute ((nothrow ));
int main() { double i = 0.0; double r; for (i = 0.0; i < 1.0; i += 0.1) { r = exp(i); printf("%g: exp = %g\n", i, r); } return 0; }
- At DAG legalization time. See SelectionDAGLegalize::ConvertNodeToLibcall(). Note that we've lost fast-math-flags on the DAG nodes at this point, so we should try to make progress on https://reviews.llvm.org/D37686 (not sure if anything is actually holding that up at this point).
I think (3) is right. I agree, we should make progress on the SDAG patch.
Here's a start on that fix by using the function attributes for now: https://reviews.llvm.org/D41338
Note: I won't be able to iterate on that patch for about a week at least. If we need this fixed immediately, please commandeer.
I've been looking at how the math intrinsics are handled in the backend, and it's not clear to me where we should do the translation to the 'finite' variants. This may also relate to the proposed fix for saving/restoring errno from D28335.
Potential options:
In IR as part of a pre-DAG IR codegen pass. A candidate for extension/renaming is "partially-inline-libcalls".
At DAG creation time in SelectionDAGBuilder::visitIntrinsicCall(). See for example the transforms that occur for a limited precision target such as in expandExp().
At DAG legalization time. See SelectionDAGLegalize::ConvertNodeToLibcall(). Note that we've lost fast-math-flags on the DAG nodes at this point, so we should try to make progress on https://reviews.llvm.org/D37686 (not sure if anything is actually holding that up at this point).
I think (3) is right. I agree, we should make progress on the SDAG patch.
I've been looking at how the math intrinsics are handled in the backend, and it's not clear to me where we should do the translation to the 'finite' variants. This may also relate to the proposed fix for saving/restoring errno from D28335.
Potential options:
In IR as part of a pre-DAG IR codegen pass. A candidate for extension/renaming is "partially-inline-libcalls".
At DAG creation time in SelectionDAGBuilder::visitIntrinsicCall(). See for example the transforms that occur for a limited precision target such as in expandExp().
At DAG legalization time. See SelectionDAGLegalize::ConvertNodeToLibcall(). Note that we've lost fast-math-flags on the DAG nodes at this point, so we should try to make progress on https://reviews.llvm.org/D37686 (not sure if anything is actually holding that up at this point).
We do manage to constant fold the whole thing at -O2...so at least that works...I hope. :)
For reference, the commit was: https://reviews.llvm.org/rL319593
And that was part of a sequence of commits listed in: https://reviews.llvm.org/D28335 ...which describes the larger problem that I was trying to fix.
I thought we had some LibCallSimplifier support for this, so let me look at that first.
Hrmmm. Yes. Also, we should be generating calls to __exp_finite with a *-gnu triple for call fast double @llvm.exp.f64. Both are true (i.e., Clang should check the asm name, but it should still generate the intrinsic in this case because LLVM should do the right thing in this case too).
Extended Description
There was a behavior change in how the math headers are processed following the checkin of "[CodeGen] convert math libcalls/builtins to equivalent LLVM intrinsics" on Dec 1, 2017 which looks like it may not have been intended. It appears the conversion to intrinsics is taking precedence over the processing of asm specifiers.
include
include
// With -ffast-math, header defines the following line: //extern double exp (double) asm ("" "__exp_finite") attribute ((nothrow ));
int main() { double i = 0.0; double r; for (i = 0.0; i < 1.0; i += 0.1) { r = exp(i); printf("%g: exp = %g\n", i, r); } return 0; }
With this code, when building with -ffast-math on Linux, the following used to create this for the 'exp' call: %9 = call fast double @__exp_finite(double %8) #3
After the commit, it is generating: %2 = call fast double @llvm.exp.f64(double %1)
More detailed discussion is in the attachment.