llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.14k stars 12.02k forks source link

"[CodeGen] convert math libcalls/builtins to equivalent LLVM intrinsics" change overriding "__asm__ specifier #35020

Open llvmbot opened 6 years ago

llvmbot commented 6 years ago
Bugzilla Link 35672
Version trunk
OS Linux
Attachments email conversion with change author
Reporter LLVM Bugzilla Contributor
CC @dendibakh,@efriedma-quic,@hfinkel,@RKSimon,@rotateright

Extended Description

There was a behavior change in how the math headers are processed following the checkin of "[CodeGen] convert math libcalls/builtins to equivalent LLVM intrinsics" on Dec 1, 2017 which looks like it may not have been intended. It appears the conversion to intrinsics is taking precedence over the processing of asm specifiers.

include

include

// With -ffast-math, header defines the following line: //extern double exp (double) asm ("" "__exp_finite") attribute ((nothrow ));

int main() { double i = 0.0; double r; for (i = 0.0; i < 1.0; i += 0.1) { r = exp(i); printf("%g: exp = %g\n", i, r); } return 0; }

With this code, when building with -ffast-math on Linux, the following used to create this for the 'exp' call: %9 = call fast double @​__exp_finite(double %8) #​3

After the commit, it is generating: %2 = call fast double @​llvm.exp.f64(double %1)

More detailed discussion is in the attachment.

dendibakh commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#36879

dendibakh commented 6 years ago

I haven't looked at LTO, but please do post any analysis that you've done. Preferably, open a new bug report for that, so we can track it separately from the remaining known issue here.

I opened new issue: llvm/llvm-bugzilla-archive#36879 I will post my observations in this tracker.

rotateright commented 6 years ago

I found that it only works for non-LTO builds. For LTO build issue still exist. ... I debugged this issue and found the root cause. I can share the details if someone is interested, however I don’t know how it can be fixed immediately.

I haven't looked at LTO, but please do post any analysis that you've done. Preferably, open a new bug report for that, so we can track it separately from the remaining known issue here.

dendibakh commented 6 years ago

I found that it only works for non-LTO builds. For LTO build issue still exist.

$ cat a.c

include

int main() { return exp(1.0); }

1) Non-LTO build (notice call to '__exp_finite'): $ clang -O0 -ffast-math a.c –lm $ objdump -d a.out | grep "call.*exp" 400567: e8 c4 fe ff ff callq 400430 __exp_finite@plt

2) LTO build (notice call to 'exp'): $ clang -O0 -flto -ffast-math a.c -lm $ objdump -d a.out | grep "call.*exp" 400547: e8 b4 fe ff ff callq 400400 exp@plt

We can see that if we add ‘-flto’ we stop generating the call to finite version of ‘exp’ function, although we could, because ‘-ffast-math’ is provided.

I debugged this issue and found the root cause. I can share the details if someone is interested, however I don’t know how it can be fixed immediately.

rotateright commented 6 years ago

For the record, Chris replied via email that we are getting the expected *_finite output now. But let's leave this open to track the clang bug that's still out there.

rotateright commented 6 years ago

Chris - can you verify that you get the expected end result in your real code after: https://reviews.llvm.org/rL322087 ?

There's still a clang bug here from what I can see:

include

// This is not the exp() you were looking for... extern double exp (double) asm ("" "explode") attribute ((nothrow ));

int main() { double i = 0.0; double r; for (i = 0.0; i < 1.0; i += 0.1) { r = exp(i); printf("%g: exp = %g\n", i, r); } return 0; }

rotateright commented 6 years ago
  1. At DAG legalization time. See SelectionDAGLegalize::ConvertNodeToLibcall(). Note that we've lost fast-math-flags on the DAG nodes at this point, so we should try to make progress on https://reviews.llvm.org/D37686 (not sure if anything is actually holding that up at this point).

I think (3) is right. I agree, we should make progress on the SDAG patch.

Here's a start on that fix by using the function attributes for now: https://reviews.llvm.org/D41338

Note: I won't be able to iterate on that patch for about a week at least. If we need this fixed immediately, please commandeer.

hfinkel commented 6 years ago

I've been looking at how the math intrinsics are handled in the backend, and it's not clear to me where we should do the translation to the 'finite' variants. This may also relate to the proposed fix for saving/restoring errno from D28335.

Potential options:

  1. In IR as part of a pre-DAG IR codegen pass. A candidate for extension/renaming is "partially-inline-libcalls".

  2. At DAG creation time in SelectionDAGBuilder::visitIntrinsicCall(). See for example the transforms that occur for a limited precision target such as in expandExp().

  3. At DAG legalization time. See SelectionDAGLegalize::ConvertNodeToLibcall(). Note that we've lost fast-math-flags on the DAG nodes at this point, so we should try to make progress on https://reviews.llvm.org/D37686 (not sure if anything is actually holding that up at this point).

I think (3) is right. I agree, we should make progress on the SDAG patch.

rotateright commented 6 years ago

I've been looking at how the math intrinsics are handled in the backend, and it's not clear to me where we should do the translation to the 'finite' variants. This may also relate to the proposed fix for saving/restoring errno from D28335.

Potential options:

  1. In IR as part of a pre-DAG IR codegen pass. A candidate for extension/renaming is "partially-inline-libcalls".

  2. At DAG creation time in SelectionDAGBuilder::visitIntrinsicCall(). See for example the transforms that occur for a limited precision target such as in expandExp().

  3. At DAG legalization time. See SelectionDAGLegalize::ConvertNodeToLibcall(). Note that we've lost fast-math-flags on the DAG nodes at this point, so we should try to make progress on https://reviews.llvm.org/D37686 (not sure if anything is actually holding that up at this point).

rotateright commented 6 years ago

We do manage to constant fold the whole thing at -O2...so at least that works...I hope. :)

rotateright commented 6 years ago

For reference, the commit was: https://reviews.llvm.org/rL319593

And that was part of a sequence of commits listed in: https://reviews.llvm.org/D28335 ...which describes the larger problem that I was trying to fix.

I thought we had some LibCallSimplifier support for this, so let me look at that first.

hfinkel commented 6 years ago

Hrmmm. Yes. Also, we should be generating calls to __exp_finite with a *-gnu triple for call fast double @​llvm.exp.f64. Both are true (i.e., Clang should check the asm name, but it should still generate the intrinsic in this case because LLVM should do the right thing in this case too).