Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

Disabling fp-contraction doesn't work. [#pragma clang fp contract(off)] #38652

Open Quuxplusone opened 5 years ago

Quuxplusone commented 5 years ago
Bugzilla Link PR39679
Status NEW
Importance P enhancement
Reported by Guille D. Canas (guille@berkeley.edu)
Reported on 2018-11-15 12:43:12 -0800
Last modified on 2021-10-26 08:54:36 -0700
Version trunk
Hardware PC All
CC andrew.kaylor@intel.com, blitzrakete@gmail.com, dgregor@apple.com, erik.pilkington@gmail.com, fj8765ah@aa.jp.fujitsu.com, hfinkel@anl.gov, llvm-bugs@lists.llvm.org, richard-llvm@metafoo.co.uk, ueno.masakazu@jp.fujitsu.com, yuanfang.chen@sony.com
Fixed by commit(s)
Attachments
Blocks
Blocked by
See also
Trying to prevent contracting (*,+) into an FMA operation (to guarantee
cross_product(a, a) == 0).
I add '#pragma clang fp contract(off)' to the start of a block, compiler
accepts the pragma, but still generates an FMA instruction.

Example at: godbolt.org/z/Cg8rYi

Thanks.
Quuxplusone commented 5 years ago

If someone can point me to the file(s) that handle this pragma I could try to see if I can find the bug. I'm not familiar with the code base but I can give it a shot.

Quuxplusone commented 5 years ago
I looked in the code. Looks like an easy to correct error.
File 'lib/CodeGen/BackendUtil.cpp:420' in the clang source seems to contain the
error:

[...]
  case LangOptions::FPC_Off:
    // Preserve any contraction performed by the front-end.  (Strict performs
    // splitting of the muladd instrinsic in the backend.)
    Options.AllowFPOpFusion = llvm::FPOpFusion::Standard; <-----
    break;
  case LangOptions::FPC_On:
    Options.AllowFPOpFusion = llvm::FPOpFusion::Standard;
    break;
  case LangOptions::FPC_Fast:
    Options.AllowFPOpFusion = llvm::FPOpFusion::Fast;
    break;
[...]

It seems line 420:
'llvm:FPOpFusion::Standard' should be -> 'llvm::FPOpFusion::Strict'.

According to the llvm source (include/llvm/CodeGen/CommandFlags.inc:180),
FPOpFusion::Strict corresponds to no fp-contraction)

If someone can verify, this should be a trivial fix.
Quuxplusone commented 5 years ago

Sadly the above modification (alone) did not fix the reported bug.

Quuxplusone commented 5 years ago
I think that there are the following two problems when judging whether to
perform FMA conversion.
(1) Even if there is pragma, it is judged by the translation option.
(2) There is an error in the process of setting pragma information.

For example, in the following program, I think it is correct that FMA
conversion is not performed.
Is this right?

$cat a.c
float foo(float b, float c, float d) {
#pragma clang fp contract(off)
   return b * c + d;
}

Assuming it is correct, I will attach my patch below.

------------------------------------------------------

diff --git a/clang/lib/CodeGen/CGExprScalar.cpp
b/clang/lib/CodeGen/CGExprScalar.cpp
index 3d082de..37ffc63 100644
--- a/clang/lib/CodeGen/CGExprScalar.cpp
+++ b/clang/lib/CodeGen/CGExprScalar.cpp
@@ -221,7 +221,7 @@ static Value *propagateFMFlags(Value *V, const BinOpInfo
&Op) {
   if (auto *I = dyn_cast<llvm::Instruction>(V)) {
     llvm::FastMathFlags FMF = I->getFastMathFlags();
     updateFastMathFlags(FMF, Op.FPFeatures);
-    I->setFastMathFlags(FMF);
+    I->copyFastMathFlags(FMF);
   }
   return V;
 }
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index b8af674..ffc7441 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -11258,6 +11258,11 @@ static bool isContractable(SDNode *N) {
   return F.hasAllowContract() || F.hasAllowReassociation();  }

+static bool isContractableOnly(SDNode *N) {
+  SDNodeFlags F = N->getFlags();
+  return F.hasAllowContract();
+}
+
 /// Try to perform FMA combining on a given FADD node.
 SDValue DAGCombiner::visitFADDForFMACombine(SDNode *N) {
   SDValue N0 = N->getOperand(0);
@@ -11287,6 +11292,9 @@ SDValue DAGCombiner::visitFADDForFMACombine(SDNode *N) {
   if (!AllowFusionGlobally && !isContractable(N))
     return SDValue();

+  if (!isContractableOnly(N))
+    return SDValue();
+
   const SelectionDAGTargetInfo *STI = DAG.getSubtarget().getSelectionDAGInfo();
   if (STI && STI->generateFMAsInMachineCombiner(OptLevel))
     return SDValue();
@@ -11499,6 +11507,9 @@ SDValue DAGCombiner::visitFSUBForFMACombine(SDNode *N) {
   if (!AllowFusionGlobally && !isContractable(N))
     return SDValue();

+  if (!isContractableOnly(N))
+    return SDValue();
+
   const SelectionDAGTargetInfo *STI = DAG.getSubtarget().getSelectionDAGInfo();
   if (STI && STI->generateFMAsInMachineCombiner(OptLevel))
     return SDValue();
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
index 1e089ff..5f5d371 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp
@@ -3466,9 +3466,7 @@ static bool isCombineInstrCandidateFP(const MachineInstr
&Inst) {
   case AArch64::FSUBv2f32:
   case AArch64::FSUBv2f64:
   case AArch64::FSUBv4f32:
-    TargetOptions Options = Inst.getParent()->getParent()->getTarget().Options;
-    return (Options.UnsafeFPMath ||
-            Options.AllowFPOpFusion == FPOpFusion::Fast);
+    return Inst.getFlag(MachineInstr::FmContract);
   }
   return false;
 }

------------------------------------------------------

The result of applying the above patch is shown below.

Specification option:
clang [-O1|-O2|-O3|-Ofast] (-ffp-contract={off|on|fast}) a.c

Before correction:
+--------+---------------------------------------------+
|        |               -ffp-contract=                |
+--------+---------------+---------+---------+---------+
|        | Not specified |   off   |    on   |   fast  |
+--------+---------------+---------+---------+---------+
| -O1    |    not fma    | not fma | not fma |   fma   |
+--------+---------------+---------+---------+---------+
| -O2    |    not fma    | not fma | not fma |   fma   |
+--------+---------------+---------+---------+---------+
| -O3    |    not fma    | not fma | not fma |   fma   |
+--------+---------------+---------+---------+---------+
| -Ofast |      fma      |   fma   |   fma   |   fma   |
+--------+---------------+---------+---------+---------+

After applying my patch:
+--------+---------------------------------------------+
|        |               -ffp-contract=                |
+--------+---------------+---------+---------+---------+
|        | Not specified |   off   |    on   |   fast  |
+--------+---------------+---------+---------+---------+
| -O1    |    not fma    | not fma | not fma | not fma |
+--------+---------------+---------+---------+---------+
| -O2    |    not fma    | not fma | not fma | not fma |
+--------+---------------+---------+---------+---------+
| -O3    |    not fma    | not fma | not fma | not fma |
+--------+---------------+---------+---------+---------+
| -Ofast |    not fma    | not fma | not fma | not fma |
+--------+---------------+---------+---------+---------+