Closed: SahilPatidar closed this pull request 4 months ago.
@dtcxzyw, please check this out before making it ready for review.
I finished some dirty work. Does it look alright?
@jayfoad, how can I check for denormals being flushed?
See earlier comments. @arsenm said:
`F.getDenormalType(fltSemantics) == DenormalMode::getIEEE()` works, but is pretty ugly.
@jayfoad, I was looking for `F.getDenormalType` but couldn't find it. Instead, I tried using `F->getDenormalMode(LHSI->getType()->getScalarType()->getFltSemantics()) == DenormalMode::getIEEE()`, but I haven't seen any difference in the tests after making the change.
You would need to take a test where your optimization does apply and then make a copy of it with an attribute like this, and show that the optimization does not apply to it:
```llvm
define float @foo(...) "denormal-fp-math"="dynamic,dynamic" {
...
}
```
See the LangRef for the allowed values of the `denormal-fp-math` attribute. The default is `ieee,ieee`, and if you specify anything else it should block your optimization.
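For intuition on why any mode other than `ieee,ieee` has to block a fold like this, here is a hedged sketch; the fcmp-of-fsub shape is my assumption about the pattern being folded, not taken verbatim from the patch:

```llvm
; Illustrative only. If the target flushes denormal outputs, %sub can be
; flushed to +0.0 even when %x and %y differ only by a subnormal amount, so
;   fcmp olt (fsub %x, %y), 0.0   yields false, while the folded form
;   fcmp olt %x, %y               yields true.
; Under "denormal-fp-math"="ieee,ieee" the two forms agree.
define i1 @denormal_hazard(float %x, float %y) {
  %sub = fsub float %x, %y
  %cmp = fcmp olt float %sub, 0.000000e+00
  ret i1 %cmp
}
```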
@jayfoad, I tried setting the `denormal-fp-math` attribute to various options (`positive-zero`, `dynamic,dynamic`, and `preserve-sign`) as you suggested. It seems to be preventing the optimization, as you mentioned.
You need to add these attributes to some test functions.
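One of those negative tests could look something like this (a sketch; the function name and body are illustrative, and the fcmp-of-fsub shape is assumed from the discussion rather than copied from the patch):

```llvm
; Negative test sketch: with a non-IEEE denormal mode, the compare of the
; fsub result against zero must not be folded to a direct fcmp of %x and %y.
define i1 @no_fold_dynamic_denormals(float %x, float %y) "denormal-fp-math"="dynamic,dynamic" {
  %sub = fsub float %x, %y
  %cmp = fcmp olt float %sub, 0.000000e+00
  ret i1 %cmp
}
```

Variants with `"preserve-sign,preserve-sign"` and `"positive-zero,positive-zero"` would exercise the other modes you tried.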
@arsenm Is there any further comment?
LGTM. I see "This pull request is still a work in progress. Draft pull requests cannot be merged."
LGTM with some nits; it also has some conflicts. It should be converted to a non-draft PR.
@llvm/pr-subscribers-llvm-transforms
Author: None (SahilPatidar)
:white_check_mark: With the latest revision this PR passed the C/C++ code formatter.
Transforms/PhaseOrdering/X86/vector-reductions.ll fails precheck. It does have some fcmp + fsub patterns, so this is likely a real failure.
After updating the test, it is breaking `tail call fast <4 x float> @llvm.fabs.v4f32(<4 x float> [[TMP2]])`.
```llvm
define i32 @TestVectorsEqualFP(ptr noalias %Vec0, ptr noalias %Vec1, float %Tole
; CHECK-NEXT: [[TMP0:%.*]] = load <4 x float>, ptr [[VEC0:%.*]], align 4
; CHECK-NEXT: [[TMP1:%.*]] = load <4 x float>, ptr [[VEC1:%.*]], align 4
; CHECK-NEXT: [[TMP2:%.*]] = fsub fast <4 x float> [[TMP0]], [[TMP1]]
-; CHECK-NEXT: [[TMP3:%.*]] = tail call fast <4 x float> @llvm.fabs.v4f32(<4 x float> [[TMP2]])
-; CHECK-NEXT: [[TMP4:%.*]] = tail call fast float @llvm.vector.reduce.fadd.v4f32(float -0.000000e+00, <4 x float> [[TMP3]])
-; CHECK-NEXT: [[CMP4:%.*]] = fcmp fast ole float [[TMP4]], [[TOLERANCE:%.*]]
+; CHECK-NEXT: [[TMP3:%.*]] = fcmp fast oge <4 x float> [[TMP0]], [[TMP1]]
+; CHECK-NEXT: [[TMP4:%.*]] = fneg fast <4 x float> [[TMP2]]
+; CHECK-NEXT: [[TMP5:%.*]] = select <4 x i1> [[TMP3]], <4 x float> [[TMP2]], <4 x float> [[TMP4]]
+; CHECK-NEXT: [[TMP6:%.*]] = tail call fast float @llvm.vector.reduce.fadd.v4f32(float -0.000000e+00, <4 x float> [[TMP5]])
+; CHECK-NEXT: [[CMP4:%.*]] = fcmp fast ole float [[TMP6]], [[TOLERANCE:%.*]]
; CHECK-NEXT: [[COND5:%.*]] = zext i1 [[CMP4]] to i32
; CHECK-NEXT: ret i32 [[COND5]]
;
```
This regression looks like it's because you are missing a hasOneUse check on the fsub. You should add some multi-use negative tests.
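A multi-use negative test might look like this (a sketch; names are illustrative, and it mirrors the fabs idiom visible in the regressed phase-ordering output, where the fsub feeds both the fcmp and a select arm):

```llvm
; %sub has uses besides the fcmp (the fneg and the select), so with a
; hasOneUse check the compare should be left alone, allowing the whole
; pattern to still be canonicalized to @llvm.fabs.f32.
define float @fsub_multi_use_negative(float %x, float %y) {
  %sub = fsub float %x, %y
  %cmp = fcmp olt float %sub, 0.000000e+00
  %neg = fneg float %sub
  %abs = select i1 %cmp, float %neg, float %sub
  ret float %abs
}
```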
@jayfoad
@SahilPatidar Could you please update alive2 link in the PR description?
Apologies for the delay; please take a look at the link.
Resolves #85245. Alive2: https://alive2.llvm.org/ce/z/F4rDwK