Open bongjunj opened 4 months ago
Target: i10 %val1 = #x000 (0) i10 %#0 = #x3ff (1023, -1) i10 %#1 = #x3fd (1021, -3)
IIRC %#1
should be 1023. It looks like a bug in alive2.
cc @nunoplopes @regehr @zhengyang92
Target: i10 %val1 = #x000 (0) i10 %#0 = #x3ff (1023, -1) i10 %#1 = #x3fd (1021, -3)
IIRC
%#1
should be 1023. It looks like a bug in alive2. cc @nunoplopes @regehr @zhengyang92
Discussed in https://github.com/AliveToolkit/alive2/issues/1056.
As far as I can tell, since it xor
two undef values and thus can produce -3 as a result.
Am I correct?
But the core transformation here is correct: https://alive2.llvm.org/ce/z/PwKRUa
define i10 @src(i10 %val1) {
entry:
%val2 = sub i10 0, %val1
%val3 = and i10 %val1, %val2
%val4 = add i10 %val3, -1
ret i10 %val4
}
define i10 @tgt(i10 %val1) {
entry:
%val2 = add i10 %val1, -1
%val3 = xor i10 %val1, -1
%val4 = and i10 %val2, %val3
ret i10 %val4
}
----------------------------------------
define i10 @src(i10 %val1) {
entry:
%val2 = sub i10 0, %val1
%val3 = and i10 %val1, %val2
%val4 = add i10 %val3, 1023
ret i10 %val4
}
=>
define i10 @tgt(i10 %val1) {
entry:
%val2 = add i10 %val1, 1023
%val3 = xor i10 %val1, 1023
%val4 = and i10 %val2, %val3
ret i10 %val4
}
Transformation seems to be correct!
If it is an LLVM bug, it is easy to fix this by adding an extra isGuaranteedNotToBeUndef
check. But I think many places in InstCombine don't handle undef
correctly since Alive2 doesn't complain about this without the urem
instruction.
cc @nikic @goldsteinn
It seems like there has to be some kind of alive2 bug here, it shouldn't report a miscompilation only with the urem present. @nunoplopes Could it be that alive2 only considers full undef values are function inputs, but not partial undef?
It seems like there has to be some kind of alive2 bug here, it shouldn't report a miscompilation only with the urem present. @nunoplopes Could it be that alive2 only considers full undef values are function inputs, but not partial undef?
That's correct. Alive2 only considers poison, fully undef, and regular inputs. We didn't enable partial undefs by default (it's implemented, but hidden)
With the partial undef mode turned on, Alive2 says:
----------------------------------------
define i10 @src(i10 %val1) {
entry:
%val2 = sub i10 0, %val1
%val3 = and i10 %val1, %val2
%val4 = add i10 %val3, 1023
ret i10 %val4
}
=>
define i10 @tgt(i10 %val1) {
entry:
%val2 = add i10 %val1, 1023
%val3 = xor i10 %val1, 1023
%val4 = and i10 %val2, %val3
ret i10 %val4
}
Transformation doesn't verify!
ERROR: Value mismatch
Example:
i10 %val1 = #x000 (0) [based on undef value]
Source:
i10 %val2 = #x000 (0)
i10 %val3 = #x000 (0)
i10 %val4 = #x3ff (1023, -1)
Target:
i10 %val2 = #x3ff (1023, -1)
i10 %val3 = #x3fd (1021, -3)
i10 %val4 = #x3fd (1021, -3)
Source value: #x3ff (1023, -1)
Target value: #x3fd (1021, -3)
FWIW, this is the trick to enable it:
diff --git a/ir/value.cpp b/ir/value.cpp
index a8f1fa94..c045aa91 100644
--- a/ir/value.cpp
+++ b/ir/value.cpp
@@ -282,7 +282,7 @@ void Input::merge(const ParamAttrs &other) {
expr Input::getUndefVar(const Type &ty, unsigned child) const {
string tyname = "isundef_" + getSMTName(child);
- //return expr::mkVar(tyname.c_str(), ty.getDummyValue(false).value);
+ return expr::mkVar(tyname.c_str(), ty.getDummyValue(false).value);
// FIXME: only whole value undef or non-undef for now
return expr::mkVar(tyname.c_str(), expr::mkUInt(0, 1));
}
diff --git a/tools/transform.cpp b/tools/transform.cpp
index aaa1bb9e..dac6c38a 100644
--- a/tools/transform.cpp
+++ b/tools/transform.cpp
@@ -330,7 +330,7 @@ static void instantiate_undef(const Input *in, map<expr, expr> &instances,
return;
auto var = in->getUndefVar(ty, child);
- if (!var.isValid())
+ if (1||!var.isValid())
return;
// TODO: add support for per-bit input undef
Thanks! Would it be viable to do a run of alive2 in partial undef mode over the test suite, or is it too slow? I'd like to have an idea of how common this issue is -- is this just an isolated case where partial undef is a problem but full undef isn't, or is this a widespread problem?
Thanks! Would it be viable to do a run of alive2 in partial undef mode over the test suite, or is it too slow? I'd like to have an idea of how common this issue is -- is this just an isolated case where partial undef is a problem but full undef isn't, or is this a widespread problem?
Yep, I can do that once our server vacates. It may take a day, but that's fine.
Took only 6.5 hours (instead of ~5h). The result is not pretty.. New failures:
+ LLVM :: Transforms/AggressiveInstCombine/lower-table-based-cttz-basics.ll
+ LLVM :: Transforms/AggressiveInstCombine/lower-table-based-cttz-zero-element.ll
+ LLVM :: Transforms/CodeGenPrepare/NVPTX/bypass-slow-div-constant-numerator.ll
+ LLVM :: Transforms/CorrelatedValuePropagation/phi-common-val.ll
+ LLVM :: Transforms/IndVarSimplify/AArch64/widen-loop-comp.ll
+ LLVM :: Transforms/IndVarSimplify/ARM/code-size.ll
+ LLVM :: Transforms/IndVarSimplify/exit_value_test3.ll
+ LLVM :: Transforms/IndVarSimplify/post-inc-range.ll
+ LLVM :: Transforms/IndVarSimplify/replace-loop-exit-folds.ll
+ LLVM :: Transforms/Inline/devirtualize.ll
+ LLVM :: Transforms/InstCombine/add-mask-neg.ll
+ LLVM :: Transforms/InstCombine/add.ll
+ LLVM :: Transforms/InstCombine/and-or-icmps.ll
+ LLVM :: Transforms/InstCombine/ashr-lshr.ll
+ LLVM :: Transforms/InstCombine/canonicalize-clamp-like-pattern-between-negative-and-positive-thresholds.ll
+ LLVM :: Transforms/InstCombine/canonicalize-clamp-like-pattern-between-zero-and-positive-threshold.ll
+ LLVM :: Transforms/InstCombine/ffs-i16.ll
+ LLVM :: Transforms/InstCombine/gep-inbounds-null.ll
+ LLVM :: Transforms/InstCombine/getelementptr.ll
+ LLVM :: Transforms/InstCombine/icmp-and-lowbit-mask.ll
+ LLVM :: Transforms/InstCombine/icmp-gep.ll
+ LLVM :: Transforms/InstCombine/minmax-of-xor-x.ll
+ LLVM :: Transforms/InstCombine/onehot_merge.ll
+ LLVM :: Transforms/InstCombine/partally-redundant-left-shift-input-masking-variant-e.ll
+ LLVM :: Transforms/InstCombine/redundant-left-shift-input-masking-after-truncation-variant-e.ll
+ LLVM :: Transforms/InstCombine/redundant-left-shift-input-masking-after-truncation-variant-f.ll
+ LLVM :: Transforms/InstCombine/redundant-left-shift-input-masking-variant-e.ll
+ LLVM :: Transforms/InstCombine/redundant-left-shift-input-masking-variant-f.ll
+ LLVM :: Transforms/InstCombine/rem.ll
+ LLVM :: Transforms/InstCombine/shift-by-signext.ll
+ LLVM :: Transforms/InstCombine/shift.ll
+ LLVM :: Transforms/InstCombine/sqrt.ll
+ LLVM :: Transforms/InstCombine/sub.ll
+ LLVM :: Transforms/InstCombine/switch-select.ll
+ LLVM :: Transforms/InstCombine/truncating-saturate.ll
+ LLVM :: Transforms/InstCombine/vector_insertelt_shuffle-inseltpoison.ll
+ LLVM :: Transforms/InstCombine/vector_insertelt_shuffle.ll
+ LLVM :: Transforms/InstSimplify/and-or-implied-cond.ll
+ LLVM :: Transforms/LoopFlatten/loop-flatten-version.ll
+ LLVM :: Transforms/LoopVectorize/first-order-recurrence.ll
+ LLVM :: Transforms/LoopVectorize/reduction-small-size.ll
+ LLVM :: Transforms/PhaseOrdering/X86/merge-functions.ll
+ LLVM :: Transforms/PhaseOrdering/two-shifts-by-sext.ll
+ LLVM :: Transforms/Scalarizer/variable-extractelement.ll
+ LLVM :: Transforms/Scalarizer/variable-insertelement.ll
+ LLVM :: Transforms/SimplifyCFG/ARM/switch-to-lookup-table.ll
+ LLVM :: Transforms/SimplifyCFG/nomerge.ll
+ LLVM :: Transforms/SimplifyCFG/preserve-branchweights-switch-create.ll
+ LLVM :: Transforms/SimplifyCFG/switch_create-custom-dl.ll
+ LLVM :: Transforms/SimplifyCFG/switch_create.ll
+ LLVM :: Transforms/StraightLineStrengthReduce/slsr-gep.ll
+ LLVM :: Transforms/VectorCombine/AArch64/load-extractelement-scalarization.ll
+ LLVM :: Transforms/VectorCombine/load-insert-store.ll
A few random examples:
; Transforms/InstCombine/shift.ll
define i32 @shl1_cttz(i32 %x) {
%tz = cttz i32 %x, 1
%shl = shl i32 1, %tz
ret i32 %shl
}
=>
define i32 @shl1_cttz(i32 %x) {
%neg = sub i32 0, %x
%shl = and i32 %neg, %x
ret i32 %shl
}
Transformation doesn't verify! (unsound)
ERROR: Target's return value is more undefined
Example:
i32 %x = #x00000006 (6) [based on undef value]
Source:
i32 %tz = #x00000001 (1)
i32 %shl = #x00000002 (2)
Target:
i32 %neg = #xfffffffa (4294967290, -6)
i32 %shl = #x0000000a (10)
Source value: #x00000002 (2)
Target value: #x0000000a (10)
; Transforms/InstCombine/sub.ll
define i8 @diff_of_squares(i8 %x, i8 %y) {
%x2 = mul i8 %x, %x
%y2 = mul i8 %y, %y
%r = sub i8 %x2, %y2
ret i8 %r
}
=>
define i8 @diff_of_squares(i8 %x, i8 %y) {
%add = add i8 %x, %y
%sub = sub i8 %x, %y
%r = mul i8 %add, %sub
ret i8 %r
}
Transformation doesn't verify! (unsound)
ERROR: Target's return value is more undefined
Example:
i8 %x = #x09 (9)
i8 %y = #x00 (0) [based on undef value]
Source:
i8 %x2 = #x51 (81)
i8 %y2 = #x00 (0)
i8 %r = #x51 (81)
Target:
i8 %add = #x19 (25)
i8 %sub = #x09 (9)
i8 %r = #xe1 (225, -31)
Source value: #x51 (81)
Target value: #xe1 (225, -31)
; Transforms/InstCombine/gep-inbounds-null.ll
define <2 x i1> @test_vector_base(<2 x ptr> %base, i64 %idx) {
%gep = gep inbounds <2 x ptr> %base, 1 x i64 %idx
%cnd = icmp eq <2 x ptr> %gep, { null, null }
ret <2 x i1> %cnd
}
=>
define <2 x i1> @test_vector_base(<2 x ptr> %base, i64 %idx) {
%cnd = icmp eq <2 x ptr> %base, { null, null }
ret <2 x i1> %cnd
}
Transformation doesn't verify! (unsound)
ERROR: Target's return value is more undefined
Example:
<2 x ptr> %base = < null, pointer(non-local, block_id=1, offset=0) [based on undef value] >
i64 %idx = #x0000000000000004 (4)
Source:
<2 x ptr> %gep = < poison, pointer(non-local, block_id=1, offset=4) >
<2 x i1> %cnd = < poison, #x0 (0) >
SOURCE MEMORY STATE
===================
NON-LOCAL BLOCKS:
Block 0 > size: 0 align: 1 alloc type: 0 alive: false address: 0
Block 1 > size: 8 align: 1 alloc type: 0 alive: true address: 1
Block 2 > size: 3 align: 4 alloc type: 0 alive: true address: 12
Target:
<2 x i1> %cnd = < #x1 (1), #x1 (1) >
Source value: < poison, #x0 (0) >
Target value: < #x1 (1), #x1 (1) >
// Transforms/InstCombine/getelementptr.ll -- seems like a bug in Alive2
I am not willing to fix these bugs since adding a bunch of isGuaranteedNotToBeUndef
checks may be expensive and a burden of maintenance. BTW, have we encountered these bugs in real-world cases?
@dtcxzyw We have encountered a few multi-use undef miscompiles related to select conditions in the wild. Not sure whether we hit any that did not involve selects.
@dtcxzyw We have encountered a few multi-use undef miscompiles related to select conditions in the wild. Not sure whether we hit any that did not involve selects.
IMO, as this is not a serious issue, we should spend more time removing the undef support. We have given up on correctness of handling undef in the backend.
IMO, as this is not a serious issue, we should spend more time removing the undef support. We have given up on correctness of handling undef in the backend.
Is there a roadmap to entirely drop undef
support?
IMO, as this is not a serious issue, we should spend more time removing the undef support. We have given up on correctness of handling undef in the backend.
Is there a roadmap to entirely drop
undef
support?
There isn't an official plan nor any concerted effort (but there could be). A few of us have been slowly doing bits and pieces. What we have done so far is:
What's left?
Add an auto-migration path to replace undef with zero
It looks ok as it even improves both performance and code-size :)
https://alive2.llvm.org/ce/z/Fv6zpv
https://github.com/llvm/llvm-project/blob/da0e5359fc1a5bf1749306440f9dad089046d772/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp#L1656-L1665
The transformation does not properly handle
undef
as a result ofurem i10 %val0, 42
.undef
propagates to intermediate instructions.