Closed llvmbot closed 7 years ago
Sure, i can convert it to a statistic.
I still want to assert on the phi node case, because i suspect if that happens we may hit a correctness issue somewhere.
Yes, a statistic + assert is great to me.
Sure, i can convert it to a statistic.
I still want to assert on the phi node case, because i suspect if that happens we may hit a correctness issue somewhere.
Can we keep the diagnostic in (just to get an idea of how many times it fires?)
Yeah, it's slightly worse in general, even though it helps with this testcase.
So, the assert should only hurt us optimization wise, i believe -because we aren't using the most dominating leader, we may not always find the most congruences in some weird cases. But this is true in general anyway, because what is affected by conditionals and what way is a crapshoot, and you may not always choose the most optimal set of congruences.
If we change the leader when we hit this case, we will cost at least another iteration. Not sure if it's worth it.
The only case i'm really concerned for correctness about is phi node naming, so i'm going to restrict the assert to phi nodes, because i want to understand cases we have with phi nodes in the same situation, to make sure they are okay.
The RPO for the instructions is:
RPO visit order: br label %bb10 RPO visit order: br label %bb10 RPO visit order: br i1 undef, label %bb2, label %bb7 RPO visit order: %tmp6 = load %struct.foo*, %struct.foo* @global RPO visit order: %tmp8 = phi %struct.foo [ %tmp6, %bb2 ] RPO visit order: %tmp9 = getelementptr %struct.foo, %struct.foo %tmp8, i64 0, i32 1 RPO visit order: %tmp11 = load i32, i32 %tmp9 RPO visit order: %tmp3 = phi %struct.foo [ undef, %bb ], [ %tmp6, %bb2 ] RPO visit order: %tmp4 = getelementptr %struct.foo, %struct.foo %tmp3, i64 0, i32 1 RPO visit order: %tmp5 = load i32, i32 %tmp4 RPO visit order: br label %bb2 RPO visit order: %tmp = load %struct.foo, %struct.foo* @global RPO visit order: %tmp1 = getelementptr %struct.foo, %struct.foo %tmp
Accounting for the different blocks, it would require less iterations to resolve this case.
It definitely is a win for this case. However, interestingly, overall, it's a lose, which doesn't make sense to me (i don't think it should ever take more iterations). I"m investigating further to make sure i didn't screw it up.
This actually is a valid case of a more dominating result being discovered to be part of a congruence class with a less dominating leader, given our current iteration order.
This would cause one extra iteration to resolve.
Here, interestingly, it is possible to order the instructions to avoid the extra iteration by ordering them in RPO of the def-use graph.
Here, that would cause us to resolve tmp6 before tmp3, and get the right answer for tmp3 the first time.
I'm not sure if it is worth it, i'll check.
Slightly smaller: source_filename = "pr31682.ll"
%struct.quux = type { i32, i32, [2 x [4 x [6 x [6 x i16]]]] }
@global = external global %struct.quux*
define void @pluto() { bb: %tmp = load %struct.quux*, %struct.quux* @global %tmp1 = getelementptr %struct.quux, %struct.quux %tmp br label %bb2
bb2: ; preds = %bb2, %bb %tmp3 = phi %struct.quux [ undef, %bb ], [ %tmp6, %bb2 ] %tmp4 = getelementptr %struct.quux, %struct.quux %tmp3, i64 0, i32 1 %tmp5 = load i32, i32 %tmp4 %tmp6 = load %struct.quux, %struct.quux** @global br i1 undef, label %bb2, label %bb7
bb7: ; preds = %bb2 %tmp8 = phi %struct.quux [ %tmp6, %bb2 ] %tmp9 = getelementptr %struct.quux, %struct.quux %tmp8, i64 0, i32 1 br label %bb10
bb10: ; preds = %bb10, %bb7 %tmp11 = load i32, i32* %tmp9 br label %bb10 }
Extended Description
Reduced:
%struct.widget = type { i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, float, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i8, i8, i32, i32*, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [9 x [16 x [16 x i16]]], [5 x [16 x [16 x i16]]], [9 x [8 x [8 x i16]]], [2 x [4 x [16 x [16 x i16]]]], [16 x [16 x i16]], [16 x [16 x i32]], i32***, i32, i32, i32, i32, i32, %struct.baz, %struct.snork, %struct.eggs, i32, i32*, i32, i32, i32, i32, [4 x [4 x i32]], i32, i32, i32, i32, i32, double, i32, i32, i32, i32, i16**, i16**, i16**, i16**, [15 x i16], i32, i32, i32, i32, i32, i32, i32, i32, [6 x [32 x i32]], i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [1 x i32], i32, i32, [2 x i32], i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, %struct.wobble.1, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, double, double, i32, double, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, [3 x [2 x i32]], [2 x i32], i32, i32, i16, i32, i32, i32, i32, i32 } %struct.baz = type { i32, i32, [100 x %struct.snork], i32, float, float, float } %struct.snork = type { i32, i32, i32, i32, i32, i32, %struct.spam, %struct.wombat.0, %struct.wobble, i32, i32, i32, i32, i32, i32, i32, i32, i32 (i32), [3 x [2 x i32]] } %struct.spam = type { %struct.zot, %struct.wombat, %struct.wombat } %struct.zot = type { i32, i32, i8, i32, i32, i8, i8, i32, i32, i8, i32 } %struct.wombat = type { i32, i32, i32, i32, i32, i8, i32, i32, i32 } %struct.wombat.0 = type { [3 x [11 x %struct.quux]], [2 x [9 x %struct.quux]], [2 x [10 x %struct.quux]], [2 x [6 x %struct.quux]], [4 x %struct.quux], [4 x %struct.quux], [3 x %struct.quux] } %struct.quux = type { i16, i8, i64 } %struct.wobble = type { [2 x %struct.quux], [4 x %struct.quux], [3 x [4 x %struct.quux]], [10 x [4 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [5 x %struct.quux]], [10 x [5 x %struct.quux]], [10 x [15 x %struct.quux]], [10 x [15 x %struct.quux]] } %struct.eggs = type { i32, i32, i32, [2 x i32], i32, [8 x i32], %struct.eggs, %struct.eggs, i32, [2 x [4 x [4 x [2 x i32]]]], [16 x i8], [16 x i8], i32, i64, [4 x i32], [4 x i32], i64, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i16, double, i32, i32, i32, i32, i32, i32, i32, i32, i32 } %struct.wobble.1 = type { i32, i32, i32, i32, i32, %struct.wobble.1* }
@global = external local_unnamed_addr global %struct.widget*, align 8
define void @patatino() { bb: %tmp = load %struct.widget*, %struct.widget* @global, align 8 %tmp1 = getelementptr inbounds %struct.widget, %struct.widget %tmp, i64 0, i32 5 br label %bb2
bb2: ; preds = %bb2, %bb %tmp3 = phi %struct.widget [ undef, %bb ], [ %tmp6, %bb2 ] %tmp4 = getelementptr inbounds %struct.widget, %struct.widget %tmp3, i64 0, i32 39 %tmp5 = load i32, i32 %tmp4, align 8 %tmp6 = load %struct.widget, %struct.widget** @global, align 8 br i1 undef, label %bb2, label %bb7
bb7: ; preds = %bb2 %tmp8 = phi %struct.widget [ %tmp6, %bb2 ] %tmp9 = getelementptr inbounds %struct.widget, %struct.widget %tmp8, i64 0, i32 39 br label %bb10
bb10: ; preds = %bb10, %bb7 %tmp11 = load i32, i32* %tmp9, align 8 br label %bb10 }