Open llvmbot opened 7 years ago
maybe we can just simply fixing this switch case into a increase expression. and also, that you write code __builtin_unreachable() will cause another problem that it will generate multiple exit counts which that will makes SCEVs not to analyze trip-count.
1.just by fixing set-expression as increase-expression. 2.remove __builtin_unreachable(); 3.create an AddRecExpr with switch instruction with a increase expression.
hope this will helped.
I have lots of works to do right now. maybe I can't write codes in time that you want. I am still progress in a writing partical-range based conditional-expression analyzer.
Gentle ping :)
@Jun Ryung Ju I am wondering if you've made any progress so far?
I will fix Unroll Loop part after fixing SCEV to support any conditional-expressions. that currently in progress.
http://lists.llvm.org/pipermail/llvm-dev/2017-November/119179.html
I can't work very fast D: my computer has broken, I only have my Surface Pro. =I need to compile codes without any compile acceleration, so it will take some times to fix it.
Not yet, in progress. I think this isn't only for a SCEV, even its predicatable, it unrolls without any conditional varaibles. that means it works wrong.
this will be a big changes in LoopUnroll and SCEV
Ping.
@Jun Ryung Ju First of all, thank you for taking a look at this :) Secondly, have you made any progress adding unrolling for non-natural loops?
same problem with ICmp instructions. LoopUnroll should support for conditional instructions to assume that loop can exit.
I think there is no general case for create AddExpr for "switch" instruction.
I don't understand why this loop has multiple exits.
Printing analysis 'Scalar Evolution Analysis' for function '?test@@YAXXZ':
Classifying expressions for: @"\01?test@@YAXXZ"
%step.0 = phi i2 [ 0, %entry ], [ %step.0.be, %while.cond.backedge ]
--> %step.0 U: full-set S: full-set Exits: <
I think something wrong with analyzing exits. while(true) isn't a exit actually.
let me debug for a while.
I'm trying to fix it.
Weekly ping :)
Weekly ping :)
Weekly ping :)
This might be important for the general performance of coroutines.
Ping. Has there been any progress about fixing this?
Yes, we already have code that does this in analyzeLoopUnrollCost in lib/Transforms/Scalar/LoopUnrollPass.cpp. I suspect that it just does not handle this case. Most probably, it bails out because the trip count of the loop is unknown, but it might be interesting to use it to evaluate tripcount for loops like this.
@Hal Finkel May I ask how would the symbolically executing part work? Does llvm have such "infrastructure"/code?
Yes, we already have code that does this in analyzeLoopUnrollCost in lib/Transforms/Scalar/LoopUnrollPass.cpp. I suspect that it just does not handle this case.
@Hal Finkel May I ask how would the symbolically executing part work? Does llvm have such "infrastructure"/code?
Thanks for acknowledging this issue. I've a little reproducer which doesn't depend on the couroutines feature:
Code:
extern void printf(const char*);
void test() { auto step = 0; while(true) { switch (step) { case 0: step = 1; break; case 1: printf("hello"); step = 2; break; case 2: printf("hello"); step = 3; break; case 3: return; default: __builtin_unreachable(); } } }
Surprisingly GCC is able to simplify this to two sequential calls to printf. However, I am lacking the skills of the internals of LLVM, so I would be grateful if anyone else could take a look at this :)
FWIW, if you try to force LLVM to unroll the loop by adding:
above it, it still won't. The reason given is, "Can't unroll; loop not terminated by a conditional branch."
I agree that we should do what GCC does and fully unroll this loop. I'm guessing that we need to symbolically evaluate some of it for a few iterations in order to see that unrolling is helpful.
Thanks for acknowledging this issue. I've a little reproducer which doesn't depend on the couroutines feature:
Code:
extern void printf(const char*);
void test() { auto step = 0; while(true) { switch (step) { case 0: step = 1; break; case 1: printf("hello"); step = 2; break; case 2: printf("hello"); step = 3; break; case 3: return; default: __builtin_unreachable(); } } }
Surprisingly GCC is able to simplify this to two sequential calls to printf. However, I am lacking the skills of the internals of LLVM, so I would be grateful if anyone else could take a look at this :)
Thank you for pinging. Unless we find someone willing (and having knowledge) to address the issue, I am not sure this will get fixed soon.
Ping.
Ping.
Ping.
Ping.
Extended Description
Test case: https://godbolt.org/g/esjnb2
I've also attached the test-case as an attachment.
LLVM-IR:
define void @printEquivalent()() local_unnamed_addr #0 !dbg !684 { %1 = tail call i32 (i8, ...) @printf(i8 getelementptr inbounds ([3 x i8], [3 x i8] @.str, i64 0, i64 0), i32 1), !dbg !685 %2 = tail call i32 (i8, ...) @printf(i8 getelementptr inbounds ([3 x i8], [3 x i8] @.str, i64 0, i64 0), i32 3), !dbg !686 ret void, !dbg !687 }
declare i32 @printf(i8* nocapture readonly, ...) local_unnamed_addr #1
define i32 @main() local_unnamed_addr #2 personality i8 bitcast (i32 (...) @__gxx_personality_v0 to i8*) !dbg !688 { br label %1, !dbg !693
;
;
;
;
;
Issue:
The optimizer could have optimized the couroutine into something more like the two sequential calls to printf. Instead, it isn't able to unroll the "couroutine-loop".