[Sprint 1] Add initial BranchPredictPass

sharaelong commented 1 year ago

Overview

Change branch direction to locate more probable block at false branch.

Implementation

I used LLVM intrinsic BranchProbabilityAnalysis in impl. By using getEdgeProbability in branch instructions, if true branch has probability more than 50%, than reverse direction. If branch condition is made by icmp instruction just before br, then reverse icmp condition, i.e. sgt -> sle. Else, I insert new instruction that reverse %cond register value.

Unit tests

Fibonacci number calculation with simple loops
Gugudan with double loops
Factorial calculation with recursive functions

willi19 commented 1 year ago

36~37th line of BranchPredictPass.cpp

// Negate the condition with an xor BI->setCondition(BinaryOperator::CreateNot(Cond, "", BI));

I wonder how this negate the condition with an xor.

Also I'm worried that by adding new binary operator to negate the condition might increase the cost. Like the case, using the branch with true condition only once.

Other than that, the code appears to be fine to me.

sharaelong commented 1 year ago

Yes, I noticed that when I was writing the code. Negation by xor is maybe the llvm's default optimization efforts. In our cases, it is more better to use subtraction like cond = 1-cond. I don't include that additional logic because I assume that most of i1 register value will be generated by icmp, but it's a fair question. I will revise it until the end of sprint 1.

germanium32 commented 1 year ago

Implementation logic looks good to me.

One caveat that we must handle might be the conflict between merging SimplifyCFGPass and BranchPredictionPass. Consider the case written in pseudo-llvm-IR.

entry:
  br A==1, bb1, bb2
bb1:
  do something1
  br exit
bb2:
  br A==2, bb3, bb4
bb3:
  do something2
  br exit
bb4:
  do something3
  br exit
exit:
  ret

Such type of code can be optimized significantly by using switch-case arguments. However, if we do the BranchPredictionPass first for some circumstances that swaps the instructions, the following phenomenon can happen.

entry:
  br A!=1, bb2, bb1
bb1:
  do something1
  br exit
bb2:
  br A==2, bb3, bb4
bb3:
  do something2
  br exit
bb4:
  do something3
  br exit
exit:
  ret

The SimplifyCFG checks equality comparison instructions like the form: "x==1, x==2, x==3, default" (cf. Line 224-238 of SimplifyCFG.cpp) however the transformed code will result in no optimizations.

In such case, SimplifyCFG -> BranchPredict will be optimal.

I can't think immediately now, but I think there might be other circumstances where SimplifyCFG -> BranchPredict is rather lethal. How should we order the passes?

germanium32 commented 1 year ago

Also, can you add some *.ll testcases, just for the consistency between other developments? In fact, the commotion uprised on this Tuesday about CI was mainly due to filechecks, but It seems that your commit doesn't raise filechecks.

Edit: Only commiting *.c files does not seem to run filechecks in CI. I think adding filecheck testcases is urgent.

sharaelong commented 1 year ago

Reply to @germanium32

Yes, order of pass applying is seemed to be important. In this specific case, SimplifyCFG has to be applied first. Also I think we have to make filecheck assume only each of pass exists with some default and basic pass. Benchmark repo's data responsible for 'Real world' test maybe.

I changed directory structure. My PR now follows convention of checkfile.sh. At first I think .c file is more good for pushing into repo, but everyone add .ll file. Default .gitignore suggests ignoring *.ll file, but two approaches are both valid... isn't it?

willi19 / swpp202301-compiler-team6