willi19 / swpp202301-compiler-team6

MIT License
0 stars 0 forks source link

[Sprint 1] Add Load2Aload Pass #7

Closed goranmoomin closed 1 year ago

goranmoomin commented 1 year ago

Overview

This WIP PR adds an Load2AloadPass that converts existing load instructions into an aload_i* intrinsic, and pushes them up until a certain threshold.

Checklist

Below is a listup of issues that needs to be resolved at the time of the current PR.

goranmoomin commented 1 year ago

Current status on the miscompilation: I'm running the benchmark against a version of the pass that doesn't consider costs; it convert all loads into aloads without cost consideration, and the resulting program is failing to run on the interpreter (with an out-of-bounds error). Alive2 isn't helping, calling the transformations valid. I've tried removing aloads and just pushing loads upwards, and they're not working as well.

The miscompilation is found from the ricehub.c file; The runtime error occurs when the optimization pass pushes 10 loads upwards (pushing %14 = load i32, i32* @R, align 4 in front of: %i.1 = phi i32 [ 1, %if.end ], [ %inc60, %for.inc59 ]). 9 pushes are fine, though I'm not sure if the other 9 optimizations are correct.

The runtime error is happening in the .if.end47 basic block, and the optimization that causes the error is in the .for.cond21 block. The .for.cond21 block is the for loop condition: i <= R - mid, with the R access optimized. The .if.end47 block is the line hab += X[j] - X[x];, and I don't know yet what part is erroring out.

willi19 commented 1 year ago

I think checking whether this instruction is from swpp class, and get expected cost will be useful to others. So I think it would be better to make it is another file. Moreover, to decrease your code length, it will be benificial to add it with other pull request with short code. And further improve it on additional branch.

To do this, I think we should slightly fix the code. From line 58~60 of load2alod.cpp """ if (isa(I) || isa(I)) { // Note: terminator instructions don't need cost calculation in this case Cost = 0;
"""

Also for this pull request, I think Instead of switch-case, using vector to save instruction with the same cost and check whether opcode of given instruction is inside the vector will decrease the length of the code. I wonder you have some update of your current status. Otherwise we can look for it tommorow.

germanium32 commented 1 year ago

One idea to reduce PR line-diff; instead of using lengthy switch-case arguments, you may use some "set"-type data structure to simplify it. Similarly, you can simplify the lengthy if-else structures. (I think such simple code change will not hurt comilation time limits.)

Just a slight concern about ~200 line-diff issues. You may use the residues for better uses.

germanium32 commented 1 year ago

Your code seems to raise errors due to invalid aload pushes; which you may also know.

I may have understood your code wrongly, so if the following suggestion may be invalid. The implementation checks whether its valid using the "canStoreMemory()" function. I expect the logic to be similar to the code in the pass which I'm in charge of: "mem2reg.cpp > PromoteMemoryToRegister.cpp > isAllocaPromotable()".

Since the function checks if the memory uses only loads and stores, not other lethal instructions, you may consider the logic used there.

In fact, your function checks for StoreInst, CallBase, IntrinsicInst, but the isAllocaPromotable() function checks for LoadInst, StoreInst, isVolatile, InstrinsicInst, BitCastInst, GetElementPtrInst, AddrSpaceCast. Check some where you may have misconsidered. Since *.c -> *.ll generates some superfluous but toxic instructions, some problems may have occured here.

goranmoomin commented 1 year ago

Quick note: this probably have basic bugs. Seems like phi nodes have to be at the top of the BB.