willi19 / swpp202301-compiler-team6

MIT License
0 stars 0 forks source link

Regarding some issues on Loop2Sum #42

Open germanium32 opened 1 year ago

germanium32 commented 1 year ago

During the offline meeting in 06/01, we led to a conclusion that Loop2Sum is overcomplex; we should divide the pass into some simple pieces. We propose a simpler pass structure: LoopVectorize + Add2Sum.

For the case LoopVectorize, we may choose to modify our implementation, or use a pre-existing pass. Most of the code implemented in in loop2sum can be reused in LoopVectorize, or we may use a slightly more delicate pre-existing pass to use: InnerLoopVectorizer.

For the case Add2Sum, the issue is fairly simple. Traverse over add instructions, and check whether if it is chained to a simple addition operation. That is:

res = a + b + c + d + e

will be transformed into ll by iterated additions, such as the form:

%0 = add %a, %b
%1 = add %0, %c
%2 = add %1, %d
%res = add %2, %e

which then can be optimized in the form:

%res = call sum(%a, %b, %c, %d, %e, 0, 0, 0)

Such pass is highly applicable; compared to Loop2Sum. We can optimize multiple additions without loop structure, and concatenating only 3 operands results in a cost equivalence, where 4 operands results in cost reduction.

If we vectorize our loop using LoopVectorize, then apply Add2Sum, it will give an identical result from our first objective. I will try to implement the more applicable version: Add2Sum, then try to modify LoopVectorize.