Vectorize pass

The Vectorize pass vectorizes consecutive loads or stores. To vectorize instructions inside loops, it should run after LoopUnrollPass and SimplifyCFGPass, and InstCombine should not run before it. InstCombine should run afterwards to remove the instructions that are no longer used (those that calculated the indices).

Features for Part 2

Vectorize all possible loads by sinking load users

Loads were not fully vectorized in Part 1. Sinking the load users within the same BasicBlock with sinkAllLoadUsers() allows them to be fully vectorized. sinkAllLoadUsers() sinks all load users in post-order.
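To illustrate why post-order matters when sinking, here is a minimal sketch on a toy instruction list rather than real LLVM IR. The `Inst` struct, the helper names, and the choice of sinking to just after an `anchor` instruction (for example, the last load of the group) are assumptions for illustration and are not taken from the pass itself; memory dependencies and side effects are not modeled.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Toy stand-in for an instruction: its name and the names it uses.
struct Inst {
    std::string name;
    std::vector<std::string> operands;
};
using Block = std::vector<Inst>;

// Position of the instruction called `name`, or block.size() if absent.
static std::size_t indexOf(const Block& block, const std::string& name) {
    for (std::size_t i = 0; i < block.size(); ++i)
        if (block[i].name == name) return i;
    return block.size();
}

static bool usesValue(const Inst& inst, const std::string& def) {
    for (const auto& op : inst.operands)
        if (op == def) return true;
    return false;
}

// Sinks every transitive user of `def` that sits before `anchor` to just
// after `anchor`. Users of a user are sunk first (post-order), so each
// sunk instruction still precedes the instructions that use it.
static void sinkUsers(Block& block, const std::string& def,
                      const std::string& anchor,
                      std::unordered_set<std::string>& visited) {
    std::vector<std::string> users;
    const std::size_t anchorIdx = indexOf(block, anchor);
    for (std::size_t i = 0; i < anchorIdx; ++i)
        if (usesValue(block[i], def)) users.push_back(block[i].name);

    for (const auto& user : users) {
        if (!visited.insert(user).second) continue;  // already sunk
        sinkUsers(block, user, anchor, visited);     // sink its users first
        const std::size_t from = indexOf(block, user);
        Inst moved = block[from];
        block.erase(block.begin() + static_cast<std::ptrdiff_t>(from));
        const std::size_t after = indexOf(block, anchor) + 1;
        block.insert(block.begin() + static_cast<std::ptrdiff_t>(after), moved);
    }
}

// Hypothetical entry point: returns the instruction order after sinking.
std::vector<std::string> sinkLoadUsersSketch(Block block,
                                             const std::string& load,
                                             const std::string& anchor) {
    if (indexOf(block, anchor) < block.size()) {
        std::unordered_set<std::string> visited;
        sinkUsers(block, load, anchor, visited);
    }
    std::vector<std::string> order;
    for (const auto& inst : block) order.push_back(inst.name);
    return order;
}
```

In this toy model, a block `l0, a, m, l1, s` (where `a` uses `l0` and `m` uses `a`) becomes `l0, l1, a, m, s`, leaving the two loads adjacent.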

Masking / reordering

Masking and reordering were not in the plan, but testing showed that they are needed. Masking is needed for binary_tree, which stores to 3 consecutive pointers. Reordering is needed for all loads/stores that are moved by sinkAllLoadUsers().
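The masking half can be pictured with a small sketch: a store of 3 consecutive values is emitted as one wider vector store whose unused lane is masked off. The 4-lane width, the bitmask encoding, and the function names below are assumptions for illustration, not the pass's actual representation.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 4;  // assumed vector width

// Writes lane i of `values` to base[i] only when bit i of `mask` is set.
void maskedStore(std::int64_t* base,
                 const std::array<std::int64_t, kLanes>& values,
                 unsigned mask) {
    for (std::size_t lane = 0; lane < kLanes; ++lane)
        if (mask & (1u << lane)) base[lane] = values[lane];
}

// Mask that enables the lowest `count` lanes (count must be <= kLanes).
unsigned lowLanesMask(std::size_t count) {
    return (1u << count) - 1u;
}
```

Calling `maskedStore(mem, values, lowLanesMask(3))` writes `mem[0]` through `mem[2]` and leaves `mem[3]` untouched, which is the behavior a 3-element store needs.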

Handle GEP with 3 operands

Added handling for GEPs with 3 operands. No GEP in the benchmarks has more than 3 operands.
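As a rough sketch of what the 3-operand case involves: a GEP's operands are the base pointer followed by its indices, so 3 operands means two indices, typically into an array type such as `[N x i64]`. Assuming constant indices and an array pointee type, the element offset can be flattened as below. The `ToyGep` struct and `elementOffset` are hypothetical names, not the pass's code.

```cpp
#include <cstdint>
#include <vector>

// Toy GEP: a base pointer id plus constant indices (operand count is
// indices.size() + 1). arrayLength is N for a [N x T] pointee type and
// is only used in the two-index case.
struct ToyGep {
    int basePointer;
    std::vector<std::int64_t> indices;
    std::int64_t arrayLength;
};

// Computes the offset from the base pointer in elements.
// Returns false for shapes this sketch does not handle.
bool elementOffset(const ToyGep& gep, std::int64_t& offset) {
    switch (gep.indices.size()) {
    case 1:  // 2 operands: ptr, idx
        offset = gep.indices[0];
        return true;
    case 2:  // 3 operands: ptr, idx0 (whole arrays), idx1 (elements)
        offset = gep.indices[0] * gep.arrayLength + gep.indices[1];
        return true;
    default:
        return false;
    }
}
```

With this flattening, two GEPs on the same base pointer can be compared by subtracting their element offsets, regardless of whether they have 2 or 3 operands.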

Fix not vectorizing in some cases

If a different load or store is encountered while vectorizing, it should be marked as the next base instruction to be vectorized.
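One way to read this fix, sketched on a toy access list: while extending a run of consecutive accesses, an access that does not continue the run is not dropped but starts the next run as its base. The `Access` struct and the grouping function are assumptions for illustration, and a maximum vector width is not modeled.

```cpp
#include <cstdint>
#include <vector>

// Toy memory access: which pointer it is based on and its element offset.
struct Access {
    int pointer;
    std::int64_t offset;
};

// Splits accesses into runs of consecutive offsets on the same pointer.
// An access that breaks the current run becomes the base of the next one.
std::vector<std::vector<Access>> groupConsecutive(
        const std::vector<Access>& accesses) {
    std::vector<std::vector<Access>> groups;
    for (const Access& access : accesses) {
        bool extendsRun = false;
        if (!groups.empty()) {
            const Access& last = groups.back().back();
            extendsRun = last.pointer == access.pointer &&
                         last.offset + 1 == access.offset;
        }
        if (extendsRun) {
            groups.back().push_back(access);
        } else {
            // Different load/store: mark it as the next base.
            groups.push_back(std::vector<Access>{access});
        }
    }
    return groups;
}
```

For accesses at `(p0,0), (p0,1), (p1,0), (p1,1), (p1,2)`, this yields two groups, the second of which starts at `(p1,0)` instead of being skipped.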

Not able to vectorize int32 types yet

Testing after merging with the Int32To64 pass showed that int32 types cannot be vectorized yet. The Int32To64 pass creates bitcast instructions to produce the i64* type and mul instructions to double the offset. The getDifference() function should take these cases into account to calculate the difference correctly. This feature can be added in the wrap-up period.

dongjinBaek / swpp202101-team2

[Sprint 3] Vectorize Pass - Part 2 #32