The Vectorize pass vectorizes consecutive loads or stores. To vectorize instructions inside loops, it should run after LoopUnrollPass and SimplifyCFGPass, and InstCombine should not run before it. InstCombine should run afterwards to remove instructions that are no longer used (those that only computed indices).
Features for Part 2
Vectorize all possible loads by sinking load users
Loads were not fully vectorized in Part 1. Sinking the load users (within the same BasicBlock) with sinkAllLoadUsers() now allows them to be fully vectorized. sinkAllLoadUsers() sinks all load users in post-order.
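The sinking step can be illustrated on a toy model of a basic block. The sketch below is not the pass's sinkAllLoadUsers(); the Inst struct and the sinkLoadUsersSketch() name are hypothetical, instructions are identified by unique names, and it assumes no load address depends on another load in the block and that no aliasing check is needed. It collects the transitive users of every load in DFS post-order and re-emits them in reverse post-order after the remaining instructions, so that the loads end up adjacent.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Toy stand-in for an instruction in one basic block (not llvm::Instruction).
struct Inst {
  std::string name;              // unique result name, e.g. "a"
  bool isLoad;                   // true for loads
  std::vector<std::string> ops;  // names of operands defined in this block
};

// Post-order DFS over the non-load users of `def` (assumption: loads never
// use values that are being sunk).
static void visitUsers(const std::vector<Inst> &block, const std::string &def,
                       std::unordered_set<std::string> &seen,
                       std::vector<std::string> &postOrder) {
  for (const Inst &user : block) {
    if (user.isLoad || seen.count(user.name))
      continue;
    if (std::find(user.ops.begin(), user.ops.end(), def) == user.ops.end())
      continue;
    seen.insert(user.name);
    visitUsers(block, user.name, seen, postOrder);  // visit deeper users first
    postOrder.push_back(user.name);
  }
}

// Returns the block with every transitive load user moved below all other
// instructions. Reverse post-order keeps each definition before its users.
std::vector<Inst> sinkLoadUsersSketch(const std::vector<Inst> &block) {
  std::unordered_set<std::string> seen;
  std::vector<std::string> postOrder;
  for (const Inst &inst : block)
    if (inst.isLoad)
      visitUsers(block, inst.name, seen, postOrder);

  std::vector<Inst> result;
  for (const Inst &inst : block)  // instructions that stay in place
    if (!seen.count(inst.name))
      result.push_back(inst);
  for (auto it = postOrder.rbegin(); it != postOrder.rend(); ++it)
    for (const Inst &inst : block)  // sunk users, definitions first
      if (inst.name == *it)
        result.push_back(inst);
  return result;
}
```

For a block `a = load; x = add a; b = load; y = add b, x`, this sketch yields the order a, b, x, y, leaving the two loads next to each other.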
Masking / reordering
Masking and reordering were not in the original plan, but testing showed that both are needed. Masking is needed for binary_tree, which stores to 3 consecutive pointers. Reordering is needed for all loads/stores that are moved by sinkAllLoadUsers().
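One possible reading of masking and reordering is sketched below. It is not the pass's actual code: LanePlan and planLanesSketch() are hypothetical names, and the sketch assumes each access is described by its element offset from the group's base pointer, listed in program order. The mask marks which vector lanes are really accessed (3 of 4 for the binary_tree case), and laneSrc records which original access feeds each lane when program order differs from address order.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of how scalar accesses map onto vector lanes.
struct LanePlan {
  std::uint32_t mask;        // bit i set => lane i is accessed
  std::vector<int> laneSrc;  // laneSrc[i] = index of the access for lane i, or -1
};

// `offsets` holds element offsets from the group's base pointer, in program
// order. Returns false if an offset does not fit the vector or is repeated.
bool planLanesSketch(const std::vector<int> &offsets, int width, LanePlan &plan) {
  if (width <= 0 || width > 32)
    return false;
  plan.mask = 0;
  plan.laneSrc.assign(width, -1);
  for (int i = 0; i < static_cast<int>(offsets.size()); ++i) {
    int lane = offsets[i];
    if (lane < 0 || lane >= width || plan.laneSrc[lane] != -1)
      return false;
    plan.laneSrc[lane] = i;   // reordering: lane index follows the address
    plan.mask |= 1u << lane;  // masking: only touched lanes are enabled
  }
  return true;
}
```

With offsets {2, 0, 1} and a width of 4, the sketch produces the mask 0b0111 and laneSrc {1, 2, 0, -1}: the fourth lane stays disabled and the three accesses are permuted into address order.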
Handle GEP with 3 operands
Added handling for GEPs with 3 operands. No GEP in the benchmarks has more than 3 operands.
Fix missed vectorization in some cases
If a different load or store is encountered while vectorizing, it should be marked as the next base instruction so that it is vectorized in turn.
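The idea of carrying a "next base" forward can be sketched on a toy list of memory accesses. The Access struct and groupAccessesSketch() are hypothetical and ignore aliasing and memory ordering, which a real pass has to check. An access that does not extend the current group is remembered as the next base instead of being dropped, so every access ends up in some group.

```cpp
#include <cstddef>
#include <vector>

// Toy memory access: kind, an id for the underlying pointer, element offset.
struct Access {
  bool isStore;
  int pointerBase;
  long offset;
};

// Groups accesses that continue the current group (same kind, same pointer,
// next consecutive offset). The first access that does not fit is kept as
// the base of the next group. Returns groups of indices into `accesses`.
std::vector<std::vector<std::size_t>>
groupAccessesSketch(const std::vector<Access> &accesses) {
  std::vector<std::vector<std::size_t>> groups;
  std::vector<bool> used(accesses.size(), false);
  std::size_t base = 0;
  while (base < accesses.size()) {
    std::vector<std::size_t> group(1, base);
    used[base] = true;
    std::size_t nextBase = accesses.size();  // "none found yet"
    for (std::size_t i = base + 1; i < accesses.size(); ++i) {
      if (used[i])
        continue;
      const Access &last = accesses[group.back()];
      const Access &cur = accesses[i];
      bool extends = cur.isStore == last.isStore &&
                     cur.pointerBase == last.pointerBase &&
                     cur.offset == last.offset + 1;
      if (extends) {
        group.push_back(i);
        used[i] = true;
      } else if (nextBase == accesses.size()) {
        nextBase = i;  // first different access becomes the next base
      }
    }
    groups.push_back(group);
    base = nextBase;
  }
  return groups;
}
```

For loads from p[0], q[0], p[1], q[1] in that order, the sketch returns the groups {0, 2} and {1, 3}: the load from q[0] interrupts the first group but is picked up as the base of the second.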
Cannot vectorize int32 types yet
Testing after merging with the Int32To64 pass showed that the Vectorize pass cannot vectorize int32 types yet. The Int32To64 pass creates bitcast instructions to produce i64* pointers and mul instructions to double the offset, so getDifference() needs to account for those cases to calculate the difference correctly. This feature can be added in the wrap-up period.
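One way a difference computation could look through casts and constant multiplications is sketched below. This is not the existing getDifference(); Expr, Linear, and getDifferenceSketch() are hypothetical and model index expressions as a tiny tree rather than LLVM values. Each expression is reduced to the linear form scale * var + offset, and two expressions have a known difference only when their symbolic parts match.

```cpp
// Toy index expression: a constant, a variable, or add/mul by a constant,
// optionally wrapped in a cast (standing in for a bitcast).
struct Expr {
  enum Kind { Const, Var, Add, Mul, Cast } kind;
  long value;       // constant (Const/Add/Mul) or variable id (Var)
  const Expr *sub;  // operand for Add/Mul/Cast, otherwise nullptr
};

// Linear form: scale * var + offset (var is ignored when hasVar is false).
struct Linear {
  bool hasVar;
  long var;
  long scale;
  long offset;
};

static Linear linearize(const Expr &e) {
  switch (e.kind) {
  case Expr::Const:
    return {false, 0, 0, e.value};
  case Expr::Var:
    return {true, e.value, 1, 0};
  case Expr::Cast:
    return linearize(*e.sub);  // a cast does not change the value
  case Expr::Add: {
    Linear l = linearize(*e.sub);
    l.offset += e.value;
    return l;
  }
  case Expr::Mul: {
    Linear l = linearize(*e.sub);
    l.scale *= e.value;   // e.g. the doubled offset
    l.offset *= e.value;
    if (l.scale == 0)
      l.hasVar = false;   // multiplied by zero: no symbolic part left
    return l;
  }
  }
  return {false, 0, 0, 0};
}

// Sets diff = b - a and returns true when the difference is a constant.
bool getDifferenceSketch(const Expr &a, const Expr &b, long &diff) {
  Linear la = linearize(a), lb = linearize(b);
  bool sameSymbolic =
      la.hasVar == lb.hasVar &&
      (!la.hasVar || (la.var == lb.var && la.scale == lb.scale));
  if (!sameSymbolic)
    return false;
  diff = lb.offset - la.offset;
  return true;
}
```

For the expressions 2 * i and cast(2 * (i + 1)), the sketch reports a difference of 2, measured in the units of the multiplied index.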