[Wrap-up] GEPUnpack pass

Benchmark Results (After Sprint 1)

benchmark	improved	benchmark	improved	benchmark	improved
binary_tree	7.23%	gcd	-6.57%	prime	23.91%
bitcount1	19.07%	jenkins_hash	21.14%	quick_sort	16.88%
bitcount2	-2.77%	matmul1	7.84%	radix_sort	9.26%
bitcount3	11.17%	matmul2	7.05%	rmq1d_naive	12.59%
bitcount4	33.09%	matmul3	12.19%	rmq1d_sparsetable	7.99%
bitcount5	2.52%	matmul4	2.55%	rmq2d_naive	13.12%
bubble_sort	14.79%	merge_sort	9.21%	rmq2d_sparsetable	13.26%
collatz	-0.08%

In matmul4, it is found that GEPUnpack is inefficient when it unpacks array pointer with offset 0. If we access A[x][0][1] declared as uint64_t A[2][2][2], this should be unpacked as A + x 256 + 64, but in GEPUnpack pass given by TAs just resolve this as A + x 256 + 0 128 + 1 64. With this constant propagation, the result is as follows.

benchmark	improved	benchmark	improved	benchmark	improved
_binarytree	*9.20%*	gcd	-6.57%	*prime*	*24.76%*
bitcount1	19.07%	jenkins_hash	21.14%	quick_sort	16.88%
bitcount2	-2.77%	matmul1	7.84%	_radixsort	*12.76%*
bitcount3	11.17%	matmul2	7.05%	_rmq1dnaive	*15.12%*
*bitcount4*	*43.26%*	matmul3	12.19%	rmq1d_sparsetable	7.98%
*bitcount5*	*27.08%*	*matmul4*	*11.31%*	_rmq2dnaive	*14.33%*
bubble_sort	14.79%	merge_sort	9.21%	rmq2d_sparsetable	13.26%
collatz	-0.08%

dongjinBaek / swpp202101-team2

[Wrap-up] GEPUnpack pass #22