In matmul4, we found that GEPUnpack generates inefficient code when it unpacks an array pointer whose indices include a constant 0.
For an access A[x][0][1] into an array declared as uint64_t A[2][2][2], the address should be unpacked as A + x * 256 + 64, but the GEPUnpack pass provided by the TAs resolves it as A + x * 256 + 0 * 128 + 1 * 64, leaving the constant terms unfolded. With constant propagation applied to these terms, the results are as follows.
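The difference between the two unpackings can be sketched as follows. This is a minimal illustration, not the TAs' GEPUnpack pass or our actual patch: `Term`, `unpackNaive`, and `unpackFolded` are hypothetical names, and addresses are built as strings rather than LLVM IR. The strides 256, 128, and 64 are taken from the example above; they match the sizes in bits of uint64_t[2][2], uint64_t[2], and uint64_t.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// One address term: index * stride (hypothetical helper type).
// A compile-time index is stored in `constant`; a runtime index is
// identified by its name in `symbol`.
struct Term {
    std::optional<std::int64_t> constant;
    std::string symbol;
    std::int64_t stride;
};

// Naive unpacking: emit one multiply and one add per index,
// even when the index is a constant (including 0).
std::string unpackNaive(const std::string &base, const std::vector<Term> &terms) {
    std::string expr = base;
    for (const Term &t : terms) {
        expr += " + " + (t.constant ? std::to_string(*t.constant) : t.symbol) +
                " * " + std::to_string(t.stride);
    }
    return expr;
}

// Folded unpacking: accumulate every constant-index term into a single
// offset at compile time, and omit that offset entirely when it is 0.
std::string unpackFolded(const std::string &base, const std::vector<Term> &terms) {
    std::string expr = base;
    std::int64_t folded = 0;
    for (const Term &t : terms) {
        if (t.constant) {
            folded += *t.constant * t.stride;
        } else {
            expr += " + " + t.symbol + " * " + std::to_string(t.stride);
        }
    }
    if (folded != 0) {
        expr += " + " + std::to_string(folded);
    }
    return expr;
}
```

For A[x][0][1], calling both functions on the terms (x, 256), (0, 128), (1, 64) yields the two forms quoted above: the naive version keeps two multiplications by constants, while the folded version leaves only the runtime term plus one constant offset.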
Benchmark Results (After Sprint 1)