Update the matrix multiplication benchmark to the form used in the final version of the paper, with more scalars and ops and one more matrix. Also clean things up a bit to better control the number of test runs for each matrix and manage those runs more automatically.