Time array initialisation and read back

It may be beneficial to provide timing for init_arrays and read_arrays. This is useful for measuring the migration performance of USM models.

In the extreme case, a page migration heuristic that pins data on the device and never migrates it to the host will show normal bandwidth for the five kernels, but the benchmark will take considerably longer to actually complete. Most of that time will be spent copying between host and device (in init_arrays and read_arrays).
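A minimal sketch of what this could look like, assuming a wall-clock timer around the two calls. `HostStream`, `TransferTimings`, `time_seconds` and `time_init_and_read` are hypothetical names made up for illustration, not the existing BabelStream interface; only `init_arrays` and `read_arrays` come from the text above, and their signatures here are assumed. The stand-in backend is host-only, so it shows where the timers would sit rather than real migration cost.

```cpp
#include <chrono>
#include <cstddef>
#include <utility>
#include <vector>

// Run `fn` once and return the elapsed wall-clock time in seconds.
template <typename Fn>
double time_seconds(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    std::forward<Fn>(fn)();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical host-only stand-in for a benchmark backend (assumption).
struct HostStream {
    std::vector<double> a, b, c;

    explicit HostStream(std::size_t n) : a(n), b(n), c(n) {}

    // Fill the three arrays with their initial values.
    void init_arrays(double init_a, double init_b, double init_c) {
        for (std::size_t i = 0; i < a.size(); ++i) {
            a[i] = init_a;
            b[i] = init_b;
            c[i] = init_c;
        }
    }

    // Copy the three arrays back into caller-owned host buffers.
    void read_arrays(std::vector<double>& host_a,
                     std::vector<double>& host_b,
                     std::vector<double>& host_c) const {
        host_a = a;
        host_b = b;
        host_c = c;
    }
};

// Timings reported separately from the kernel bandwidth figures.
struct TransferTimings {
    double init_s;  // seconds spent in init_arrays
    double read_s;  // seconds spent in read_arrays
};

TransferTimings time_init_and_read(HostStream& stream,
                                   std::vector<double>& host_a,
                                   std::vector<double>& host_b,
                                   std::vector<double>& host_c) {
    TransferTimings t{};
    t.init_s = time_seconds([&] { stream.init_arrays(0.1, 0.2, 0.0); });
    // The five kernels would run (and be timed as usual) at this point.
    t.read_s = time_seconds([&] { stream.read_arrays(host_a, host_b, host_c); });
    return t;
}
```

For a device backend, each timed region would presumably also need to include whatever synchronisation guarantees the copy or migration has finished, otherwise an asynchronous runtime could make both numbers look close to zero.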

UoB-HPC / BabelStream

Time array initialisation and read back #161