qiboteam / qibojit-benchmarks

Benchmark code for qibojit performance assessment
Apache License 2.0

simulation times from `library_benchmark` may be misleading #45

Open migueldiascosta opened 9 months ago

migueldiascosta commented 9 months ago

benchmarks run with `library_benchmark` seem to count the time needed to transfer the state and convert it to a numpy array as part of the simulation time, rather than as part of the transfer time, which is what `circuit_benchmark` does

since the conversion to a numpy array is a single-threaded CPU task, it can dominate the actual simulation time and hide the differences one is trying to compare

after an exchange with @scarrazza, I suppose this may be an intentional difference between `library_benchmark` (called from e.g. compare.py, which itself is called from e.g. scripts/qibojit.sh) and `circuit_benchmark` (called from main.py), so there may be nothing to fix, only something to document more clearly
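To make the distinction concrete, here is a minimal sketch of timing the two phases separately. It is not the repository's code: `timed_run`, `fake_simulate` and `fake_convert` are hypothetical names, and the stand-in functions only imitate a library-native state and its conversion to a host-side container.

```python
import time


def timed_run(simulate, convert):
    """Time simulation and state conversion separately.

    `simulate` and `convert` are hypothetical callables standing in for a
    library's circuit execution and its state-to-host conversion.
    """
    start = time.perf_counter()
    state = simulate()
    simulation_time = time.perf_counter() - start

    start = time.perf_counter()
    host_state = convert(state)
    transfer_time = time.perf_counter() - start

    timings = {
        "simulation_time": simulation_time,
        "transfer_time": transfer_time,
    }
    return host_state, timings


# Stand-ins for illustration only: a tuple plays the role of the
# library-native state, a list the role of the converted host array.
def fake_simulate():
    return tuple(0.0 for _ in range(1 << 10))


def fake_convert(state):
    return list(state)


result, timings = timed_run(fake_simulate, fake_convert)
print(timings)
```

Wrapping both calls in a single timer would report only the sum, which is the situation described above. For libraries that execute asynchronously on a GPU, the first timer would presumably also need a device synchronization before it is stopped, otherwise part of the simulation cost leaks into the transfer figure.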

stavros11 commented 9 months ago

Thanks @migueldiascosta for raising this point. Indeed, there is an asymmetry between the benchmark for qibo backends (`circuit_benchmark`) and the one for other libraries (`library_benchmark`).

I cannot think of a reason why this would be intentional, other than that separating these two times may have required specialized treatment for each library. With the current structure, this could have been done by adding a `to_numpy` method to each backend under the `libraries/` folder, but I guess we were not very interested in this figure for other libraries during the benchmark.
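As a rough illustration of the `to_numpy` idea, the sketch below shows one possible shape for such a per-library hook. The class names `LibraryBackend` and `ListStateBackend` and the `execute` method are assumptions made for this example and do not reflect the actual classes under `libraries/`.

```python
from abc import ABC, abstractmethod

import numpy as np


class LibraryBackend(ABC):
    """Hypothetical base class for a library wrapper."""

    @abstractmethod
    def execute(self, nqubits):
        """Run the simulation and return the library-native state."""

    @abstractmethod
    def to_numpy(self, state):
        """Convert the library-native state to a np.ndarray."""


class ListStateBackend(LibraryBackend):
    """Toy backend whose native state is a plain Python list."""

    def execute(self, nqubits):
        # Build the |00...0> state as a list of complex amplitudes.
        state = [0j] * (2 ** nqubits)
        state[0] = 1 + 0j
        return state

    def to_numpy(self, state):
        # The conversion is isolated here so it can be timed on its own.
        return np.asarray(state, dtype=np.complex128)


backend = ListStateBackend()
native = backend.execute(3)
array = backend.to_numpy(native)
print(type(native).__name__, array.shape)
```

With a hook like this, the benchmark driver could time `execute` and `to_numpy` independently for every library, at the cost of writing one conversion per wrapper.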

In the end it is a matter of defining the goal/figure of merit of the benchmark. For this benchmark the goal (expected outcome) for each library was set to be a `np.ndarray` in order to be fair; otherwise each library could return its own data type, with more or less functionality for post-processing. If we are only interested in simulation, then we should indeed discard the transfer time. We should keep in mind, though, that in practice one may want to print the state, save it to disk or calculate something with it after simulation, so some transfer time may be involved. How much will most likely depend on the library, the operation to be done and how efficiently it is implemented.