hillmich closed this pull request 4 years ago
You are right that the mean is a bad estimator. Unfortunately, Google Benchmark is hardwired to it (it saves the total time and divides by the number of iterations). The only viable alternative I found is Facebook's folly, and it seems to do the same thing.
A possible approach is to run multiple repetitions (each consisting of a number of iterations) and take the minimum of the per-repetition mean values. A quick test with 5 repetitions showed that the standard deviation is consistently below 1 ns, so I doubt this will actually lead to a significant change in the results.
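For illustration, the min-of-means aggregation could be sketched like this in Python (a standalone sketch, not the Google Benchmark internals; all names here are made up):

```python
import statistics

def min_of_means(repetitions):
    """Take the mean within each repetition, then keep the minimum
    across repetitions. `repetitions` is a list of lists of
    per-iteration times (e.g. in nanoseconds)."""
    return min(statistics.mean(rep) for rep in repetitions)

# Example: three repetitions; the second was slowed down by
# background load, so the min discards its inflated mean.
samples = [
    [10.1, 10.2, 10.0],   # quiet machine
    [14.9, 15.3, 15.1],   # overloaded machine
    [10.2, 10.1, 10.3],   # quiet machine
]
print(round(min_of_means(samples), 3))  # -> 10.1
```

The idea is that transient system load can only make a repetition slower, never faster, so the smallest per-repetition mean is the best estimate of the undisturbed runtime.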
I'm looking forward to hearing what you think :)
> Unfortunately, Google Benchmark is hardwired to it
Yeah, that's very unfortunate. I believe the fluctuation in benchmark times is mainly caused by system scheduling and machine performance variation, especially when occasional heavy tasks on the machine slow down the benchmark for a few samples. So a small deviation on a single machine doesn't mean the benchmark result is never affected, but if the runner is careful about that, it is probably fine in most cases. Let's add a footnote for this on the results page, maybe?
> Let's add a footnote for this on the results page, maybe?
Good idea, I've added a footnote.
Maybe the people behind Google Benchmark can be persuaded to include the min estimator, given the explanation provided by your link :)
Nice work! Thanks. I'll let you know when I finish running the new benchmark on our test machine.
Hi,
tl;dr: this PR does two things:
If you consider merging but something is bothering you, please don't hesitate to reach out.
Longer version: The code for JKQ-DDSIM is found in the `jkq-ddsim` directory and is accompanied by a README file. `install.sh` clones the git repository (or does a pull if called more than once) and builds the binary for the benchmark. `benchmarks.sh` calls the binary and sets the parameters such that the results are saved as a JSON file.

I implemented the benchmark to match my understanding of how it should be performed: circuit generation (`QuantumCircuit`) is not included in the measurements, only the actual simulation time. The benchmark function mimics what would happen when a user executes the simulator with a circuit description as input, i.e. construct the `QuantumCircuit`, initialize the simulator instance, and finally perform the simulation. `benchmark` and `plot` in `bin/` have been adjusted to incorporate the new tool.

The results you can see in this pull request are incomplete. Unfortunately, I do not have CUDA available on my machine, qulacs wouldn't install, and quest/pyquest-cffi wouldn't run. Hence these are commented out in the scripts. I would ask you to re-enable them and run the evaluation on your end (to additionally confirm we didn't cheat ;) ). You may want to change the maximum number of qubits for our QCBM benchmark, as for >20 qubits it takes ages. Decision diagrams are not suited for random circuits.
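A minimal Python sketch of that measurement structure (the class and function names below are illustrative stand-ins, not the PR's actual code; only what is timed versus not timed matters here):

```python
import time

# Hypothetical stand-ins for the real circuit and simulator classes.
class QuantumCircuit:
    def __init__(self, gates):
        self.gates = gates

class Simulator:
    def __init__(self, circuit):
        self.circuit = circuit

    def simulate(self):
        # Placeholder for the actual simulation work.
        return len(self.circuit.gates)

def run_benchmark(gates):
    # Circuit construction and simulator setup are NOT measured ...
    circuit = QuantumCircuit(gates)
    sim = Simulator(circuit)
    # ... only the simulation itself is.
    start = time.perf_counter()
    result = sim.simulate()
    elapsed = time.perf_counter() - start
    return result, elapsed

result, elapsed = run_benchmark(["H", "CNOT", "T"])
assert elapsed >= 0.0
```

Keeping construction outside the timed region ensures the reported numbers reflect only simulation cost, while the benchmark function as a whole still exercises the full user-facing flow.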
Lastly, a quick check against the contribution guide:

- `benchmark_all` and `benchmark_all_parallel`
- `benchmark_misc`, which delegates to a shell script `jkq-ddsim/benchmarks.sh`
- `data/jkq-ddsim.json`
- `X`, `H`, `T`, `CNOT`, and `Toffoli` for single qubits, as well as `QCBM` for the parameterized circuit

Best regards,
Stefan