Closed kdk closed 1 year ago
Here is the result of profiling before and after the commit sorted by the functions observing the biggest time difference;
filename(function) tottime_9165f396 tottime_9e78907d tottime_diff
qiskit/circuit/gate.py(_broadcast_2_arguments) 0.233 0.061 0.172
qiskit/circuit/quantumcircuit.py(_check_qargs) 0.208 0.102 0.106
qiskit/circuit/quantumcircuit.py(<genexpr>) 0.116 0.012 0.104
qiskit/circuit/quantumcircuit.py(<genexpr>) 0.116 0.015 0.101
qiskit/circuit/controlledgate.py(__init__) 0.165 0.072 0.093
qiskit/circuit/quantumcircuit.py(<listcomp>) 0.075 0.000 0.075
qiskit/circuit/quantumcircuit.py(<listcomp>) 0.075 0.000 0.075
qiskit/circuit/quantumcircuit.py(<listcomp>) 0.075 0.000 0.075
qiskit/circuit/quantumcircuit.py(<genexpr>) 0.116 0.056 0.060
qiskit/circuit/quantumcircuit.py(<listcomp>) 0.075 0.020 0.055
qiskit/circuit/library/standard_gates/x.py(__init__) 0.090 0.039 0.051
qiskit/circuit/quantumcircuit.py(<genexpr>) 0.058 0.012 0.046
qiskit/circuit/controlledgate.py(definition) 0.093 0.049 0.044
qiskit/circuit/quantumcircuit.py(<genexpr>) 0.058 0.015 0.043
qiskit/circuit/instruction.py(params) 0.056 0.019 0.037
qiskit/circuit/quantumcircuit.py(append) 0.427 0.393 0.034
{built-in method builtins.isinstance} 0.223 0.190 0.033
qiskit/circuit/quantumcircuit.py(_append) 0.243 0.218 0.025
qiskit/circuit/controlledgate.py(num_ctrl_qubits) 0.030 0.009 0.021
qiskit/circuit/quantumcircuit.py(<listcomp>) 0.020 0.000 0.020
❓ ✅ PR #4392 does bring in lots of performance regression flags on asv (210 to be precise) Link - https://qiskit.github.io/qiskit/#regressions?sort=1&dir=desc and search for 9165f396
I am not sure how to classify this since the past data with the test system stops at Terra 0.18.1 with a change in python version and a newer system for the later tests (as I understand, the CPU upgrade has a newer architecture and I am not sure if there was a change in its RAM type which may not be representative of a proper comparison of the data).
Its a bit difficult to quantify for the current version since the testing bench suite has changed the hardware significantly. Personally, if I were to extrapolate the upgrade with the lower execution times around 0.18.1, I would probably also consider the upgraded hardware to contribute to the recorded speedup in the later testing and if we are going by the trends prior to the hardware upgrade, the spike loss that was present still somewhat carries over and as I understand, there have been a few files here that have been updated to improve performance or some have degraded due to addition of features.
It could be worthwhile to look into the parts that weren't touched much since PR 4392 to see if there could be a culprit caught.
Will update comment/add more comments if cause is found/fixed!
Python version 3.9.7 qiskit-terra version: 0.19.1
Based on https://qiskit.github.io/qiskit/#circuit_construction.CircuitConstructionBench.time_circuit_construction?machine=dedicated-benchmarking-softlayer-baremetal&os=Linux%204.15.0-46-generic&ram=16GB&p-width=8&p-gates=32768&commits=9165f396, the performance from that before #4392 was not recovered yet (actually, now is even slightly worse than immediately after #4392). Therefore, I think the issue is still valid. However, I'm not sure if would be possible or worthy to hunt down the reason at this point. @kdk ?
It's not actually as clear cut as that graph would make it seem. The benchmark systems have been moved around a bunch since this was opened. The graph you're looking at stopped reporting data because the server was retired. If you look at:
that shows the replacement system continuing in a different color after the first server was retired. However since it's different hardware and a newer python version it's hard to tell where the performance levels match up. We have seen general improvements in construction time since this was opened (primarily from #7618) but whether this has recovered from that regression will be hard to tell short of doing fresh a/b comparison benchmarks to compare before the regression, to after the regression, and against current main on a consistent environment.
I'm going to close this issue as stale now, since we've lost the point of comparison - it's hard to classify it as a regression any more. We're still interested in improving performance of our core methods, but this issue highlights a specific regression that is no longer relevant.
Information
What is the current behavior?
The benchmarks flagged a 5-40% regression across the circuit construction, assembly/disassembly, and converter benchmarks following #4392 :
https://qiskit.github.io/qiskit/#regressions?sort=1&dir=desc
One example:
https://qiskit.github.io/qiskit/#circuit_construction.CircuitConstructionBench.time_circuit_construction?machine=dedicated-benchmarking-softlayer-baremetal&os=Linux%204.15.0-46-generic&ram=16GB&p-width=8&p-gates=32768&commits=9165f396