Open mtreinish opened 6 days ago
One or more of the following people are relevant to this code:
@Cryoris
@Qiskit/terra-core
@ajavadia
@kevinhartman
@levbishop
@mtreinish
Changes Missing Coverage | Covered Lines | Changed/Added Lines | % | ||
---|---|---|---|---|---|
crates/accelerate/src/euler_one_qubit_decomposer.rs | 218 | 219 | 99.54% | ||
qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py | 36 | 38 | 94.74% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 2 | 15 | 13.33% | ||
crates/circuit/src/dag_node.rs | 25 | 39 | 64.1% | ||
crates/circuit/src/operations.rs | 30 | 46 | 65.22% | ||
<!-- | Total: | 354 | 400 | 88.5% | --> |
Files with Coverage Reduction | New Missed Lines | % | ||
---|---|---|---|---|
qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py | 1 | 95.35% | ||
crates/qasm2/src/lex.rs | 3 | 93.38% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 4 | 70.67% | ||
crates/accelerate/src/euler_one_qubit_decomposer.rs | 4 | 91.37% | ||
crates/qasm2/src/parse.rs | 6 | 97.61% | ||
<!-- | Total: | 18 | --> |
Totals | |
---|---|
Change from base Build 9650973845: | 0.04% |
Covered Lines: | 64062 |
Relevant Lines: | 71342 |
This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Changes Missing Coverage | Covered Lines | Changed/Added Lines | % | ||
---|---|---|---|---|---|
crates/accelerate/src/euler_one_qubit_decomposer.rs | 217 | 218 | 99.54% | ||
qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py | 40 | 42 | 95.24% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 2 | 15 | 13.33% | ||
crates/circuit/src/dag_node.rs | 25 | 39 | 64.1% | ||
crates/circuit/src/operations.rs | 30 | 46 | 65.22% | ||
<!-- | Total: | 357 | 403 | 88.59% | --> |
Files with Coverage Reduction | New Missed Lines | % | ||
---|---|---|---|---|
qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py | 1 | 95.45% | ||
crates/qasm2/src/expr.rs | 1 | 94.02% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 4 | 70.67% | ||
crates/qasm2/src/lex.rs | 4 | 91.86% | ||
crates/accelerate/src/euler_one_qubit_decomposer.rs | 4 | 91.35% | ||
crates/qasm2/src/parse.rs | 6 | 97.61% | ||
<!-- | Total: | 20 | --> |
Totals | |
---|---|
Change from base Build 9650973845: | 0.03% |
Covered Lines: | 64056 |
Relevant Lines: | 71343 |
I took a quick attempt at implementing this with multithreading locally and I could get it to work but it required doing extra collection of all the data from the collect runs into a second vec to strip the PyRef
from each node. This added more overhead than parallelism saved in this case. I think it should be possible to investigate doing this in parallel after #12550 merges because then we can generate the runs from rust directly. For multithreading we'll likely still need to collect the runs into a vec (to avoid mutation while iterating) but we can measure if that overhead is higher than the speedup from doing the synthesis in parallel and investigate in a followup.
I threw together a quick benchmark to test this:
import statistics
import time
from qiskit.transpiler.passes import Optimize1qGatesDecomposition
from qiskit.circuit.random import random_circuit
from qiskit.converters import circuit_to_dag
seed = 42
n_qubits = 20
depth = 102400
qc = random_circuit(n_qubits, depth, measure=True, seed=seed)
dag = circuit_to_dag(qc)
basis_gates = ['sx', 'x', 'rz', 'cx', 'id']
opt_pass = Optimize1qGatesDecomposition(basis=basis_gates)
times = []
for i in range(10):
start = time.perf_counter()
opt_pass.run(dag)
stop = time.perf_counter()
times.append(stop - start)
print(f"Mean runtime: {statistics.geometric_mean(times)}")
and it's showing a pretty noticeable speedup over Qiskit 1.1. With this PR I got:
3.931521232593536 sec
and with Qiskit 1.1 I got:
6.184648169533495 sec
Looking at a profile with this script the majority of the time with this PR is spent in collect 1q runs and dag interactions in python.
Changes Missing Coverage | Covered Lines | Changed/Added Lines | % | ||
---|---|---|---|---|---|
crates/accelerate/src/euler_one_qubit_decomposer.rs | 234 | 235 | 99.57% | ||
qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py | 40 | 42 | 95.24% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 2 | 15 | 13.33% | ||
crates/circuit/src/dag_node.rs | 25 | 39 | 64.1% | ||
crates/circuit/src/operations.rs | 30 | 46 | 65.22% | ||
<!-- | Total: | 374 | 420 | 89.05% | --> |
Files with Coverage Reduction | New Missed Lines | % | ||
---|---|---|---|---|
qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py | 1 | 95.45% | ||
crates/qasm2/src/lex.rs | 3 | 92.62% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 4 | 70.67% | ||
crates/accelerate/src/euler_one_qubit_decomposer.rs | 4 | 91.51% | ||
<!-- | Total: | 12 | --> |
Totals | |
---|---|
Change from base Build 9661630219: | 0.08% |
Covered Lines: | 64079 |
Relevant Lines: | 71360 |
Changes Missing Coverage | Covered Lines | Changed/Added Lines | % | ||
---|---|---|---|---|---|
crates/accelerate/src/euler_one_qubit_decomposer.rs | 234 | 235 | 99.57% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 2 | 15 | 13.33% | ||
crates/circuit/src/dag_node.rs | 25 | 39 | 64.1% | ||
crates/circuit/src/operations.rs | 30 | 46 | 65.22% | ||
<!-- | Total: | 374 | 418 | 89.47% | --> |
Files with Coverage Reduction | New Missed Lines | % | ||
---|---|---|---|---|
qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py | 1 | 96.92% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 4 | 70.67% | ||
crates/accelerate/src/euler_one_qubit_decomposer.rs | 4 | 91.51% | ||
crates/qasm2/src/lex.rs | 5 | 92.62% | ||
<!-- | Total: | 14 | --> |
Totals | |
---|---|
Change from base Build 9661630219: | 0.09% |
Covered Lines: | 64083 |
Relevant Lines: | 71358 |
To compare this against the current state of main I ran asv for the pass specific benchmarks:
Benchmarks that have improved:
| Change | Before [1ed5951a] <optimize-1q-more-rust~3^2> | After [130969a3] <optimize-1q-more-rust> | Ratio | Benchmark (Parameter) |
|----------|-------------------------------------------------|--------------------------------------------|---------|---------------------------------------------------------------------------------------------------------------|
| - | 134±1ms | 46.9±0.4ms | 0.35 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['rz', 'x', 'sx', 'cx', 'id']) |
| - | 99.3±1ms | 34.0±0.2ms | 0.34 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['rz', 'x', 'sx', 'cx', 'id']) |
| - | 116±0.7ms | 39.0±0.5ms | 0.34 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['u', 'cx', 'id']) |
| - | 85.6±0.3ms | 27.9±0.1ms | 0.33 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['u', 'cx', 'id']) |
| - | 123±1ms | 41.1±0.5ms | 0.33 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(20, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| - | 92.0±0.5ms | 29.6±0.2ms | 0.32 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(14, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| - | 47.4±0.5ms | 12.6±0.04ms | 0.27 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['rz', 'x', 'sx', 'cx', 'id']) |
| - | 43.6±0.3ms | 10.6±0.05ms | 0.24 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| - | 40.7±0.4ms | 9.31±0.06ms | 0.23 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_decompose(5, 1024, ['u', 'cx', 'id']) |
Benchmarks that have stayed the same:
| Change | Before [1ed5951a] <optimize-1q-more-rust~3^2> | After [130969a3] <optimize-1q-more-rust> | Ratio | Benchmark (Parameter) |
|----------|-------------------------------------------------|--------------------------------------------|---------|-----------------------------------------------------------------------------------------------------------------|
| | 2.65±0.01s | 2.66±0.02s | 1 | passes.PassBenchmarks.time_optimize_swap_before_measure(20, 1024) |
| | 231±1ms | 230±1ms | 1 | passes.PassBenchmarks.time_optimize_swap_before_measure(5, 1024) |
| | 737±10ms | 719±6ms | 0.98 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_commutation(14, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| | 1.53±0.01s | 1.51±0s | 0.98 | passes.PassBenchmarks.time_optimize_swap_before_measure(14, 1024) |
| | 1.14±0.01s | 1.11±0s | 0.97 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_commutation(20, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| | 256±2ms | 247±1ms | 0.97 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_commutation(5, 1024, ['rx', 'ry', 'rz', 'r', 'rxx', 'id']) |
| | 822±10ms | 778±3ms | 0.95 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_commutation(14, 1024, ['rz', 'x', 'sx', 'cx', 'id']) |
| | 746±7ms | 706±2ms | 0.95 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_commutation(14, 1024, ['u', 'cx', 'id']) |
| | 1.26±0.01s | 1.20±0s | 0.95 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_commutation(20, 1024, ['rz', 'x', 'sx', 'cx', 'id']) |
| | 1.16±0.02s | 1.10±0s | 0.95 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_commutation(20, 1024, ['u', 'cx', 'id']) |
| | 298±2ms | 277±1ms | 0.93 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_commutation(5, 1024, ['rz', 'x', 'sx', 'cx', 'id']) |
| | 260±2ms | 239±1ms | 0.92 | passes.MultipleBasisPassBenchmarks.time_optimize_1q_commutation(5, 1024, ['u', 'cx', 'id']) |
SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Changes Missing Coverage | Covered Lines | Changed/Added Lines | % | ||
---|---|---|---|---|---|
crates/accelerate/src/euler_one_qubit_decomposer.rs | 234 | 235 | 99.57% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 2 | 15 | 13.33% | ||
crates/circuit/src/dag_node.rs | 25 | 39 | 64.1% | ||
crates/circuit/src/operations.rs | 30 | 46 | 65.22% | ||
<!-- | Total: | 375 | 419 | 89.5% | --> |
Files with Coverage Reduction | New Missed Lines | % | ||
---|---|---|---|---|
qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py | 1 | 96.92% | ||
crates/qasm2/src/expr.rs | 1 | 94.02% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 4 | 70.67% | ||
qiskit/synthesis/discrete_basis/solovay_kitaev.py | 4 | 94.74% | ||
crates/accelerate/src/euler_one_qubit_decomposer.rs | 4 | 91.51% | ||
crates/qasm2/src/lex.rs | 5 | 92.62% | ||
qiskit/providers/fake_provider/generic_backend_v2.py | 10 | 94.71% | ||
<!-- | Total: | 29 | --> |
Totals | |
---|---|
Change from base Build 9661630219: | 0.08% |
Covered Lines: | 64084 |
Relevant Lines: | 71365 |
We should merge this after #12662 merges since this depends on having an RGate implementation in rust. I did it inline here but it would be good to review that in isolation, I'll rebase this after that merges.
Changes Missing Coverage | Covered Lines | Changed/Added Lines | % | ||
---|---|---|---|---|---|
crates/accelerate/src/euler_one_qubit_decomposer.rs | 234 | 235 | 99.57% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 2 | 15 | 13.33% | ||
crates/circuit/src/dag_node.rs | 25 | 39 | 64.1% | ||
<!-- | Total: | 335 | 363 | 92.29% | --> |
Files with Coverage Reduction | New Missed Lines | % | ||
---|---|---|---|---|
qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py | 1 | 96.92% | ||
qiskit/transpiler/passes/synthesis/unitary_synthesis.py | 2 | 88.2% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 4 | 70.67% | ||
crates/qasm2/src/lex.rs | 4 | 92.37% | ||
crates/accelerate/src/euler_one_qubit_decomposer.rs | 4 | 91.51% | ||
crates/qasm2/src/parse.rs | 6 | 97.61% | ||
<!-- | Total: | 21 | --> |
Totals | |
---|---|
Change from base Build 9694988199: | 0.08% |
Covered Lines: | 64110 |
Relevant Lines: | 71373 |
This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.
Changes Missing Coverage | Covered Lines | Changed/Added Lines | % | ||
---|---|---|---|---|---|
crates/accelerate/src/euler_one_qubit_decomposer.rs | 234 | 235 | 99.57% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 2 | 15 | 13.33% | ||
crates/circuit/src/dag_node.rs | 24 | 38 | 63.16% | ||
<!-- | Total: | 334 | 362 | 92.27% | --> |
Files with Coverage Reduction | New Missed Lines | % | ||
---|---|---|---|---|
qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py | 1 | 96.92% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 4 | 70.67% | ||
crates/accelerate/src/euler_one_qubit_decomposer.rs | 4 | 91.51% | ||
crates/qasm2/src/lex.rs | 5 | 92.62% | ||
<!-- | Total: | 14 | --> |
Totals | |
---|---|
Change from base Build 9698604147: | 0.07% |
Covered Lines: | 64109 |
Relevant Lines: | 71370 |
Changes Missing Coverage | Covered Lines | Changed/Added Lines | % | ||
---|---|---|---|---|---|
crates/accelerate/src/euler_one_qubit_decomposer.rs | 234 | 235 | 99.57% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 2 | 15 | 13.33% | ||
crates/circuit/src/dag_node.rs | 24 | 38 | 63.16% | ||
<!-- | Total: | 334 | 362 | 92.27% | --> |
Files with Coverage Reduction | New Missed Lines | % | ||
---|---|---|---|---|
qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py | 1 | 96.92% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 4 | 70.67% | ||
crates/qasm2/src/lex.rs | 4 | 92.37% | ||
crates/accelerate/src/euler_one_qubit_decomposer.rs | 4 | 91.51% | ||
<!-- | Total: | 13 | --> |
Totals | |
---|---|
Change from base Build 9703107599: | 0.08% |
Covered Lines: | 64076 |
Relevant Lines: | 71341 |
Changes Missing Coverage | Covered Lines | Changed/Added Lines | % | ||
---|---|---|---|---|---|
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 2 | 15 | 13.33% | ||
crates/circuit/src/dag_node.rs | 24 | 38 | 63.16% | ||
<!-- | Total: | 334 | 361 | 92.52% | --> |
Files with Coverage Reduction | New Missed Lines | % | ||
---|---|---|---|---|
qiskit/transpiler/passes/optimization/optimize_1q_decomposition.py | 1 | 96.92% | ||
crates/qasm2/src/expr.rs | 1 | 94.02% | ||
qiskit/synthesis/one_qubit/one_qubit_decompose.py | 4 | 70.67% | ||
crates/accelerate/src/euler_one_qubit_decomposer.rs | 4 | 93.41% | ||
crates/qasm2/src/lex.rs | 7 | 91.6% | ||
crates/qasm2/src/parse.rs | 18 | 95.77% | ||
<!-- | Total: | 35 | --> |
Totals | |
---|---|
Change from base Build 9747324734: | 0.03% |
Covered Lines: | 64655 |
Relevant Lines: | 71954 |
Summary
This commit moves to using rust gates for the Optimize1QGatesDecomposition transpiler pass. It takes in a sequence of runs (which are a list of DAGOpNodes) from the python side of the transpiler pass which are generated from DAGCircuit.collect_1q_runs() (which in the future should be moved to rust after #12550 merges). The rust portion of the pass now iterates over each run, performs the matrix multiplication to compute the unitary of the run, then synthesizes that unitary, computes the estimated error of the circuit synthesis and returns a tuple of the circuit sequence in terms of rust StandardGate enums. The python portion of the code then takes those sequences and does inplace substitution of each run with the sequence returned from rust.
Once #12550 merges we should be able to move the input collect_1q_runs() call and perform the output node substitions in rust making the full pass execute in the rust domain without any python interaction.
Additionally, the OneQubitEulerDecomposer class is updated to use rust for circuit generation instead of doing this python side. The internal changes done to use rust gates in the transpiler pass meant we were half way to this already by emitting rust StandardGates instead of python gate objects. The dag handling is still done in Python however until #12550 merges.
~This also includes an implementation of the r gate, I temporarily added this to unblock this effort as it was the only gate missing needed to complete this. We can rebase this if a standalone implementation of the gate merges before this.~ Rebased on top of: https://github.com/Qiskit/qiskit/pull/12662
Details and comments
TODO: