Closed xumingkuan closed 11 months ago
Benchmark: qft, 33 qubits, 28 local qubits.

Before:
2 stages.
Kernel info: 11 kernels (8 fusion, 3 shared-memory), cost = 271.5, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 4 kernels (4 fusion, 0 shared-memory), cost = 25.4, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
0 seconds.
After (20 random permutations for each stage, taking ~5 seconds to run in total):
Kernel info: 16 kernels (11 fusion, 5 shared-memory), cost = 304.2, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 15 kernels (11 fusion, 4 shared-memory), cost = 299.5, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 12 kernels (8 fusion, 4 shared-memory), cost = 281, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 16 kernels (12 fusion, 4 shared-memory), cost = 297.8, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 15 kernels (11 fusion, 4 shared-memory), cost = 298.2, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 14 kernels (10 fusion, 4 shared-memory), cost = 293, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 13 kernels (9 fusion, 4 shared-memory), cost = 287.4, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 16 kernels (12 fusion, 4 shared-memory), cost = 298.1, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 15 kernels (10 fusion, 5 shared-memory), cost = 299.3, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 15 kernels (9 fusion, 6 shared-memory), cost = 299.4, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 12 kernels (7 fusion, 5 shared-memory), cost = 288.1, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 13 kernels (9 fusion, 4 shared-memory), cost = 285.6, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 15 kernels (11 fusion, 4 shared-memory), cost = 299.7, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 14 kernels (9 fusion, 5 shared-memory), cost = 292.4, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 16 kernels (12 fusion, 4 shared-memory), cost = 301.3, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 14 kernels (10 fusion, 4 shared-memory), cost = 291.3, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 14 kernels (10 fusion, 4 shared-memory), cost = 293.5, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 15 kernels (12 fusion, 3 shared-memory), cost = 295.1, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 14 kernels (9 fusion, 5 shared-memory), cost = 296.4, local qubits 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
Kernel info: 4 kernels (4 fusion, 0 shared-memory), cost = 25.6, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 5 kernels (5 fusion, 0 shared-memory), cost = 31.6, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 4 kernels (4 fusion, 0 shared-memory), cost = 25.6, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 4 kernels (4 fusion, 0 shared-memory), cost = 25.4, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 4 kernels (4 fusion, 0 shared-memory), cost = 25.4, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 4 kernels (4 fusion, 0 shared-memory), cost = 25.4, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 5 kernels (5 fusion, 0 shared-memory), cost = 31.6, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 5 kernels (5 fusion, 0 shared-memory), cost = 31.6, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 5 kernels (5 fusion, 0 shared-memory), cost = 31.6, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 4 kernels (4 fusion, 0 shared-memory), cost = 25.6, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 4 kernels (4 fusion, 0 shared-memory), cost = 25.6, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 4 kernels (4 fusion, 0 shared-memory), cost = 25.4, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 5 kernels (5 fusion, 0 shared-memory), cost = 31.6, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 4 kernels (4 fusion, 0 shared-memory), cost = 25.4, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 4 kernels (4 fusion, 0 shared-memory), cost = 25.6, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 4 kernels (4 fusion, 0 shared-memory), cost = 25.4, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 4 kernels (4 fusion, 0 shared-memory), cost = 25.6, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 4 kernels (4 fusion, 0 shared-memory), cost = 25.4, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Kernel info: 5 kernels (5 fusion, 0 shared-memory), cost = 31.6, local qubits 5 6 7 8 9 10 11 12 13 28 29 30 31 32 19 20 21 22 23 24 25 26 27 0 1 2 3 4
Most random permutations are worse than the initially given sequence, and none is better, even after permuting 100 times (which takes 26 seconds).
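The comparison above amounts to a best-of-N search that keeps the given sequence as the baseline. Below is a minimal sketch of that loop, not the code in this PR: `best_of_random_permutations`, `cost`, and `permute` are hypothetical names, with `cost` standing in for whatever produces the kernel cost in the logs and `permute` for a random reordering of the sequence.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper, not part of the PR.
// Tries `num_trials` permutations of `initial` and returns the cheapest
// sequence seen together with its cost. The initial sequence is the baseline,
// so the result is never worse than the input.
template <typename Seq, typename Cost, typename Permute>
std::pair<Seq, double> best_of_random_permutations(const Seq& initial,
                                                   Cost cost,
                                                   Permute permute,
                                                   int num_trials) {
  assert(num_trials >= 0);
  Seq best = initial;
  double best_cost = cost(initial);
  for (int t = 0; t < num_trials; ++t) {
    Seq candidate = permute(initial);  // always permute the original input
    const double c = cost(candidate);
    if (c < best_cost) {               // strict: ties keep the earlier winner
      best = std::move(candidate);
      best_cost = c;
    }
  }
  return {std::move(best), best_cost};
}
```

With a baseline like this, a run in which every permutation costs more than the given order, as in the qft benchmark, simply returns the given order unchanged.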
Global changes:
- Add CircuitSeq::random_gate_permutation() to randomly select a valid topological order as the sequence.

Changes specific to the simulator:
- Use std::unique_ptr for CircuitSeq in Schedule.
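To illustrate what "a valid topological order" means for a gate sequence, here is a self-contained sketch. It is not the implementation of CircuitSeq::random_gate_permutation(): the `Gate` alias and `random_topological_order` are hypothetical, and the sketch assumes that two gates depend on each other exactly when they share a qubit.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical stand-in for a gate: the list of qubit indices it acts on.
using Gate = std::vector<int>;

// Returns a random ordering of gate indices in which any two gates that share
// a qubit keep their original relative order.
std::vector<std::size_t> random_topological_order(const std::vector<Gate>& gates,
                                                  int num_qubits,
                                                  std::mt19937& rng) {
  const std::size_t n = gates.size();
  std::vector<std::vector<std::size_t>> successors(n);
  std::vector<int> in_degree(n, 0);
  // last_gate[q] is the most recent gate seen on qubit q (n means none yet).
  std::vector<std::size_t> last_gate(num_qubits, n);
  for (std::size_t i = 0; i < n; ++i) {
    for (int q : gates[i]) {
      if (last_gate[q] != n && last_gate[q] != i) {
        successors[last_gate[q]].push_back(i);
        ++in_degree[i];
      }
      last_gate[q] = i;
    }
  }

  // Gates with no unscheduled predecessor are ready to be emitted.
  std::vector<std::size_t> ready;
  for (std::size_t i = 0; i < n; ++i) {
    if (in_degree[i] == 0) ready.push_back(i);
  }

  std::vector<std::size_t> order;
  order.reserve(n);
  while (!ready.empty()) {
    // Pick one ready gate at random and remove it in O(1) by swapping to the back.
    std::uniform_int_distribution<std::size_t> pick(0, ready.size() - 1);
    std::swap(ready[pick(rng)], ready.back());
    const std::size_t g = ready.back();
    ready.pop_back();
    order.push_back(g);
    for (std::size_t next : successors[g]) {
      if (--in_degree[next] == 0) ready.push_back(next);
    }
  }
  assert(order.size() == n);  // edges only point forward, so there is no cycle
  return order;
}
```

Choosing uniformly among the ready gates at each step does not sample uniformly over all topological orders; it only guarantees that every result is valid.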