Open GiggleLiu opened 5 years ago
julia> @benchmark zero_state(n, 1000) |> cu |> $(qcbm.circuit) seconds = 2
BenchmarkTools.Trial:
memory estimate: 16.70 MiB
allocs estimate: 7278
--------------
minimum time: 4.623 ms (0.00% GC)
median time: 10.226 ms (8.24% GC)
mean time: 11.168 ms (9.86% GC)
maximum time: 81.029 ms (89.50% GC)
--------------
samples: 180
evals/sample: 1
julia> @benchmark zero_state(n, 1000) |> $(cqcbm.circuit) seconds = 2
BenchmarkTools.Trial:
memory estimate: 8.02 MiB
allocs estimate: 2478
--------------
minimum time: 345.571 ms (0.00% GC)
median time: 360.031 ms (0.00% GC)
mean time: 358.910 ms (0.70% GC)
maximum time: 369.374 ms (4.10% GC)
--------------
samples: 6
evals/sample: 1
julia> @benchmark zero_state(n) |> cu |> $(qcbm.circuit) seconds = 2
BenchmarkTools.Trial:
memory estimate: 1.07 MiB
allocs estimate: 7121
--------------
minimum time: 1.597 ms (0.00% GC)
median time: 1.743 ms (0.00% GC)
mean time: 1.957 ms (8.67% GC)
maximum time: 77.709 ms (96.39% GC)
--------------
samples: 1021
evals/sample: 1
julia> @benchmark zero_state(n) |> $(cqcbm.circuit) seconds = 2
BenchmarkTools.Trial:
memory estimate: 234.52 KiB
allocs estimate: 2781
--------------
minimum time: 205.896 μs (0.00% GC)
median time: 212.959 μs (0.00% GC)
mean time: 247.828 μs (13.21% GC)
maximum time: 75.570 ms (99.60% GC)
--------------
samples: 8002
evals/sample: 1
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 48
On-line CPU(s) list: 0-47
Thread(s) per core: 2
Core(s) per socket: 12
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Stepping: 1
CPU MHz: 2523.984
BogoMIPS: 4401.45
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 30720K
NUMA node0 CPU(s): 0-11,24-35
NUMA node1 CPU(s): 12-23,36-47
GPU:
Model: Tesla P100-PCIE-12GB
IRQ: 74
GPU UUID: GPU-a78e4979-19e4-4d0e-ebc7-66348ddd11b3
Video BIOS: 86.00.3a.00.02
Bus Type: PCIe
DMA Size: 47 bits
DMA Mask: 0x7fffffffffff
Bus Location: 0000:04:00.0
Device Minor: 0
julia> @benchmark zero_state(n, 1000) |> cu |> $(qcbm.circuit) seconds = 2
BenchmarkTools.Trial:
memory estimate: 16.70 MiB
allocs estimate: 7278
--------------
minimum time: 4.713 ms (0.00% GC)
median time: 12.068 ms (7.94% GC)
mean time: 12.484 ms (9.53% GC)
maximum time: 80.318 ms (91.10% GC)
--------------
samples: 161
evals/sample: 1
julia> @benchmark zero_state(n, 1000) |> $(cqcbm.circuit) seconds = 2
BenchmarkTools.Trial:
memory estimate: 8.02 MiB
allocs estimate: 2478
--------------
minimum time: 382.711 ms (0.00% GC)
median time: 384.631 ms (0.00% GC)
mean time: 386.760 ms (0.65% GC)
maximum time: 396.166 ms (3.78% GC)
--------------
samples: 6
evals/sample: 1
julia> @benchmark zero_state(n) |> cu |> $(qcbm.circuit) seconds = 2
BenchmarkTools.Trial:
memory estimate: 1.07 MiB
allocs estimate: 7121
--------------
minimum time: 1.620 ms (0.00% GC)
median time: 1.674 ms (0.00% GC)
mean time: 1.900 ms (8.67% GC)
maximum time: 77.474 ms (96.30% GC)
--------------
samples: 1051
evals/sample: 1
julia> @benchmark zero_state(n) |> $(cqcbm.circuit) seconds = 2
BenchmarkTools.Trial:
memory estimate: 234.52 KiB
allocs estimate: 2781
--------------
minimum time: 209.335 μs (0.00% GC)
median time: 216.347 μs (0.00% GC)
mean time: 251.826 μs (13.06% GC)
maximum time: 75.510 ms (99.59% GC)
--------------
samples: 7876
evals/sample: 1
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 1
Core(s) per socket: 12
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Stepping: 1
CPU MHz: 1200.031
BogoMIPS: 4401.55
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 30720K
NUMA node0 CPU(s): 0-11
NUMA node1 CPU(s): 12-23
GPU:
Model: Tesla M40
IRQ: 74
GPU UUID: GPU-????????-????-????-????-????????????
Video BIOS: ??.??.??.??.??
Bus Type: PCIe
DMA Size: 40 bits
DMA Mask: 0xffffffffff
Bus Location: 0000:04:00.0
Device Minor: 0
julia> @benchmark zero_state(n, 1000) |> cu |> $(qcbm.circuit) seconds = 2
BenchmarkTools.Trial:
memory estimate: 16.70 MiB
allocs estimate: 7278
--------------
minimum time: 4.884 ms (0.00% GC)
median time: 6.476 ms (17.07% GC)
mean time: 6.986 ms (18.01% GC)
maximum time: 110.769 ms (94.64% GC)
--------------
samples: 286
evals/sample: 1
julia> @benchmark zero_state(n, 1000) |> $(cqcbm.circuit) seconds = 2
BenchmarkTools.Trial:
memory estimate: 8.02 MiB
allocs estimate: 2478
--------------
minimum time: 396.184 ms (0.00% GC)
median time: 397.585 ms (0.00% GC)
mean time: 401.956 ms (1.01% GC)
maximum time: 418.597 ms (4.83% GC)
--------------
samples: 5
evals/sample: 1
julia> @benchmark zero_state(n) |> cu |> $(qcbm.circuit) seconds = 2
BenchmarkTools.Trial:
memory estimate: 1.07 MiB
allocs estimate: 7121
--------------
minimum time: 1.635 ms (0.00% GC)
median time: 2.631 ms (0.00% GC)
mean time: 3.069 ms (11.31% GC)
maximum time: 114.555 ms (96.01% GC)
--------------
samples: 651
evals/sample: 1
julia> @benchmark zero_state(n) |> $(cqcbm.circuit) seconds = 2
BenchmarkTools.Trial:
memory estimate: 234.52 KiB
allocs estimate: 2781
--------------
minimum time: 236.792 μs (0.00% GC)
median time: 453.363 μs (0.00% GC)
mean time: 523.526 μs (13.50% GC)
maximum time: 109.883 ms (99.48% GC)
--------------
samples: 3792
evals/sample: 1
CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 48
On-line CPU(s) list: 0-47
Thread(s) per core: 2
Core(s) per socket: 12
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
Stepping: 1
CPU MHz: 2499.921
CPU max MHz: 2900.0000
CPU min MHz: 1200.0000
BogoMIPS: 4401.27
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 30720K
NUMA node0 CPU(s): 0-11,24-35
NUMA node1 CPU(s): 12-23,36-47
GPU:
Model: TITAN V
IRQ: 98
GPU UUID: GPU-f04d8db3-bb77-b4ee-cd2e-b666cd0fd0ea
Video BIOS: 88.00.41.00.12
Bus Type: PCIe
DMA Size: 47 bits
DMA Mask: 0x7fffffffffff
Bus Location: 0000:04:00.0
Device Minor: 0
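To make the numbers easier to compare, here is a small sketch that turns the minimum times reported above into ratios. It assumes each hardware listing describes the benchmark block immediately before it (so the first block is the P100, the second the M40, the third the TITAN V); the `results` table and its field names are made up for this illustration and are not part of the benchmarked code.

```julia
# Minimum times copied from the benchmark output above, converted to milliseconds.
# Assumption: each GPU label belongs to the benchmark block that precedes its listing.
results = [
    (gpu = "Tesla P100", batched_gpu = 4.623, batched_cpu = 345.571,
     single_gpu = 1.597, single_cpu = 0.205896),
    (gpu = "Tesla M40",  batched_gpu = 4.713, batched_cpu = 382.711,
     single_gpu = 1.620, single_cpu = 0.209335),
    (gpu = "TITAN V",    batched_gpu = 4.884, batched_cpu = 396.184,
     single_gpu = 1.635, single_cpu = 0.236792),
]

for r in results
    # Batched runs: how many times faster the GPU is than the CPU.
    batched_speedup = r.batched_cpu / r.batched_gpu
    # Single runs: how many times slower the GPU is than the CPU.
    single_slowdown = r.single_gpu / r.single_cpu
    println(rpad(r.gpu, 12),
            " batched: GPU ", round(batched_speedup, digits = 1), "x faster;",
            " single run: GPU ", round(single_slowdown, digits = 1), "x slower")
end
```

Going by minimum times only, the GPU is well ahead on the batched `zero_state(n, 1000)` runs and behind on the single `zero_state(n)` runs on all three machines. Median and mean times include GC pauses and would shift the ratios somewhat.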
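Setup: 9-qubit QCBM circuit with depth 8.
Each group of results above lists, in order:
Batched Performance (the `zero_state(n, 1000)` runs, with `cu` and without)
Single Run Performance (the `zero_state(n)` runs, with `cu` and without)
Platform (CPU and GPU details)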