Qiskit / qiskit

Qiskit is an open-source SDK for working with quantum computers at the level of extended quantum circuits, operators, and primitives.
https://www.ibm.com/quantum/qiskit
Apache License 2.0

The new StatevectorSampler is significantly slower than the V1 implementation on a small number of qubits #12517

Open adekusar-drl opened 3 weeks ago

adekusar-drl commented 3 weeks ago

What should we add?

The V2 implementation of the Sampler primitive (StatevectorSampler) is significantly slower than the V1 implementation on a small number of qubits.

Here is a script:

import time

import matplotlib.pyplot as plt
import numpy as np
from qiskit import QuantumCircuit
from qiskit.circuit.library import EfficientSU2
from qiskit.primitives import StatevectorSampler, Sampler as RefSampler

def time_ref_v1_sampler(qc: QuantumCircuit, data: np.ndarray, retries=100):
    """Time `retries` runs of the V1 reference Sampler over all rows of `data`."""
    ref_sampler = RefSampler()
    start = time.time()
    for _ in range(retries):
        # V1 interface: a list of circuits and a matching list of parameter sets.
        ref_sampler.run([qc] * len(data), data).result()
    elapsed = time.time() - start
    print(f"Reference V1 Sampler: {elapsed:0.2f} sec")
    return elapsed

def time_ref_v2_sampler(qc: QuantumCircuit, data: np.ndarray, retries=100):
    """Time `retries` runs of the V2 StatevectorSampler over all rows of `data`."""
    sampler_v2 = StatevectorSampler()
    # V2 interface: one pub (circuit, parameter values) per row of `data`.
    pubs = [(qc, data[i, :]) for i in range(len(data))]

    start = time.time()
    for _ in range(retries):
        sampler_v2.run(pubs).result()
    elapsed = time.time() - start
    print(f"Reference V2 Sampler: {elapsed:0.2f} sec")
    return elapsed

def run_comparison():
    """Compare V1 and V2 sampler run times for 5 to 10 qubits and plot them."""
    v1_elapsed = []
    v2_elapsed = []
    qubits = [5, 6, 7, 8, 9, 10]
    for num_q in qubits:
        qc = EfficientSU2(num_q)
        qc.measure_all()
        # 10 random parameter sets, one row per circuit evaluation.
        data = np.random.random((10, qc.num_parameters))

        v1 = time_ref_v1_sampler(qc, data)
        v1_elapsed.append(v1)
        v2 = time_ref_v2_sampler(qc, data)
        v2_elapsed.append(v2)

    # Plot total elapsed time against the number of qubits for both samplers.
    plt.plot(qubits, v1_elapsed, label="v1")
    plt.plot(qubits, v2_elapsed, label="v2")
    plt.xlabel("Num qubits")
    plt.ylabel("Time")
    plt.legend()
    plt.show()

if __name__ == '__main__':
    run_comparison()

Here is the result on a Mac.

[Image: plot of elapsed time versus number of qubits for the v1 and v2 samplers]

For larger circuits, e.g. 15 qubits, the run times are comparable.
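Since the comparison above rests on wall-clock timings, a slightly more robust timing helper may make the gap easier to reproduce. The following is only a sketch, not part of the original script: the name `best_of` and its parameters are hypothetical, and the workload in the `__main__` block is a stand-in for the sampler call. It uses `time.perf_counter`, discards warm-up calls, and reports the fastest of several runs.

```python
import time
from typing import Callable


def best_of(func: Callable[[], object], repeats: int = 5, warmup: int = 1) -> float:
    """Return the fastest wall-clock time in seconds over `repeats` calls of `func`.

    `best_of` is a hypothetical helper for illustration, not a Qiskit API.
    """
    for _ in range(warmup):
        func()  # warm-up calls are not timed (caches, lazy imports)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    # The minimum is less sensitive to background noise than the total.
    return min(timings)


if __name__ == "__main__":
    # Stand-in workload; replace it with the sampler call being measured.
    print(f"{best_of(lambda: sum(range(100_000))):.6f} sec")
```

With the script above, this could be called as `best_of(lambda: sampler_v2.run(pubs).result())` and `best_of(lambda: ref_sampler.run([qc] * len(data), data).result())`, assuming those objects are in scope.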

Cryoris commented 3 weeks ago

After some benchmarking with @ElePT it seems that two possible culprits could be

That being said, in your example a significant chunk of time is spent removing the final measurements, which could perhaps be sped up (e.g. by porting to Rust).