Estimator variance does not decrease with number of shots taken

nonhermitian commented 2 years ago

Describe the bug The variance of an observable should decrease with the number of shots taken. However, this seems to not be the case, eg.


N = 4
qc1 = QuantumCircuit(N)
qc1.x(N-1)
qc1.h(range(N))
qc1.cx(range(N-1),N-1)
qc1.h(range(N-1))

op = Pauli('IZZZ')

estimator = Estimator(circuits=[qc1], observables=[op], service=service, options={ "backend": backend })

out = []
shots = [100, 1000, 2500, 5000, 10000, 25000, 50000, 100000]
for samples in shots:
    results = estimator(circuit_indices=[0], observable_indices=[0], shots=samples)
    out.append(results.metadata[0]['variance'])

gives

suggesting that performing more shots has no additional value in determining the precision of expectation values

Steps to reproduce

Expected behavior

The variance should improve with the number of shots taken.

Suggested solutions

Additional Information

qiskit-ibm-runtime version: 0.4.0
Python version:
Operating system:

rathishcholarajan commented 2 years ago

@t-imamichi @ikkoham Thoughts?

rathishcholarajan commented 2 years ago

@nonhermitian Which backend are you using?

nonhermitian commented 2 years ago

Well it is independent of backend, but those particular results were on Ithaca

rathishcholarajan commented 2 years ago

thanks! (wanted to see if simulator or real backend)

ikkoham commented 2 years ago

Variance is not standard error (SEM). Variance is square of the standard deviation (SD). SEM can be calculated using Variance and shots.I asked the question, "Do we need SD or SEM in beta release? But the answer is not necessary. Personally, I think it is nice to add SEM.

t-imamichi commented 2 years ago

When we developed Estimator, we have not concluded which to include as the additional information to the expectation value, variance, standard error or standard deviation. So, we added variance as part of metadata so that we can modify it without changing the interface in the future.

There is a discussion to update the result class https://github.com/Qiskit/qiskit-terra/pull/8105. I think it's related to this issue.

nonhermitian commented 2 years ago

I personally would expect the variance, or standard deviation, to highlight the fact that increasing shots bounds the precision which is observable via repeated execution of a circuit. For example, the following data (for a different circuit than the one given here) shows this:

aeddins-ibm commented 1 year ago

Has there been further understanding on this in the year since the bug was marked high priority?

As a new runtime user encountering this issue, I feel returning an indicator of the uncertainty of the estimate (standard error or variance) is unambiguously the expected behavior here. To the point that new users will almost always wrongly assume variance is a measure of the uncertainty of the result, not the single-shot variance (or variance of the observable) that is currently returned. This confusion can be mostly avoided by instead returning the SEM (maybe {'std_err': sqrt(variance/num_shots)} instead of {'variance': variance}?) which is hopefully less ambiguous.

SamFerracin commented 2 months ago

@nonhermitian I have investigated this issue and in my opinion, it has been resolved.

Here is the code that I run (same as your original code, but with all the modifications necessary to avoid errors):

import matplotlib.pyplot as plt
from qiskit.circuit import QuantumCircuit
from qiskit.transpiler.preset_passmanagers import generate_preset_pass_manager
from qiskit_ibm_runtime import EstimatorV2 as Estimator, QiskitRuntimeService, Session, Batch
from qiskit.quantum_info import Pauli

service = QiskitRuntimeService()
backend = service.backend("ibm_bangkok")  

N = 4
circuit = QuantumCircuit(N)
circuit.x(N-1)
circuit.h(range(N))
circuit.cx(range(N-1),N-1)
circuit.h(range(N-1))

pm = generate_preset_pass_manager(backend=backend, optimization_level=3)
isa_circuit = pm.run(circuit)
isa_circuit.draw(idle_wires=False)

op = Pauli('IZZZ')
isa_op = op.apply_layout(isa_circuit.layout)

estimator = Estimator(backend)

jobs = []
with Batch(backend=backend) as batch:
    shots = [100, 1000, 2500, 5000, 10000, 25000, 50000, 100000]
    for samples in shots:
        estimator.options.default_shots = samples
        job = estimator.run([(isa_circuit, isa_op)])
        jobs.append(job)

results = [job.result() for job in jobs]

plt.scatter(shots, [r[0].data.stds for r in results])
plt.ylabel("std")
plt.xlabel("shots")

I have attached the figure that is produced.

In the figure, one can see how the std drops as the number of shots start to increase. Past a certain point, the increase is not as clean as one would expect (e.g., shots=50000 yields an std that is larger than at shots=25000). However, this may be due to a number of factors that are outside of the runtime's control, such as statistical fluctuations (the estimated std is itself a random variable), noise drifts in the HW, etc. We have a number of open issues regarding target_precision and default_shots that will allow us to improve our documentation. Our intent is to make sure that when these issues are closed, it will be very clear to the user that the estimator tries to do all it can to ensure the desired accuracy, but factors outside of its control may sometimes cause it to underperform (with respect to precision).

Let me know if you have additional questions or comments, or if I can close this issue.

Screenshot 2024-07-02 at 5 10 42 PM

Qiskit / qiskit-ibm-runtime

Estimator variance does not decrease with number of shots taken #337