BatchExperiment of QV Experiments with Too Many Trials Produces LinAlgError on Some Devices

qiskit-community / qiskit-experiments

Qiskit Experiments

https://qiskit-community.github.io/qiskit-experiments/

Apache License 2.0

161 stars 126 forks source link

BatchExperiment of QV Experiments with Too Many Trials Produces LinAlgError on Some Devices #846

Open albertzhu01 opened 2 years ago

albertzhu01 commented 2 years ago

Informations

Qiskit Experiments version: 0.4
Python version: 3.8
Operating system: Mac

What is the current behavior?

If QV experiments with too many trials (example below shows 800 trials run on IBMQ Toronto) are run in a BatchExperiment, the following error occurs:

File ~/opt/anaconda3/envs/pygsti2/lib/python3.8/site-packages/scipy/linalg/_decomp.py:1356, in _check_info(info, driver, positive)
   1353     raise ValueError('illegal value in argument %d of internal %s'
   1354                      % (-info, driver))
   1355 if info > 0 and positive:
-> 1356     raise LinAlgError(("%s " + positive) % (driver, info,))

LinAlgError: eig algorithm (geev) did not converge (only eigenvalues with order >= 2 have converged)

This varies from device to device.

Steps to reproduce the problem

from qiskit_experiments.framework import BatchExperiment
from qiskit_experiments.library import QuantumVolume
import qiskit

# Create QV experiments
seed = 123
qubits_list = [
    [0, 1, 2, 3], [4, 6, 7, 10], [5, 8, 9, 11], [12, 13, 14, 15, 16], [17, 18, 21, 23, 24], [19, 20, 22, 25, 26]
]
QV_exps = [
    QuantumVolume(
        qubits, 
        seed=seed
    )
    for qubits in qubits_list
]

# Set backend and transpile options
qiskit.IBMQ.load_account()
provider = qiskit.IBMQ.get_provider(hub="ibm-q-internal", group="deployed", project="default")
backend = provider.get_backend('ibmq_toronto')
transpile_options = {"basis_gates": backend.configuration().basis_gates, "optimization_level": 3}
for QV_exp in QV_exps:
    QV_exp.set_transpile_options(**transpile_options)
    QV_exp.set_experiment_options(trials=800)

# Run batch experment of QV experiments
QV_batch_exp = BatchExperiment(QV_exps)
QV_batch_data = QV_batch_exp.run(backend).block_for_results()

What is the expected behavior?

A job should be successfully created and run

Suggested solutions

albertzhu01 commented 2 years ago

Running the QV experiments individually, the error occurs consistently for qubit group [12, 13, 14, 15, 16] but not any other groups in the code above. The error also occurs on other devices for the exact same qubits (Hanoi).

levbishop commented 2 years ago

Can you try to narrow it down to a specific SU(4) that triggers the error? Presumably the randomization occasionally draws a particular SU(4) that fails in TwoQubitBasisDecomposer.num_basis_gates()

albertzhu01 commented 2 years ago

By changing the seed around, I've been able agle to trigger the error with the following 2 SU(4)s:

Instruction(name='unitary', num_qubits=2, num_clbits=0, params=[array([
       [-0.02645613-0.07057715j,  0.        +0.j        , -0.16789715-0.98291886j,  0.        +0.j        ],
       [-0.16789715+0.98291886j,  0.        +0.j        ,  0.02645613-0.07057715j,  0.        +0.j        ],
       [ 0.        +0.j        , -0.02645613-0.07057715j,  0.        +0.j        , -0.16789715-0.98291886j],
       [ 0.        +0.j        , -0.16789715+0.98291886j,  0.        +0.j        ,  0.02645613-0.07057715j]])])

and

Instruction(name='unitary', num_qubits=2, num_clbits=0, params=[array([
       [ 0.35857659-0.1328184j ,  0.        +0.j        ,  0.91071947-0.15611584j,  0.        +0.j        ],
       [ 0.91071947+0.15611584j,  0.        +0.j        , -0.35857659-0.1328184j ,  0.        +0.j        ],
       [ 0.        +0.j        ,  0.35857659-0.1328184j ,  0.        +0.j        ,  0.91071947-0.15611584j],
       [ 0.        +0.j        ,  0.91071947+0.15611584j,  0.        +0.j        , -0.35857659-0.1328184j ]])])

However, I've seen the first SU(4) above get generated in an experiment with a different seed and the error did not get triggered

levbishop commented 2 years ago

THanks @albertzhu01 when I try to use those SU(4) I don't have any problem. I'm not sure if this is something about my setup vs yours, or if its an issue with the rounding of the array for the printout above. The way I tested was with:

qiskit.quantum_info.two_qubit_cnot_decompose.num_basis_gates(np.array([
       [-0.02645613-0.07057715j,  0.        +0.j        , -0.16789715-0.98291886j,  0.        +0.j        ],
       [-0.16789715+0.98291886j,  0.        +0.j        ,  0.02645613-0.07057715j,  0.        +0.j        ],
       [ 0.        +0.j        , -0.02645613-0.07057715j,  0.        +0.j        , -0.16789715-0.98291886j],
       [ 0.        +0.j        , -0.16789715+0.98291886j,  0.        +0.j        ,  0.02645613-0.07057715j]])

Can you try it and see if it fails for you? And if not maybe you can dump the failing matrices with enough digits of precision so we can load recreate them bitwise-identical.

albertzhu01 commented 2 years ago

The above works for me (I get num_basis_gates = 3), and for some reason when I run QV with the exact same seed and parameters again, the above array does not produce the error. Instead, I get the error with the following SU(4):

[[ 0.236623607031252+0.96684108969098j ,  0.               +0.j               ,  0.034648743991892-0.089593752128496j,  0.               +0.j               ],
 [ 0.034648743991892+0.089593752128496j,  0.               +0.j               , -0.236623607031252+0.96684108969098j ,  0.               +0.j               ],
 [ 0.               +0.j               ,  0.236623607031252+0.96684108969098j ,  0.               +0.j               ,  0.034648743991892-0.089593752128496j],
 [ 0.               +0.j               ,  0.034648743991892+0.089593752128496j,  0.               +0.j               , -0.236623607031252+0.96684108969098j ]]

(This has 15 digits of precision)

However, if I just run qiskit.quantum_info.two_qubit_cnot_decompose.num_basis_gates on that array, I don't get any errors. Could there be some other randomness happening that just causes the eigenvalue algorithm to not converge sometimes?

For the second SU(4) in my comment above, I'm able to consistently get the error, but similarly it does not result in an error if I just run qiskit.quantum_info.two_qubit_cnot_decompose.num_basis_gates. Here's that SU(4) with 15 digits of precision:

[[ 0.358576594639192-0.132818396029565j,  0.               +0.j               ,  0.910719465407508-0.156115837700592j,  0.               +0.j               ],
 [ 0.910719465407508+0.156115837700592j,  0.               +0.j               , -0.358576594639192-0.132818396029565j,  0.               +0.j               ],
 [ 0.               +0.j               ,  0.358576594639192-0.132818396029565j,  0.               +0.j               ,  0.910719465407508-0.156115837700592j],
 [ 0.               +0.j               ,  0.910719465407508+0.156115837700592j,  0.               +0.j               , -0.358576594639192-0.132818396029565j]]

albertzhu01 commented 2 years ago

Update: The following lines give the error when I run them on my computer, but Helena does not get the error. We both have the same qiskit-terra version, so we're not sure what could be the discrepancy (besides potential differences in the versions of other packages).

from qiskit.transpiler.passes.synthesis import unitary_synthesis
decomposer = unitary_synthesis._basis_gates_to_decomposer_2q(['id', 'rz', 'sx', 'x', 'cx', 'reset'])
decomposer.num_basis_gates([[(0.23662360703125174+0.9668410896909799j), 0j, (0.03464874399189194-0.08959375212849645j), 0j], [(0.03464874399189194+0.08959375212849645j), 0j, (-0.23662360703125174+0.9668410896909799j), 0j], [0j, (0.23662360703125174+0.9668410896909799j), 0j, (0.03464874399189194-0.08959375212849645j)], [0j, (0.03464874399189194+0.08959375212849645j), 0j, (-0.23662360703125174+0.9668410896909799j)]])

levbishop commented 2 years ago

The above lines do not error for me. I note that all the failing cases are SWAP.(U()*I) which is a high-symmetry corner of the Weyl chamber. The reason for all the complexities of the TwoQubitWeylDecomposition stuff is exactly because of pathologies around such high-symmetry points, so maybe this is yet another example of such.

It would be good to nail down which difference between @albertzhu01 setup and my/Helena's is sufficient to trigger the error.

levbishop commented 2 years ago

You might try something like the following to see how common these failures can be for you

for _ in range(100_000):
    TwoQubitWeylDecomposition(np.kron(random_unitary(2), random_unitary(2)) @ swap)

on my setup I've tested millions of cases and they all pass.

The terra testsuite exercises a bunch of the pathological cases. It would be interesting to see if the testsuite passes on your setup. Something like python -m unittest test.python.quantum_info.test_synthesis -cb should run the relevant pieces

albertzhu01 commented 2 years ago

The following test case reliably reproduces the error:

import numpy as np
from scipy import linalg as la
mat = np.array(
    [
        [0.9999999999999998j, 0j, -3.469446951953614e-18j, 0j], 
        [0j, 0.9999999999999998j, (8.673617379884035e-19+2.6020852139652106e-18j), -6.938893903907228e-18j], 
        [-3.469446951953614e-18j, (8.673617379884035e-19+2.6020852139652106e-18j), 0.9999999999999998j, 0j], 
        [0j, -6.938893903907228e-18j, 0j, 0.9999999999999998j]
    ]
)
la.eigvals(mat)

Output:

LinAlgError                               Traceback (most recent call last)
Input In [17], in <cell line: 9>()
      1 from scipy import linalg as la
      2 mat = np.array(
      3     [
      4         [0.9999999999999998j, 0j, -3.469446951953614e-18j, 0j], 
   (...)
      7         [0j, -6.938893903907228e-18j, 0j, 0.9999999999999998j]]
      8 )
----> 9 la.eigvals(mat)

File ~/opt/anaconda3/envs/pygsti2/lib/python3.8/site-packages/scipy/linalg/_decomp.py:880, in eigvals(a, b, overwrite_a, check_finite, homogeneous_eigvals)
    811 def eigvals(a, b=None, overwrite_a=False, check_finite=True,
    812             homogeneous_eigvals=False):
    813     """
    814     Compute eigenvalues from an ordinary or generalized eigenvalue problem.
    815 
   (...)
    878 
    879     """
--> 880     return eig(a, b=b, left=0, right=0, overwrite_a=overwrite_a,
    881                check_finite=check_finite,
    882                homogeneous_eigvals=homogeneous_eigvals)

File ~/opt/anaconda3/envs/pygsti2/lib/python3.8/site-packages/scipy/linalg/_decomp.py:247, in eig(a, b, left, right, overwrite_a, overwrite_b, check_finite, homogeneous_eigvals)
    244     w = wr + _I * wi
    245     w = _make_eigvals(w, None, homogeneous_eigvals)
--> 247 _check_info(info, 'eig algorithm (geev)',
    248             positive='did not converge (only eigenvalues '
    249                      'with order >= %d have converged)')
    251 only_real = numpy.all(w.imag == 0.0)
    252 if not (geev.typecode in 'cz' or only_real):

File ~/opt/anaconda3/envs/pygsti2/lib/python3.8/site-packages/scipy/linalg/_decomp.py:1356, in _check_info(info, driver, positive)
   1353     raise ValueError('illegal value in argument %d of internal %s'
   1354                      % (-info, driver))
   1355 if info > 0 and positive:
-> 1356     raise LinAlgError(("%s " + positive) % (driver, info,))

LinAlgError: eig algorithm (geev) did not converge (only eigenvalues with order >= 2 have converged)

jlapeyre commented 2 years ago

@albertzhu01 , regarding your last example, where you call la.eigvals(mat): This also fails for me. But your examples calling decomposer.num_basis_gates do not fail on my machine.

How did you find mat in this last example? Did it arise when calling decomposer.num_basis_gates? If so, what was the input there?

In the stack trace, where do you see the call to eigvals? Is it always here?:

    Up = transform_to_magic_basis(U, reverse=True)
    # We only need the eigenvalues of `M2 = Up.T @ Up` here, not the full diagonalization.
    D = la.eigvals(Up.T @ Up)

albertzhu01 commented 2 years ago

Yes, mat arises for me when I call decompose.num_basis_gates on the following example:

[[(0.23662360703125174+0.9668410896909799j), 0j, (0.03464874399189194-0.08959375212849645j), 0j], 
 [(0.03464874399189194+0.08959375212849645j), 0j, (-0.23662360703125174+0.9668410896909799j), 0j], 
 [0j, (0.23662360703125174+0.9668410896909799j), 0j, (0.03464874399189194-0.08959375212849645j)], 
 [0j, (0.03464874399189194+0.08959375212849645j), 0j, (-0.23662360703125174+0.9668410896909799j)]]

And yes, that's where I always see the call to eigvals.

jlapeyre commented 2 years ago

It's still not working for me. This is what I see:

Just, to be clear, This example also fails on my machine with python and openblas. I reports the same error that you report.

I can run your most recent example and get the expected result, 3:

import qiskit
import numpy as np

unitary_bad = np.array([[(0.23662360703125174+0.9668410896909799j), 0j, (0.03464874399189194-0.08959375212849645j), 0j],
 [(0.03464874399189194+0.08959375212849645j), 0j, (-0.23662360703125174+0.9668410896909799j), 0j],
 [0j, (0.23662360703125174+0.9668410896909799j), 0j, (0.03464874399189194-0.08959375212849645j)],
 [0j, (0.03464874399189194+0.08959375212849645j), 0j, (-0.23662360703125174+0.9668410896909799j)]])

n_basis_gates = qiskit.quantum_info.two_qubit_cnot_decompose.num_basis_gates(unitary_bad)
print(n_basis_gates)

So the call to eigvals has not raised an error. What is the matrix that is passed to eigvals? It looks like the following code will give it to me.

from scipy import linalg as la
import qiskit
import numpy as np
from qiskit.quantum_info.synthesis.weyl import transform_to_magic_basis

unitary_bad = np.array([[(0.23662360703125174+0.9668410896909799j), 0j, (0.03464874399189194-0.08959375212849645j), 0j],
 [(0.03464874399189194+0.08959375212849645j), 0j, (-0.23662360703125174+0.9668410896909799j), 0j],
 [0j, (0.23662360703125174+0.9668410896909799j), 0j, (0.03464874399189194-0.08959375212849645j)],
 [0j, (0.03464874399189194+0.08959375212849645j), 0j, (-0.23662360703125174+0.9668410896909799j)]])

def prepare_for_eigvals(U):
    U = U / la.det(U) ** (0.25)
    Up = transform_to_magic_basis(U, reverse=True)
    return Up.T @ Up

result = prepare_for_eigvals(unitary_bad)
print(result)

But, the result printed is

[[-1.60267025e-16+1.00000000e+00j -9.37283027e-18-1.36181701e-17j
   2.01085786e-18+1.50590401e-18j  1.11730739e-18-1.02267756e-18j]
 [-9.37283027e-18-1.36181701e-17j -1.96188399e-16+1.00000000e+00j
   1.39257162e-18+1.70219515e-18j  1.60107009e-19-9.09467378e-19j]
 [ 2.01085786e-18+1.50590401e-18j  1.39257162e-18+1.70219515e-18j
  -1.57938885e-16+1.00000000e+00j  9.45852359e-18+1.35987928e-17j]
 [ 1.11730739e-18-1.02267756e-18j  1.60107009e-19-9.09467378e-19j
   9.45852359e-18+1.35987928e-17j -1.94216235e-16+1.00000000e+00j]]

Which is different from the bad matrix.

In fact la.eigvals(result) gives

array([-1.60267025e-16+1.j, -2.22044605e-16+1.j, -2.22044605e-16+1.j,
       -1.94289029e-16+1.j])

So, I'm still unable to get a failing example. We need to find one in order to fix the problem. And we need to put it in the test suite to prevent a regression in the future.

EDIT: I wrote "fix the problem". But, it may be more than one problem, as well.