Operating system: Windows 10 Enterprise Version 21H2
What is happening?
The bug is in the evaluate routine of the Qiskit QuantumKernel class. It shows up when the quantum kernel is used for a regression task (SVR, ...) on any backend other than the statevector simulator.
It only occurs when a non-symmetric kernel matrix is computed.
When a training point and a test point are identical, that pair is never added to the to_be_computed_data_pair list. Its kernel entry therefore stays at 0. instead of 1. (the kernel value of two identical points), which leads to a wrong regression result.
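The effect can be illustrated without a quantum backend. The following is a minimal sketch, not the library code: it assumes the kernel matrix starts out as zeros (consistent with the 0. entries observed) and uses a Gaussian function as a stand-in for the state overlap. The names collect_pairs and toy_kernel are hypothetical.

```python
import numpy as np

def collect_pairs(x_vec, y_vec, skip_identical):
    """Return the (i, j) index pairs that would be sent to the backend."""
    pairs = []
    for i, x_i in enumerate(x_vec):
        for j, y_j in enumerate(y_vec):
            if skip_identical and np.all(x_i == y_j):
                continue  # identical pair is never evaluated
            pairs.append((i, j))
    return pairs

def toy_kernel(x_vec, y_vec, skip_identical):
    """Fill a zero-initialised matrix only for the collected pairs.

    The Gaussian below is a stand-in for the overlap estimated on the backend.
    """
    kernel = np.zeros((len(x_vec), len(y_vec)))
    for i, j in collect_pairs(x_vec, y_vec, skip_identical):
        kernel[i, j] = np.exp(-np.sum((x_vec[i] - y_vec[j]) ** 2))
    return kernel

x_test = np.array([[0.0], [0.5], [1.0]])
x_train = np.array([[0.5], [2.0]])  # x_train[0] is identical to x_test[1]

buggy = toy_kernel(x_test, x_train, skip_identical=True)
fixed = toy_kernel(x_test, x_train, skip_identical=False)
print(buggy[1, 0], fixed[1, 0])  # 0.0 1.0
```

Only the entry belonging to the identical pair differs between the two matrices; all other entries are unaffected by the check.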
How can we reproduce the issue?
Define a simple target function for the regression and evaluate the quantum kernel on a shot-based backend (here the qasm simulator).
# General Imports
import numpy as np
# Scikit Imports
from sklearn.preprocessing import StandardScaler
# Qiskit Imports
from qiskit import Aer
from qiskit.circuit import QuantumCircuit, ParameterVector
from qiskit_machine_learning.kernels import QuantumKernel
# backends
statevec = Aer.get_backend('statevector_simulator')
qasm = Aer.get_backend('qasm_simulator')
# Function to regress: in this example x*sin(x)
X = np.linspace(start=0, stop=10, num=1_00).reshape(-1, 1)
y = np.squeeze( X*np.sin(X) )
rng = np.random.RandomState(10)
training_indices = rng.choice(np.arange(y.size), size=20, replace=False)
training_indices = np.sort(training_indices)
X_train, y_train = X[training_indices], y[training_indices]
print(training_indices)
# all 20 training points are taken from X, so each of them coincides with a test point
# Scaling of input feature
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
y_scaled = scaler.fit_transform(y.reshape(-1, 1))
X_train = X_scaled[training_indices]
y_train = y_scaled[training_indices]
# artificial feature expansion
# stack the training and test points to the number of qubits used in the encoding
XX = np.column_stack((X_scaled,X_scaled))
X4_test = np.column_stack((XX,XX))
XX_train = np.column_stack((X_train,X_train))
X4_train = np.column_stack((XX_train,XX_train))
# define data encoding (a custom one is used in this case):
#custom parameterized quantum circuit
def PQC(qubits, layers, c):
    x = ParameterVector('x', length=qubits)
    theta = ParameterVector('θ', length=2*qubits)
    theta_range = np.linspace(start=0, stop=2*np.pi, num=2*qubits)
    rand_idx = np.random.choice(np.arange(theta_range.size), size=2*qubits, replace=False)
    theta_sample = theta_range[rand_idx]
    var_custom = QuantumCircuit(qubits)
    counter = 0
    for j in range(layers):
        for i in range(qubits):
            if i != 0:
                counter += 1
            var_custom.ry(c*x[i] + theta[i+counter], i)
            var_custom.rz(c*x[i] + theta[i+1+counter], i)
            if i == qubits-1:
                counter = 0
        if (j % 2) == 0:
            for i in range(qubits-1):
                if (i % 2) == 0:
                    var_custom.cx(i, i+1)
        else:
            for i in range(qubits-1):
                if (i % 2) == 1:
                    var_custom.cx(i, i+1)
    var_custom = var_custom.bind_parameters({theta: theta_sample})
    return var_custom
# use a feature map with 4 qubits to build the quantum kernel
# (two kernels: one on the statevector simulator and one on the qasm simulator, to compare the entries and show the error):
pqc_map = PQC(qubits=4,layers=2,c=1.0)
quantum_kernel_statevec = QuantumKernel(feature_map=pqc_map, quantum_instance=statevec)
quantum_kernel_qasm = QuantumKernel(feature_map=pqc_map, quantum_instance=qasm)
# evaluate the non-symmetric test-training kernel matrix on both backends and compare the entries:
# the qasm kernel contains 0. where the statevector kernel has 1.
K_test_train_statevec = quantum_kernel_statevec.evaluate(x_vec=X4_test, y_vec=X4_train)
K_test_train_qasm = quantum_kernel_qasm.evaluate(x_vec=X4_test, y_vec=X4_train)
print(K_test_train_statevec)
print(K_test_train_qasm)
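Comparing two 100 x 20 matrices by eye is tedious, so a small helper can list the affected entries. This is a sketch with a hypothetical function name (find_dropped_entries) and toy matrices in place of the real kernels; it flags entries that are 1 in the reference matrix but exactly 0 in the sampled one.

```python
import numpy as np

def find_dropped_entries(k_reference, k_sampled, atol=1e-8):
    """Return (row, col) indices where the reference is ~1 but the sampled kernel is exactly 0."""
    mask = np.isclose(k_reference, 1.0, atol=atol) & (k_sampled == 0.0)
    return np.argwhere(mask)

# Toy matrices standing in for the statevector and qasm kernels.
k_ref = np.array([[1.0, 0.30], [0.20, 1.0], [0.70, 0.10]])
k_smp = np.array([[0.0, 0.31], [0.19, 0.0], [0.72, 0.10]])
dropped = find_dropped_entries(k_ref, k_smp)
print(dropped.tolist())  # [[0, 0], [1, 1]]
```

With the reproduction script above, the call would be find_dropped_entries(K_test_train_statevec, K_test_train_qasm).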
What should happen?
The QuantumKernel class contains a line that excludes every data pair whose two points have the same value. For non-symmetric kernel matrices this leaves the corresponding entries at 0.
These entries have to be 1. (or lie within a small tolerance around 1. if a noisy backend is used).
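This expectation can be checked directly against the data, without a statevector reference. The sketch below uses hypothetical helper names, and the tolerance of 0.05 is an arbitrary assumption that would need to be adapted to the shot count and noise level of the backend.

```python
import numpy as np

def identical_pair_entries(kernel, x_vec, y_vec):
    """Return the kernel entries whose row point and column point are exactly equal."""
    # Broadcast so that every row of x_vec is compared with every row of y_vec.
    same = np.all(x_vec[:, None, :] == y_vec[None, :, :], axis=-1)
    return kernel[same]

def identical_pairs_ok(kernel, x_vec, y_vec, atol=0.05):
    """True if all entries for identical points lie within atol of 1."""
    entries = identical_pair_entries(kernel, x_vec, y_vec)
    return bool(np.allclose(entries, 1.0, rtol=0.0, atol=atol))

x_vec = np.array([[0.0, 0.0], [1.0, 1.0]])
y_vec = np.array([[1.0, 1.0], [2.0, 2.0]])  # y_vec[0] is identical to x_vec[1]

kernel_good = np.array([[0.2, 0.1], [0.98, 0.3]])
kernel_bad = np.array([[0.2, 0.1], [0.0, 0.3]])
print(identical_pairs_ok(kernel_good, x_vec, y_vec))  # True
print(identical_pairs_ok(kernel_bad, x_vec, y_vec))   # False
```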
The following code is copied from the evaluate function of the QuantumKernel class, with a comment above the line where (in my opinion) the bug lies:
else:  # not using state vector simulator
    feature_map_params_x = ParameterVector("par_x", self._feature_map.num_parameters)
    feature_map_params_y = ParameterVector("par_y", self._feature_map.num_parameters)
    parameterized_circuit = self.construct_circuit(
        feature_map_params_x,
        feature_map_params_y,
        measurement=measurement,
        is_statevector_sim=is_statevector_sim,
    )
    parameterized_circuit = self._quantum_instance.transpile(
        parameterized_circuit, pass_manager=self._quantum_instance.unbound_pass_manager
    )[0]
    for idx in range(0, len(mus), self._batch_size):
        to_be_computed_data_pair = []
        to_be_computed_index = []
        for sub_idx in range(idx, min(idx + self._batch_size, len(mus))):
            i = mus[sub_idx]
            j = nus[sub_idx]
            x_i = x_vec[i]
            y_j = y_vec[j]
            ######### here occurs the little bug for non-symmetric matrices #####
            if not np.all(x_i == y_j):
                to_be_computed_data_pair.append((x_i, y_j))
                to_be_computed_index.append((i, j))
        circuits = [
            parameterized_circuit.assign_parameters(
                {feature_map_params_x: x, feature_map_params_y: y}
            )
            for x, y in to_be_computed_data_pair
        ]
Any suggestions?
Remove the check if not np.all(x_i == y_j) in the evaluate function shown above, so that the two append calls below it always run.
All index pairs of a non-symmetric matrix are then included in the inner product computation, and the regression method of choice produces valid results with a quantum kernel.
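As a rough illustration of the proposed change, the inner loop can be written as a standalone function without the equality check. This is a sketch, not a patch: collect_data_pairs is a hypothetical name, and building mus and nus from all row/column combinations is an assumption about the non-symmetric case.

```python
import numpy as np

def collect_data_pairs(x_vec, y_vec, mus, nus, batch_start, batch_size):
    """Hypothetical standalone version of the inner loop, without the equality check."""
    data_pairs, index_pairs = [], []
    for sub_idx in range(batch_start, min(batch_start + batch_size, len(mus))):
        i, j = int(mus[sub_idx]), int(nus[sub_idx])
        # Every pair is kept, including pairs of identical points.
        data_pairs.append((x_vec[i], y_vec[j]))
        index_pairs.append((i, j))
    return data_pairs, index_pairs

x_vec = np.array([[0.0, 0.0], [0.5, 0.5]])
y_vec = np.array([[0.5, 0.5]])  # identical to x_vec[1]

# All (row, column) combinations of the rectangular matrix (assumed index layout).
mus, nus = np.indices((len(x_vec), len(y_vec))).reshape(2, -1)
data_pairs, index_pairs = collect_data_pairs(x_vec, y_vec, mus, nus, 0, 10)
print(index_pairs)  # [(0, 0), (1, 0)]
```

The pair (1, 0), whose two points are identical, is now part of the batch and would be evaluated on the backend like any other pair.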