NVIDIA / cuQuantum

Home for cuQuantum Python & NVIDIA cuQuantum SDK C++ samples
https://docs.nvidia.com/cuda/cuquantum/
BSD 3-Clause "New" or "Revised" License
320 stars 63 forks source link

Wrong sign in a single-gate-circuit statevector? #123

Closed yapolyak closed 3 months ago

yapolyak commented 3 months ago

Hi,

I stumbled upon a strange case of State object returning a seemingly wrong statevector on compute, for a two-qubit circuit with a single (single-qubit) gate. The gate in question is a parameterised Ry gate (such as described here, and I am anyway providing a specific unitary for it, so the nature of the gate should not matter.

First, I calculate statevector manually as follows:

import cupy as cp
import numpy as np

gate_ry = np.asarray([[ 0.33873792+0.j, -0.94088077+0.j],
                      [ 0.94088077+0.j,  0.33873792+0.j]], dtype='complex128')
I = np.array([[1. +0.j, 0. +0.j],[0. +0.j, 1. +0.j]])
mat = np.kron(gate_ry, I)  # So I put the Ry gate on the first qubit
sv_ini = np.asarray([1, 0, 0, 0])
mat @ sv_ini

and get the following result:

array([0.33873792+0.j, 0.        +0.j, 0.94088077+0.j, 0.        +0.j])

(I get the same result if I use cupy instead of numpy, and also with the pytket package).

Now, I try running the following script (as per your example, also discussed in another discussion previously with you):

import cupy as cp
import numpy as np

import cuquantum
from cuquantum import cutensornet as cutn

dev = cp.cuda.Device()  # get current device
props = cp.cuda.runtime.getDeviceProperties(dev.id)

num_qubits = 2
dim = 2
qubits_dims = (dim, ) * num_qubits

handle = cutn.create()
stream = cp.cuda.Stream()
data_type = cuquantum.cudaDataType.CUDA_C_64F

gate_ry = cp.asarray([[ 0.33873792+0.j, -0.94088077+0.j],
                      [ 0.94088077+0.j,  0.33873792+0.j]], dtype='complex128', order='F')
gate_ry_strides = 0

free_mem = dev.mem_info[0]
scratch_size = free_mem // 2
scratch_space = cp.cuda.alloc(scratch_size)

# Create the initial quantum state
quantum_state = cutn.create_state(handle, cutn.StatePurity.PURE, num_qubits, qubits_dims, data_type)
print("Created the initial quantum state")

# Construct the quantum circuit state with gate application
tensor_id = cutn.state_apply_tensor(  # here I am also applying the gate to the first (zeroth) qubit
        handle, quantum_state, 1, (0, ), 
        gate_ry.data.ptr, gate_ry_strides, 1, 0, 1)

# Configure the quantum circuit expectation value computation
num_hyper_samples_dtype = cutn.state_get_attribute_dtype(cutn.ExpectationAttribute.OPT_NUM_HYPER_SAMPLES)
num_hyper_samples = np.asarray(8, dtype=num_hyper_samples_dtype)
cutn.state_configure(handle, quantum_state, 
cutn.StateAttribute.NUM_HYPER_SAMPLES, 
num_hyper_samples.ctypes.data, num_hyper_samples.dtype.itemsize)

# Prepare the computation of the specified quantum circuit expectation value
work_desc = cutn.create_workspace_descriptor(handle)
cutn.state_prepare(handle, quantum_state, scratch_size, work_desc, stream.ptr)
print("Prepare the computation of the specified quantum circuit expectation value")

workspace_size_d = cutn.workspace_get_memory_size(handle, 
    work_desc, cutn.WorksizePref.RECOMMENDED, cutn.Memspace.DEVICE, cutn.WorkspaceKind.SCRATCH)

if workspace_size_d <= scratch_size:
    cutn.workspace_set_memory(handle, work_desc, cutn.Memspace.DEVICE, cutn.WorkspaceKind.SCRATCH, scratch_space.ptr, workspace_size_d)
else:
    print("Error:Insufficient workspace size on Device")
    cutn.destroy_workspace_descriptor(work_desc)
    cutn.destroy_state(quantum_state)
    cutn.destroy(handle)
    del scratch
    print("Free resource and exit.")

sv = cp.empty((2,) * 2, dtype="complex128", order="F")
statevec = cutn.state_compute(
            handle,
            quantum_state,
            work_desc,
            (sv.data.ptr, ), # note this should be a sequence and this sequence is on host
            stream.ptr,
)

sv.flatten()

...  # destroy stuff here

and I am getting the following vector:

array([ 0.33873792+0.j,  0.        +0.j, -0.94088077+0.j,  0.        +0.j])

As you can see, the third element sign is different to the reference result. It puzzles me a lot, because the same script (and a more structured API interface that I've done) give correct results for other types of gates.

Am I missing something, or is there a strange bug in State?

Many thanks! Iakov

yapolyak commented 3 months ago

Oh I've just managed to reproduce the result the State.compute() is giving me! If I transpose the unitary, I get the same result! But why is this the case, and should I transpose every unitary that goes into State?

yapolyak commented 3 months ago

I now recall that you do transpose Pauli unitaries in your example, and have been ignoring that... This probably is due to necessity of using 'F'-style ordering, right?

yapolyak commented 3 months ago

Another update - if I try and provide gate_ry strides explicitly (by using cupy.ndarray.strides() (accounted for the data type byte length) to cutn.state_apply_tensor(), it gives me a wrong result anyway, unless I transpose the array.

However, if I use order='C', don't transpose and Don't provide strides (pass 0) I get the right result. I think I understand why this would work (order='C' is doing the same thing as transposing here), but why if I provide explicit strides it fails? Specifically, I calculate the strides as follows:

gate_ry = cp.asarray([[ 0.33873792+0.j, -0.94088077+0.j],
                      [ 0.94088077+0.j,  0.33873792+0.j]]).astype(dtype='complex128', order='C')
gate_ry_strides = tuple(int(stride/16) for stride in gate_ry.strides)