[BUG] CUTENSORNET_STATUS_CUDA_ERROR when Setting lightning.tensor as Device

banosae commented 1 month ago

We are currently researching quantum simulation and discovered that PennyLane-Lightning-Tensor supports cutensornet as a backend. After installing lightning-tensor and running the example code provided below, we encountered the CUTENSORNET_STATUS_CUDA_ERROR. We would like to understand the cause of this error and how to resolve it.

Our GPU is an NVIDIA Titan Xp, and our CUDA version is:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Thu_Jun__6_02:18:23_PDT_2024
Cuda compilation tools, release 12.5, V12.5.82
Build cuda_12.5.r12.5/compiler.34385749_0

The output of qml.about() is as follows:

Platform info: Linux-5.15.0-112-generic-x86_64-with-glibc2.35
Python version: 3.9.19
Numpy version: 1.26.4
Scipy version: 1.13.1
Installed devices:
default.clifford (PennyLane-0.37.0)
default.gaussian (PennyLane-0.37.0)
default.mixed (PennyLane-0.37.0)
default.qubit (PennyLane-0.37.0)
default.qubit.autograd (PennyLane-0.37.0)
default.qubit.jax (PennyLane-0.37.0)
default.qubit.legacy (PennyLane-0.37.0)
default.qubit.tf (PennyLane-0.37.0)
default.qubit.torch (PennyLane-0.37.0)
default.qutrit (PennyLane-0.37.0)
default.qutrit.mixed (PennyLane-0.37.0)
default.tensor (PennyLane-0.37.0)
null.qubit (PennyLane-0.37.0)
lightning.qubit (PennyLane-Lightning-0.38.0.dev6)
lightning.tensor (PennyLane-Lightning-Tensor-0.38.0.dev6)

Here is the example code we used:

import pennylane as qml
from pennylane import numpy as np

num_qubits = 8
dev = qml.device("lightning.tensor", wires=num_qubits)

@qml.qnode(dev)
def circuit(num_qubits):
    for qubit in range(0, num_qubits - 1):
        qml.CZ(wires=[qubit, qubit + 1])
        qml.X(wires=[qubit])
        qml.Z(wires=[qubit + 1])
    return qml.expval(qml.Z(0))

print(circuit(num_qubits))

The error log is as follows:

Traceback (most recent call last):
  File "/home/cjkim/qml/ex9.py", line 19, in <module>
    print(circuit(2))
  File "/home/cjkim/miniconda3/envs/Tatis/lib/python3.9/site-packages/pennylane/workflow/qnode.py", line 1164, in __call__
    return self._impl_call(*args, **kwargs)
  File "/home/cjkim/miniconda3/envs/Tatis/lib/python3.9/site-packages/pennylane/workflow/qnode.py", line 1150, in _impl_call
    res = self._execution_component(args, kwargs, override_shots=override_shots)
  File "/home/cjkim/miniconda3/envs/Tatis/lib/python3.9/site-packages/pennylane/workflow/qnode.py", line 1103, in _execution_component
    res = qml.execute(
  File "/home/cjkim/miniconda3/envs/Tatis/lib/python3.9/site-packages/pennylane/workflow/execution.py", line 835, in execute
    results = ml_boundary_execute(tapes, execute_fn, jpc, device=device)
  File "/home/cjkim/miniconda3/envs/Tatis/lib/python3.9/site-packages/pennylane/workflow/interfaces/autograd.py", line 147, in autograd_execute
    return _execute(parameters, tuple(tapes), execute_fn, jpc)
  File "/home/cjkim/miniconda3/envs/Tatis/lib/python3.9/site-packages/autograd/tracer.py", line 48, in f_wrapped
    return f_raw(*args, **kwargs)
  File "/home/cjkim/miniconda3/envs/Tatis/lib/python3.9/site-packages/pennylane/workflow/interfaces/autograd.py", line 168, in _execute
    return execute_fn(tapes)
  File "/home/cjkim/miniconda3/envs/Tatis/lib/python3.9/site-packages/pennylane/workflow/execution.py", line 316, in inner_execute
    results = device_execution(transformed_tapes)
  File "/home/cjkim/miniconda3/envs/Tatis/lib/python3.9/site-packages/pennylane/devices/modifiers/simulator_tracking.py", line 30, in execute
    results = untracked_execute(self, circuits, execution_config)
  File "/home/cjkim/miniconda3/envs/Tatis/lib/python3.9/site-packages/pennylane/devices/modifiers/single_tape_support.py", line 32, in execute
    results = batch_execute(self, circuits, execution_config)
  File "/home/cjkim/pennylane-lightning/pennylane_lightning/lightning_tensor/lightning_tensor.py", line 401, in execute
    results.append(simulate(circuit, self._tensornet()))
  File "/home/cjkim/pennylane-lightning/pennylane_lightning/lightning_tensor/lightning_tensor.py", line 158, in simulate
    tensornet.set_tensor_network(circuit)
  File "/home/cjkim/pennylane-lightning/pennylane_lightning/lightning_tensor/_tensornet.py", line 194, in set_tensor_network
    self._tensornet.appendMPSFinalState(self._cutoff, self._cutoff_mode)
pennylane_lightning.lightning_tensor_ops.LightningException: [/home/cjkim/pennylane-lightning/pennylane_lightning/core/src/simulators/lightning_tensor/tncuda/TNCudaBase.hpp][Line:311][Method:computeState]: Error in PennyLane Lightning: CUTENSORNET_STATUS_CUDA_ERROR

Questions:

Could this error be caused by incompatibilities between our CUDA or cutensornet versions and lightning-tensor? If so, what versions should we be using to ensure compatibility? Alternatively, are there specific parts of our code that need modification?
If the issue is beyond our control, what steps should we take to resolve it?

Thank you for your assistance. We look forward to discussing this issue further with you.

multiphaseCFD commented 1 month ago

Thank you, @banosae, for reporting the issue. Unfortunately, lightning.tensor does not support Nvidia Titan Xp (SM6.1). It is designed to work with Nvidia GPUs SM7.0+ (Volta and later). I encourage you to try lightning.tensor on those GPUs instead. Feel free to reach out to us if you have any questions. I also recommend installing lightning.tensor in a Python virtual environment (venv).
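
One way to check a machine against the SM 7.0 requirement before installing is to compare each GPU's compute capability with that threshold. The following is a minimal sketch and not part of lightning.tensor: the helper names are hypothetical, and `query_gpus` assumes a driver recent enough that `nvidia-smi` supports the `compute_cap` query field.

```python
import subprocess

# Minimum compute capability stated in the reply above (SM 7.0, Volta).
MIN_COMPUTE_CAPABILITY = (7, 0)


def parse_compute_cap(text):
    """Turn a string such as '6.1' into the tuple (6, 1)."""
    major, minor = text.strip().split(".")
    return int(major), int(minor)


def is_supported(compute_cap):
    """Return True if the compute capability string meets the minimum."""
    return parse_compute_cap(compute_cap) >= MIN_COMPUTE_CAPABILITY


def query_gpus():
    """Return (name, compute_cap) pairs as reported by nvidia-smi.

    Assumes nvidia-smi is on PATH and understands the compute_cap field.
    """
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,compute_cap", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    pairs = []
    for line in out.splitlines():
        if line.strip():
            # Split on the last comma so the name itself may contain commas.
            name, cap = line.rsplit(",", 1)
            pairs.append((name.strip(), cap.strip()))
    return pairs


def report():
    """Print one line per GPU saying whether it meets the minimum."""
    for name, cap in query_gpus():
        verdict = "meets" if is_supported(cap) else "is below"
        print(f"{name} (SM {cap}) {verdict} the SM 7.0 minimum")
```

Calling `report()` on the target machine prints one line per GPU. For the hardware mentioned in this thread, `is_supported("6.1")` (Titan Xp) is false, while `is_supported("7.0")` (V100) and `is_supported("8.0")` (A100) are true.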

banosae commented 1 month ago

Thank you for your assistance. We will try using our other GPU device, the NVIDIA A100, and will leave a comment on the results after attempting with it. Additionally, we are curious if there are any limitations when using lightning.tensor in a conda environment instead of a Python virtual environment (venv).

multiphaseCFD commented 1 month ago

Thank you, @banosae, for the update! Please proceed with testing on the NVIDIA A100; we look forward to hearing about the results. All current development has been done in Python venv environments, which is why I recommend them. However, feel free to use Conda environments as well, and please let us know if you need any assistance.

banosae commented 1 month ago

We confirmed lightning.tensor runs correctly on another GPU, the NVIDIA V100. Thanks for your help.

Our team is working with the cuQuantum SDK, and I encountered an issue while trying to run lightning.tensor with cuTensorNet as the backend. It seems that, instead of using our custom-installed cuQuantum code, PennyLane-Lightning calls its own internal tensor network code, which is causing some difficulties.

Could you please advise if there is a way to get personal support or recommend someone who can assist with this specific issue? Alternatively, should we open a new issue to address this? Your continuous support is always appreciated. Thanks a lot!

CatalinaAlbornoz commented 1 month ago

Hi @banosae, thank you for your confirmation! Our team will get back to you with an answer early next week.

multiphaseCFD commented 1 month ago

Thank you, @banosae, for confirming that lightning.tensor runs correctly on the NVIDIA V100 GPU. It's great to hear about the successful integration.

Yes, PennyLane-Lightning calls our internal MPSTNCuda API, which encapsulates cuQuantum's high-level APIs. Regarding the problem with your custom-installed cuQuantum code, I'd be happy to discuss and assist further in a new issue, which would give us more context.

Could you please provide additional details in the new issue:

Specific steps or configurations where the issue with custom-installed cuQuantum arises.
Any error messages or logs that might help diagnose the problem.
Information about the installation path of cuQuantum and how PennyLane-Lightning interfaces with it.
Other relevant information.

This will help us understand whether the issue is related to the installation path of cuQuantum or if there are other factors at play.
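
Some of the requested installation details can be collected with a short script. This is only a sketch: the distribution and module names listed below are assumptions about what is installed, so adjust them to match the output of `pip list` in your environment. The helper names are hypothetical.

```python
import importlib.util
from importlib import metadata

# Assumed names; edit these to match your environment.
DISTRIBUTIONS = ["pennylane-lightning-tensor", "cuquantum-python", "cutensornet-cu12"]
MODULES = ["pennylane_lightning", "cuquantum"]


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def module_location(module_name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    if spec.origin:
        return spec.origin
    # Namespace packages have no single file; show their search paths instead.
    return ", ".join(spec.submodule_search_locations or [])


def environment_report():
    """Build a plain-text summary suitable for pasting into an issue."""
    lines = []
    for name in DISTRIBUTIONS:
        lines.append(f"{name}: {installed_version(name) or 'not installed'}")
    for name in MODULES:
        lines.append(f"{name}: {module_location(name) or 'not found'}")
    return "\n".join(lines)
```

Running `print(environment_report())` inside the affected environment shows which versions are installed and where the modules resolve from, for example whether `cuquantum` points at an editable source checkout.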

Thank you!

banosae commented 1 month ago

Thank you, @CatalinaAlbornoz and @multiphaseCFD, for your helpful responses! It's awesome to get advice from experts like you.

Our team is currently working on optimizing quantum simulations using cuQuantum-python with the cuTensorNet low-level API for tensor contraction. In our team, my role involves using cuTensorNet as the backend while also implementing quantum machine learning (e.g., linear regression) with PennyLane-Lightning. Thanks to your help, I was able to successfully install it.

We are now trying to integrate the optimizations we've made with cuQuantum into our QML work. However, I've encountered an issue where cuQuantum operates at a low level while PennyLane functions at a high level, which seems to create compatibility issues.

In particular, I'm curious about how the high-level abstraction of cuTensorNet in pennylane tensorizes and computes circuits. Additionally, I would like to know if it's possible to use cuTensorNet low-level code within PennyLane.

To explain the cuQuantum installation you asked about: we installed cuQuantum using the method below, and the code we optimized for quantum simulation is based on cuQuantum-python. If you have any questions about the pip list, conda list, or any other details, just let me know and I'll provide them.

wget https://github.com/NVIDIA/cuQuantum/archive/refs/tags/v24.03.0.tar.gz
tar -xf v24.03.0.tar.gz
cd cuQuantum-24.03.0/python
pip install -e .
cd ../benchmark
pip install -e .

Thank you!

multiphaseCFD commented 1 month ago

Thanks @banosae for providing us more information!

> We are now trying to integrate the optimizations we've made with cuQuantum into our QML work. However, I've encountered an issue where cuQuantum operates at a low level while PennyLane functions at a high level, which seems to create compatibility issues.

Agreed, lightning.tensor calls our internal MPSTNCuda API, which encapsulates cuQuantum C++ High-Level APIs instead of the cuQuantum-Python APIs. I think that's the source of incompatibility.

> In particular, I'm curious about how the high-level abstraction of cuTensorNet in pennylane tensorizes and computes circuits. Additionally, I would like to know if it's possible to use cuTensorNet low-level code within PennyLane.

As of the latest cuTensorNet release, the high-level tensor network API functions centered around cutensornetState_t allow users to define complex tensor network states by gradually applying tensor operators (e.g., quantum gates) to the initial (vacuum) state residing in the user-defined direct-product space. In other words, a cutensornetState_t is initialized to the zero state by default in the compute graph, and users then append quantum gates to the graph through the tensor-operator APIs. Once the graph is constructed, users can perform measurements. That is how the high-level API works. For more information, please refer to the NVIDIA documentation.
As far as I know, PennyLane only supports the cuQuantum C++ high-level API via lightning.tensor and does not support the low-level APIs for now.
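
The build-then-measure workflow described above can be illustrated with a toy model. This is a pure-Python sketch with hypothetical names (`LazyState`, `apply_gate`, `expval_z`); it is not the cuTensorNet or lightning.tensor API, and it uses a dense state vector rather than a tensor network. It only mirrors the order of operations: start from the zero state, record gates, and compute when a measurement is requested.

```python
import math


class LazyState:
    """Toy stand-in for a state that records gates and evaluates on demand."""

    def __init__(self, num_qubits):
        self.num_qubits = num_qubits
        self.ops = []  # recorded (matrix, target) pairs: the "compute graph"

    def apply_gate(self, matrix, target):
        """Append a single-qubit gate to the graph; nothing is computed yet."""
        self.ops.append((matrix, target))

    def _bit(self, target):
        # Qubit 0 is the most significant bit of the basis-state index.
        return 1 << (self.num_qubits - 1 - target)

    def _contract(self):
        """Start from |00...0> and apply every recorded gate in order."""
        amps = [0j] * (2 ** self.num_qubits)
        amps[0] = 1 + 0j
        for matrix, target in self.ops:
            bit = self._bit(target)
            new = [0j] * len(amps)
            for index, amp in enumerate(amps):
                col = 1 if index & bit else 0
                cleared = index & ~bit
                new[cleared] += matrix[0][col] * amp
                new[cleared | bit] += matrix[1][col] * amp
            amps = new
        return amps

    def expval_z(self, target):
        """Expectation value of Pauli-Z on one qubit, evaluated on request."""
        bit = self._bit(target)
        return sum(
            (-1 if index & bit else 1) * abs(amp) ** 2
            for index, amp in enumerate(self._contract())
        )


S = 1 / math.sqrt(2)
PAULI_X = [[0, 1], [1, 0]]
HADAMARD = [[S, S], [S, -S]]

state = LazyState(2)              # zero state by default
state.apply_gate(PAULI_X, 0)      # append gates to the graph
state.apply_gate(HADAMARD, 1)
z0 = state.expval_z(0)            # measure once the graph is built
z1 = state.expval_z(1)
```

Here `z0` comes out close to -1 (qubit 0 was flipped) and `z1` close to 0 (qubit 1 is in an equal superposition), up to floating-point rounding.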

Please note that lightning.tensor is under active development; please feel free to let us know your needs, suggestions, or any issues you encounter as we work towards enhancing its functionality and stability. Your feedback is valuable in shaping the future of this library.

banosae commented 1 month ago

Thank you, @multiphaseCFD, for your response. Apologies for the delayed reply. Your answer helped me explore other ways. It would be great to discuss further if there's an opportunity in the future. Thank you very much.

PennyLaneAI / pennylane-lightning