Accelerate MPS simulator by using cuQuantum

Qiskit / qiskit-aer

Aer is a high performance simulator for quantum circuits that includes noise models

https://qiskit.github.io/qiskit-aer/

Apache License 2.0


Accelerate MPS simulator by using cuQuantum #2112

Open doichanj opened 6 months ago

doichanj commented 6 months ago

What is the expected behavior?

The MPS simulation method (https://github.com/Qiskit/qiskit-aer/tree/main/src/simulators/matrix_product_state) is becoming important for simulating circuits with a large number of qubits. Currently the MPS method only supports device=CPU, and simulating circuits with a large number of qubits takes a long time.

cuQuantum (https://developer.nvidia.com/cuquantum-sdk) is an SDK for quantum computing that accelerates simulation on NVIDIA GPUs. It provides APIs for statevector simulation (cuStateVec) and tensor network simulation (cuTensorNet) on GPUs. Aer currently supports cuQuantum with method=statevector and method=tensor_network, but not with the MPS method.
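For reference, the configurations mentioned above can be written as option sets. This is a minimal sketch, assuming a GPU-enabled build of qiskit-aer for the GPU entries; the dictionary names and the `make_simulator` helper are made up for illustration, while `method`, `device` and `cuStateVec_enable` are existing `AerSimulator` options.

```python
# Option sets as plain dicts; each could be passed as AerSimulator(**opts).
# The dict names are illustrative only, not part of Aer.

# Statevector simulation on GPU, routed through cuStateVec.
STATEVECTOR_GPU = {
    "method": "statevector",
    "device": "GPU",
    "cuStateVec_enable": True,
}

# Tensor network simulation on GPU (this method relies on cuTensorNet).
TENSOR_NETWORK_GPU = {"method": "tensor_network", "device": "GPU"}

# MPS simulation: CPU is the only device supported at the time of this issue.
MPS_CPU = {"method": "matrix_product_state", "device": "CPU"}


def make_simulator(options):
    """Build an AerSimulator from an option dict.

    The import is deferred so this snippet loads even where
    qiskit-aer is not installed.
    """
    from qiskit_aer import AerSimulator

    return AerSimulator(**options)
```

The request in this issue amounts to making `"device": "GPU"` meaningful for the third option set.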

The MPS method can be accelerated with cuTensorNet; some code examples are available here: https://docs.nvidia.com/cuda/cuquantum/latest/cutensornet/examples.html

MozammilQ commented 5 months ago

anyone trying this one? :)

Randl commented 5 months ago

So if I understand correctly, what is required is to implement an additional set of MPS_Tensor methods (https://github.com/Qiskit/qiskit-aer/blob/9bfd1105daa30debc1ce1773bf1d3c0457f2195d/src/simulators/matrix_product_state/matrix_product_state_tensor.hpp#L50) that utilize cuQuantum, similar to, for example, the tensor network contractor? I think it would be good to discuss the architecture and design details before tackling this issue.

On the cuQuantum side, this example that uses cutensornetStateApplyTensorOperator and cutensornetStateFinalizeMPS is probably a good place to start building the relevant routines.

doichanj commented 5 months ago

I think Decompose is one of the most time-consuming kernels that can be accelerated by cuTensorNet: https://github.com/Qiskit/qiskit-aer/blob/9bfd1105daa30debc1ce1773bf1d3c0457f2195d/src/simulators/matrix_product_state/matrix_product_state_tensor.hpp#L593-L614
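To illustrate why this step is dominated by the SVD, here is a minimal NumPy sketch of a generic two-site MPS decomposition. It is not Aer's actual Decompose code: the function name, the index layout `theta[left, p1, p2, right]`, and the `max_bond` and `threshold` parameters are assumptions chosen for illustration.

```python
import numpy as np


def decompose_two_site(theta, max_bond=None, threshold=1e-16):
    """Split theta[left, p1, p2, right] into (left tensor, singular
    values, right tensor) using a truncated SVD."""
    dl, d1, d2, dr = theta.shape
    # Group (left, p1) as rows and (p2, right) as columns.
    matrix = theta.reshape(dl * d1, d2 * dr)
    # This SVD is the expensive part for large bond dimensions.
    u, s, vh = np.linalg.svd(matrix, full_matrices=False)
    # Drop negligible singular values, but always keep at least one.
    keep = max(int(np.sum(s > threshold)), 1)
    if max_bond is not None:
        keep = min(keep, max_bond)
    u, s, vh = u[:, :keep], s[:keep], vh[:keep, :]
    return u.reshape(dl, d1, keep), s, vh.reshape(keep, d2, dr)


# Two-qubit Bell state (|00> + |11>)/sqrt(2) with trivial outer bonds.
theta = np.zeros((1, 2, 2, 1), dtype=complex)
theta[0, 0, 0, 0] = theta[0, 1, 1, 0] = 1 / np.sqrt(2)

left, lam, right = decompose_two_site(theta)
# Contract the three factors back together to check the decomposition.
rebuilt = np.einsum("apk,k,kqb->apqb", left, lam, right)
```

The sketch does not renormalize after truncation; a full implementation would need to handle that.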

There is a wrapper implementation of SVD using LAPACK in svd.cpp. I think we can add a wrapper for GPU.
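One possible shape for such a wrapper is a single SVD entry point that dispatches on the device. The sketch below shows the idea in Python rather than Aer's C++; the registry, `register_svd_backend`, and `svd` are hypothetical names, and only a CPU backend (NumPy, which calls LAPACK) is registered. A GPU backend would be registered the same way.

```python
import numpy as np

# Hypothetical registry mapping a device name to an SVD implementation.
_SVD_BACKENDS = {}


def register_svd_backend(device, func):
    """Register func(matrix) -> (u, s, vh) for the given device name."""
    _SVD_BACKENDS[device.upper()] = func


def svd(matrix, device="CPU"):
    """Run the SVD with the backend registered for the device."""
    try:
        backend = _SVD_BACKENDS[device.upper()]
    except KeyError:
        raise ValueError(
            f"no SVD backend registered for device {device!r}"
        ) from None
    return backend(matrix)


def _svd_cpu(matrix):
    # NumPy delegates to LAPACK, like the existing CPU wrapper.
    return np.linalg.svd(matrix, full_matrices=False)


register_svd_backend("CPU", _svd_cpu)

a = np.array([[3.0, 0.0], [0.0, 4.0]])
u, s, vh = svd(a, device="CPU")
```

With this layout, callers such as the decomposition step stay unchanged when a new backend is added.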

MozammilQ commented 5 months ago

@Randl, I was working on this issue, and SVD has been my target since last night. But it's true that I have to learn cuQuantum, and your previous comment shows you have a better idea of cuQuantum. Now, who's going to do the PR? Waiting for your comment :) I will do as you say :)

Randl commented 5 months ago

Go on, feel free to work on it if you're already in progress.