NVIDIA / cuda-quantum

C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows
https://nvidia.github.io/cuda-quantum/

[RFC] CircuitSimulator Refactor #11

Closed amccaskey closed 1 year ago

amccaskey commented 1 year ago

Background

The past month has given us a lot of feedback / requirements on the extensible NVQIR CircuitSimulator type. After implementing the MGPU and TensorNet backends, there are a few changes I'd like to propose for the CircuitSimulator type that will make it easier for new simulation contributions later on.

Gate API

CircuitSimulator currently enumerates pure-virtual methods for all operations defined in our MLIR dialects. There is no need for this, since every simulation backend only requires the gate matrix data and the control and target qubit indices. The first proposal is to make these methods non-virtual (i.e. implemented only on CircuitSimulator) so that subtypes need not implement them. They will remain for now to minimize changes in the NVQIR driver code. The update will provide a protected pure-virtual method, applyGate(const GateApplicationTask&), that subtypes implement to effect evolution of the state in a manner specific to the subtype's simulation strategy. Here, GateApplicationTask will be a protected nested struct that contains the matrix, controls, and targets.

We should also add a public method for invoking a custom quantum operation, i.e. one where we only have the matrix data, controls, and targets.
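
To make the shape of this proposal concrete, here is a minimal sketch of how named gates could forward to a single customization point. It is an illustration under assumptions, not the NVQIR implementation: SimulatorSketch and RecordingBackend are hypothetical names, the matrix is assumed to be row-major, and the backend only records what it receives instead of evolving a state.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for the proposed base class.
template <typename ScalarType>
class SimulatorSketch {
protected:
  // Everything a backend needs to apply a gate.
  struct GateApplicationTask {
    std::vector<std::complex<ScalarType>> matrix;
    std::vector<std::size_t> controls;
    std::vector<std::size_t> targets;
  };

  // The single method a backend has to implement.
  virtual void applyGate(const GateApplicationTask &task) = 0;

public:
  virtual ~SimulatorSketch() = default;

  // General entry point: only matrix data, controls, and targets are given.
  void applyCustomOperation(const std::vector<std::complex<ScalarType>> &matrix,
                            const std::vector<std::size_t> &controls,
                            const std::vector<std::size_t> &targets) {
    // Assumption: a gate on n targets carries a 2^n x 2^n row-major matrix.
    const std::size_t dim = std::size_t{1} << targets.size();
    assert(matrix.size() == dim * dim && "matrix size must match target count");
    applyGate(GateApplicationTask{matrix, controls, targets});
  }

  // Named gates stay on the base class as non-virtual conveniences.
  void x(const std::vector<std::size_t> &controls, std::size_t target) {
    using C = std::complex<ScalarType>;
    applyCustomOperation({C(0), C(1), C(1), C(0)}, controls, {target});
  }
};

// Hypothetical backend that records the last task it was asked to apply.
class RecordingBackend : public SimulatorSketch<double> {
public:
  std::size_t gateCount = 0;
  std::vector<std::complex<double>> lastMatrix;
  std::vector<std::size_t> lastControls;
  std::vector<std::size_t> lastTargets;

protected:
  void applyGate(const GateApplicationTask &task) override {
    ++gateCount;
    lastMatrix = task.matrix;
    lastControls = task.controls;
    lastTargets = task.targets;
  }
};
```

With this layout, a call such as backend.x({0}, 1) on a RecordingBackend reaches applyGate exactly once with the Pauli-X matrix, control 0, and target 1, so a new backend only has to supply applyGate.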

Mid-circuit measurement register naming

One bug that has arisen is the fact that the CircuitSimulator was only storing mid-circuit measurement data for circuits that had conditional statements on qubit measurements. This should be an easy thing to fix with an internal private GateApplicationTask queue. Here's the example that currently does not work as expected but will with the introduction of a queue.

auto qubits = entryPoint.qalloc(2);
entryPoint.x(qubits[0]);
entryPoint.mz(qubits[0], "c0");
entryPoint.x(qubits[1]);
entryPoint.mz(qubits[1], "c1");
auto counts = cudaq::sample(entryPoint);

The measurement results are currently not stored to the c0 and c1 registers. By introducing a queue on the CircuitSimulator and enqueuing a task for each quantum operation invocation, we give ourselves an opportunity to flush the queue at specific points in the simulation, like the first mz call above. At these flush points we are free to sample and persist the results according to the register name given by the programmer.
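
The flush-on-measure idea can be sketched with a deliberately tiny toy model. Everything here is an assumption made for illustration: QueuedSimulatorSketch is a hypothetical class, each qubit is modeled as a classical bit, and the only gate is X (a bit flip). The point is only the control flow: gates are enqueued, and a measurement flushes the queue before recording its result under the requested register name.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <vector>

// Hypothetical toy model, not the NVQIR simulator.
class QueuedSimulatorSketch {
  struct Task {
    std::size_t target; // the only gate in this toy model is X on one qubit
  };

  std::queue<Task> gateQueue;
  std::vector<bool> bits; // classical stand-in for the quantum state
  std::map<std::string, std::vector<bool>> registers;

  // Apply every pending task in order, leaving the queue empty.
  void flushGateQueue() {
    while (!gateQueue.empty()) {
      const std::size_t q = gateQueue.front().target;
      bits[q] = !bits[q];
      gateQueue.pop();
    }
  }

public:
  explicit QueuedSimulatorSketch(std::size_t nQubits) : bits(nQubits, false) {}

  // Gate calls only enqueue work; nothing is applied yet.
  void x(std::size_t q) {
    assert(q < bits.size());
    gateQueue.push(Task{q});
  }

  // A measurement is a flush point: apply pending gates, then persist the
  // result under the register name given by the programmer.
  bool mz(std::size_t q, const std::string &registerName) {
    assert(q < bits.size());
    flushGateQueue();
    const bool result = bits[q];
    registers[registerName].push_back(result);
    return result;
  }

  std::size_t pendingGates() const { return gateQueue.size(); }

  const std::map<std::string, std::vector<bool>> &measurementRegisters() const {
    return registers;
  }
};
```

Replaying the example above on this toy model (x on qubit 0, mz into "c0", x on qubit 1, mz into "c1") leaves one recorded result in each of the two named registers, which is the behavior the queue is meant to enable.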

Handle One-Time Static Init (e.g. MPI)

Another issue arises for simulation strategies that can leverage MPI. Here we need some kind of one-time initialization and finalization capability so that MPI_Init() and MPI_Finalize() can run. There are a few subtleties: this can only happen once, and in Python you could envision someone calling set_qpu(...) multiple times targeting different MPI-enabled backends.

A potential solution to this issue is to have MPI-enabled backends wrap MPI initialization and finalization in conditional statements that check if MPI has been initialized already.
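
A minimal sketch of such guards follows. To keep it compilable without an MPI installation, the MPI runtime is replaced by a hypothetical FakeMpi struct; in a real backend the four member functions would presumably wrap MPI_Initialized, MPI_Init, MPI_Finalized, and MPI_Finalize. The helper names initializeOnce and finalizeOnce are also assumptions made for this illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the MPI runtime. As in MPI, "initialized" stays
// true after finalization, and initializing twice is an error.
struct FakeMpi {
  bool initialized = false;
  bool finalized = false;
  int initCalls = 0;
  int finalizeCalls = 0;

  bool isInitialized() const { return initialized; } // cf. MPI_Initialized
  bool isFinalized() const { return finalized; }     // cf. MPI_Finalized
  void init() {                                      // cf. MPI_Init
    assert(!initialized && "MPI may only be initialized once");
    initialized = true;
    ++initCalls;
  }
  void finalize() {                                  // cf. MPI_Finalize
    assert(initialized && !finalized);
    finalized = true;
    ++finalizeCalls;
  }
};

// Initialize only if nobody (e.g. another backend) has done so already.
template <typename Mpi>
void initializeOnce(Mpi &mpi) {
  if (!mpi.isInitialized())
    mpi.init();
}

// Finalize only if initialization happened and finalization has not.
template <typename Mpi>
void finalizeOnce(Mpi &mpi) {
  if (mpi.isInitialized() && !mpi.isFinalized())
    mpi.finalize();
}
```

With guards like these, repeated set_qpu(...) calls that each trigger backend setup and teardown would reach the underlying initialization and finalization at most once. Note that MPI cannot be re-initialized after it has been finalized, so a backend selected after finalization would still need separate handling.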

Proposed CircuitSimulator Structure

Here we show the sub-type pertinent parts of the CircuitSimulator update.

using namespace cudaq;
namespace nvqir {
class CircuitSimulator {
public:
  // ... public methods for NVQIR ...
};

template<typename ScalarType>
class CircuitSimulatorBase : public CircuitSimulator {
  protected:
    struct GateApplicationTask {
      const std::vector<std::complex<ScalarType>> matrix;
      const std::vector<std::size_t> controls;
      const std::vector<std::size_t> targets;
      GateApplicationTask(const std::vector<std::complex<ScalarType>> &m,
                          const std::vector<std::size_t> &c,
                          const std::vector<std::size_t> &t)
        : matrix(m), controls(c), targets(t) {}
    };

    std::queue<GateApplicationTask> gateQueue;

    /// This must be implemented by subclasses to evolve the state
    virtual void applyGate(const GateApplicationTask &task) = 0;

    /// Noise-capable simulators can apply all Kraus channels defined in the
    /// provided noise model on the given operation / qubits
    virtual void applyNoiseChannel(const std::string_view gateName,
                                   const std::vector<std::size_t> &qubits);

    /// Increase the state by 1 qubit
    virtual void addQubitToState() = 0;
    /// Zero-out / clear the current state, takes state back to nQubits = 0
    virtual void resetQubitStateImpl() = 0;

    /// Subclass specific measurement of the specified qubit. 
    virtual bool measureQubit(const std::size_t qubitIdx) = 0;

  public:
    /// Public method for invoking a general, custom operation
    void applyCustomOperation(const std::vector<std::complex<ScalarType>> &matrix,
                              const std::vector<std::size_t> &controls,
                              const std::vector<std::size_t> &targets);

    /// Subclasses can implement spin_op observation
    virtual ExecutionResult observe(const cudaq::spin_op &term);

    /// Subclasses can implement state sampling
    virtual ExecutionResult
    sample(const std::vector<std::size_t> &qubitIdxs, const int shots) = 0;

    /// Return the name of this simulator
    virtual std::string name() const = 0;

    /// For the Python API, we need the ability to create
    /// a clone of a simulator we currently have a handle on.
    virtual CircuitSimulator *clone() = 0;
};
} // namespace nvqir

pranavdurai10 commented 1 year ago

Hi, @amccaskey. I'd like to work on this.

amccaskey commented 1 year ago

@pranavdurai10 This work has already been completed.