NVIDIA / cuda-quantum

C++ and Python support for the CUDA Quantum programming model for heterogeneous quantum-classical workflows
https://nvidia.github.io/cuda-quantum/

Issue with MLIRGen and NVQ++ Linking #577

Open amccaskey opened 1 year ago

amccaskey commented 1 year ago

Take the phase_estimation.cpp example:

struct r1PiGate {
  void operator()(cudaq::qubit &q) __qpu__ { r1(1., q); }
};

int main() {

  for (auto nQubits : std::vector<int>{2, 4, 6, 8}) {
    auto counts = cudaq::sample(
        qpe{}, nQubits, [](cudaq::qubit &q) __qpu__ { x(q); }, r1PiGate{});
    auto mostProbable = counts.most_probable();
    double theta = cudaq::to_integer(mostProbable) / (double)(1UL << nQubits);
    auto piEstimate = 1. / (2 * theta);
    printf("Pi Estimate(nQubits == %d) = %lf \n", nQubits, piEstimate);
  }
}

As is, this works when compiled targeting the RemoteRESTQPU:

nvq++ --target quantinuum --emulate phase_estimation.cpp
./a.out

However, suppose I add a couple more kernel lambdas, even unused ones:


struct r1PiGate {
  void operator()(cudaq::qubit &q) __qpu__ { r1(1., q); }
};

int main() {

  auto statePrep = [](cudaq::qubit &q) __qpu__ { x(q); };
  auto oracle = [](cudaq::qubit &q) __qpu__ { r1(1., q); };

  for (auto nQubits : std::vector<int>{2, 4, 6, 8}) {
    auto counts = cudaq::sample(
        qpe{}, nQubits, [](cudaq::qubit &q) __qpu__ { x(q); }, r1PiGate{});
    auto mostProbable = counts.most_probable();
    double theta = cudaq::to_integer(mostProbable) / (double)(1UL << nQubits);
    auto piEstimate = 1. / (2 * theta);
    printf("Pi Estimate(nQubits == %d) = %lf \n", nQubits, piEstimate);
  }
}

This fails: no counts come back, because execution goes through Library Mode and the ExecutionManager. The Quake code is generated, but something appears to go wrong when overriding the original entry point function. Perhaps the signatures do not match.

1tnguyen commented 1 year ago

It looks like the issue is a discrepancy in the lambda mangling number between the regular clang action (EmitLLVMAction) and the bridge action.

In particular, for the qpe invocation:

  • EmitLLVMAction created this symbol:

_ZN3qpeclIZ4mainE3$_08r1PiGateEEviOT_OT0_

Demangled = void qpe::operator()<main::$_0, r1PiGate>(int, main::$_0&&, r1PiGate&&)

  • ASTBridgeAction created this symbol:

_ZN3qpeclIZ4mainE3$_28r1PiGateEEviOT_OT0_

Demangled = void qpe::operator()<main::$_2, r1PiGate>(int, main::$_2&&, r1PiGate&&)

The lambda [](cudaq::qubit &q) __qpu__ { x(q); } is named main::$_0 in the LLVM action vs. main::$_2 in our bridge.

I suspect some optimization is going on in the regular LLVM action: the statePrep and oracle lambdas are skipped because they are unused, so the x lambda gets the first index, 0.

schweitzpgi commented 11 months ago

Right. Unfortunately, the regular clang flow appears to be optimizing away the unused lambdas, which results in the lambdas' unique IDs being renumbered. The MLIR bridge isn't doing this, so the dead lambdas are still present and the unique IDs are different. Sigh.