PennyLaneAI / catalyst

A JIT compiler for hybrid quantum programs in PennyLane
https://docs.pennylane.ai/projects/catalyst
Apache License 2.0

[BUG] Compiled tutorial_qft.py with Qrack simulator #780

Closed WrathfulSpatula closed 4 months ago

WrathfulSpatula commented 5 months ago

Issue description


The compiled pennylane-qrack back end, run on tutorial_qft.py from PennyLaneAI/qml, should show a probability peak on bit string 3 in the third plot.

The tutorial with the Qrack compiled back end produces a peak on bit string 1.

The bug always reproduces.

Name: PennyLane
Version: 0.37.0.dev0
Summary: PennyLane is a cross-platform Python library for quantum computing, quantum machine learning, and quantum chemistry. Train a quantum computer the same way as a neural network.
Home-page: https://github.com/PennyLaneAI/pennylane
Author:
Author-email:
License: Apache License 2.0
Location: /home/iamu/qrack_venv/lib/python3.12/site-packages
Requires: appdirs, autograd, autoray, cachetools, networkx, numpy, pennylane-lightning, requests, rustworkx, scipy, semantic-version, toml, typing-extensions
Required-by: amazon-braket-pennylane-plugin, PennyLane-Catalyst, pennylane-qrack, PennyLane_Lightning, PennyLane_Lightning_Kokkos

Platform info: Linux-6.8.0-31-generic-x86_64-with-glibc2.39
Python version: 3.12.3
Numpy version: 1.26.4
Scipy version: 1.12.0
Installed devices:

Source code and tracebacks

pennylane-qrack tutorial_qft.py

From the tutorial, this is the section that seems to run into an issue:

def prep():
    """quantum function that prepares the state."""
    qml.PauliX(wires=0)
    for wire in range(1, 6):
        qml.Hadamard(wires=wire)
    qml.ControlledSequence(qml.PhaseShift(-2 * np.pi / 10, wires=0), control=range(1, 6))
    qml.PauliX(wires=0)

Additional information

Obviously, the Qrack back end is still under development and might have bugs. However, in the course of debugging, I thought to output the gates that Catalyst actually passes into the back end. In NamedOperation(), in the C++ back end implementation, I added std::cout statements to print the gate name, control wires, control values, target wires, and floating-point parameters. I am opening this issue because the compiled gates sent to the back end appear suspicious, to the point that I'm starting to suspect the bug isn't in the Qrack simulator back end.

I understand that the Catalyst compiler might reorder, compose, and decompose gates, in an attempt to improve performance. For example, even though the first PauliX gate is dispatched first in user code, it makes perfect sense that it could be moved after the series of Hadamard gates, in the snippet above.

However, there appear to be many repeated controlled phase gates on the same controls and targets with the same parameter value. For the state preparation portion of the circuit, there should only be 5 phase gates, unless they're being reordered to include gates from the QFT decomposition. Naively, I expect that most compilers would combine these variational gates into a single gate, though Catalyst might have a slightly different strategy. However, there appear to be far more controlled phase gates coming out of your compiler passes than are dispatched by user code, and these repeated gates still don't seem right. **I'm wondering if the bug is in how Catalyst compiles for the `.toml` of the Qrack back end.**

The (non-Catalyst) PyQrack back end results, by contrast, match those of your Lightning simulator, whether or not the Lightning simulator is compiled with Catalyst.

Below is an example of the output of the std::cout statements. Each gate is printed as its name, followed by control wires, control values, target wires, and then floating-point parameters.

DFT matrix for n = 2:

[[ 0.5+0.j   0.5+0.j   0.5+0.j   0.5+0.j ]
 [ 0.5+0.j   0. -0.5j -0.5-0.j  -0. +0.5j]
 [ 0.5+0.j  -0.5-0.j   0.5+0.j  -0.5-0.j ]
 [ 0.5+0.j  -0. +0.5j -0.5-0.j   0. -0.5j]]

inverse QFT matrix for n = 2:

[[ 0.5-0.j   0.5-0.j   0.5-0.j   0.5-0.j ]
 [ 0.5-0.j   0. -0.5j -0.5-0.j   0. +0.5j]
 [ 0.5-0.j  -0.5-0.j   0.5-0.j  -0.5-0.j ]
 [ 0.5-0.j   0. +0.5j -0.5-0.j   0. -0.5j]]

Device #0, Loaded binary from: /home/iamu/.qrack/qrack_ocl_dev_NVIDIA_GeForce_RTX_3080_Laptop_GPU.ir
Default platform: NVIDIA CUDA
Default device: #0, NVIDIA GeForce RTX 3080 Laptop GPU
OpenCL device #0: NVIDIA GeForce RTX 3080 Laptop GPU

Hadamard 6,
Hadamard 5,
Hadamard 4,
Hadamard 3,
Hadamard 2,
PauliX 1,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 3, true, 1, -0.628319,
ControlledPhaseShift 3, true, 1, -0.628319,
ControlledPhaseShift 3, true, 1, -0.628319,
ControlledPhaseShift 3, true, 1, -0.628319,
ControlledPhaseShift 3, true, 1, -0.628319,
ControlledPhaseShift 3, true, 1, -0.628319,
ControlledPhaseShift 3, true, 1, -0.628319,
ControlledPhaseShift 3, true, 1, -0.628319,
ControlledPhaseShift 4, true, 1, -0.628319,
ControlledPhaseShift 4, true, 1, -0.628319,
ControlledPhaseShift 4, true, 1, -0.628319,
ControlledPhaseShift 4, true, 1, -0.628319,
ControlledPhaseShift 5, true, 1, -0.628319,
ControlledPhaseShift 5, true, 1, -0.628319,
ControlledPhaseShift 6, true, 1, -0.628319,
PauliX 1,

Hadamard 6,
Hadamard 5,
Hadamard 4,
Hadamard 3,
Hadamard 2,
PauliX 1,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 2, true, 1, -0.628319,
ControlledPhaseShift 3, true, 1, -0.628319,
ControlledPhaseShift 3, true, 1, -0.628319,
ControlledPhaseShift 3, true, 1, -0.628319,
ControlledPhaseShift 3, true, 1, -0.628319,
ControlledPhaseShift 3, true, 1, -0.628319,
ControlledPhaseShift 3, true, 1, -0.628319,
ControlledPhaseShift 3, true, 1, -0.628319,
ControlledPhaseShift 3, true, 1, -0.628319,
ControlledPhaseShift 4, true, 1, -0.628319,
ControlledPhaseShift 4, true, 1, -0.628319,
ControlledPhaseShift 4, true, 1, -0.628319,
ControlledPhaseShift 4, true, 1, -0.628319,
ControlledPhaseShift 5, true, 1, -0.628319,
ControlledPhaseShift 5, true, 1, -0.628319,
ControlledPhaseShift 6, true, 1, -0.628319,
PauliX 1,

Hadamard 2,
ControlledPhaseShift 3, true, 2, 1.5708,
ControlledPhaseShift 4, true, 2, 0.785398,
ControlledPhaseShift 5, true, 2, 0.392699,
ControlledPhaseShift 6, true, 2, 0.19635,
Hadamard 3,
ControlledPhaseShift 4, true, 3, 1.5708,
ControlledPhaseShift 5, true, 3, 0.785398,
ControlledPhaseShift 6, true, 3, 0.392699,
Hadamard 4,
ControlledPhaseShift 5, true, 4, 1.5708,
ControlledPhaseShift 6, true, 4, 0.785398,
Hadamard 5,
ControlledPhaseShift 6, true, 5, 1.5708,
Hadamard 6,
SWAP 2, 6,
SWAP 3, 5,
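For reference, the repetition counts in the log above (16, 8, 4, 2, and 1 ControlledPhaseShift gates on control wires 2 through 6) are what one would expect if qml.ControlledSequence applies the base gate raised to decreasing powers of two and each power is emitted as repeated applications of the same gate. The following is a minimal Python sketch of that tally. The sample log lines, the helper name control_wire, and the offset of 1 between user wires and back-end labels are assumptions made for illustration; this is not how Catalyst itself processes the circuit.

```python
from collections import Counter

# Hypothetical excerpt of the back-end log, one gate per line:
# name, control wire, control value, target wire, parameter.
log_lines = (
    ["Hadamard 6,", "PauliX 1,"]
    + ["ControlledPhaseShift 2, true, 1, -0.628319,"] * 16
    + ["ControlledPhaseShift 3, true, 1, -0.628319,"] * 8
    + ["ControlledPhaseShift 4, true, 1, -0.628319,"] * 4
    + ["ControlledPhaseShift 5, true, 1, -0.628319,"] * 2
    + ["ControlledPhaseShift 6, true, 1, -0.628319,"] * 1
)


def control_wire(line):
    """Return the control wire label of a ControlledPhaseShift log line."""
    fields = line.split(None, 1)[1].split(",")
    return int(fields[0])


# Count how many times each control wire drives a ControlledPhaseShift.
counts = Counter(
    control_wire(line)
    for line in log_lines
    if line.startswith("ControlledPhaseShift")
)

# Expected counts if control i (0-based, of n) applies base ** (2 ** (n - 1 - i)).
# Back-end labels are assumed to be the user wires 1..5 shifted up by one.
n_controls = 5
expected = {2 + i: 2 ** (n_controls - 1 - i) for i in range(n_controls)}

print(dict(counts))
print(dict(counts) == expected, sum(counts.values()))
```

Under these assumptions the total is 31 gates per state preparation, rather than 5, without any reordering of the QFT gates.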

WrathfulSpatula commented 5 months ago

I have an update, as I work on the Qrack device. I just completed support for parsing general wire maps. The version of the Qrack device that I'm using is here: https://github.com/unitaryfund/pennylane-qrack/pull/4

In the same tutorial_qft.py, it turns out that the very first gate dispatched (which happens to be a Hadamard) is out of range. If I specify a wires argument of 6 to the device, the first Hadamard is dispatched on wire 11, which doesn't exist. If I instead change the device so that 1 qubit is allocated when no wires argument is supplied (which seems to be the expected default behavior) and don't supply a wires argument, then the first Hadamard qubit label overflows to seemingly "random" high numbers, likely from subtracting past label 0.

(Is there something I'm missing about how Catalyst expects devices to handle wrap-around on out-of-bounds labels, or is this genuinely a bug?)
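As an illustration of the "subtracting past label 0" hypothesis, here is a minimal Python sketch of how an unsigned label wraps around to a very large number. It assumes the label is stored in a 64-bit unsigned integer, which the thread does not confirm; wrap_unsigned is a hypothetical helper, not part of Catalyst or Qrack.

```python
def wrap_unsigned(value, bits=64):
    """Emulate unsigned integer arithmetic of the given width (assumed 64-bit)."""
    return value % (1 << bits)


# Subtracting 1 from label 0 wraps to the largest representable value.
label = wrap_unsigned(0 - 1)
print(label)

# A valid subtraction is unaffected.
print(wrap_unsigned(6 - 1))
```

A label like this would show up as an apparently random high qubit index if it were printed or used directly.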

dime10 commented 5 months ago

Hi Dan, let me look into this; it may be a genuine bug. Thanks for reporting it!

WrathfulSpatula commented 5 months ago

Disregard my additional comment above, by the way, but the original issue remains. The secondary issue was that I didn't realize the onus was on me not to double-allocate when parsing a wires argument in the C++ back end, but that part is sorted.

The best implementation of the Qrack device I have is here: https://github.com/unitaryfund/pennylane-qrack/tree/wires_and_endianness

The issue remains that the PyQrack back end matches the Lightning back end for output from tutorial_qft.py, while the C++ Qrack Catalyst back end does not. I think some attention was needed to endianness, but the output peaks on Hilbert space dimension, unsigned integer 3, in the last plot in the tutorial, but the reason for this seems to be the suspicious phase gates.

It's simple to pipe PennyLane gates to std::cout, to see what Catalyst is dispatching to the back end. I'm happy to provide a concrete example.

WrathfulSpatula commented 5 months ago

@dime10 You can rest easy: while the controlled phase gates seem less than optimal, the remaining bug was in my (confused) endianness conventions for outputs. I have fixed the issue. As far as I'm concerned, this ticket is closed by your current PR, waiting for review.
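To make the endianness point concrete, here is a minimal Python sketch of converting a probability vector between the two qubit-ordering conventions by reversing the bits of each basis-state index. It is only an illustration of the general idea; reverse_bits and swap_endianness are hypothetical helpers, and this is not the actual fix applied in pennylane-qrack.

```python
def reverse_bits(index, n_bits):
    """Reverse the lowest n_bits bits of index."""
    result = 0
    for _ in range(n_bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def swap_endianness(probs):
    """Reorder a length-2**n probability vector to the opposite bit order."""
    n_bits = (len(probs) - 1).bit_length()
    out = [0.0] * len(probs)
    for i, p in enumerate(probs):
        out[reverse_bits(i, n_bits)] = p
    return out


# A 3-qubit distribution peaked on index 1 (binary 001).
probs = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
swapped = swap_endianness(probs)
print(swapped.index(1.0))
```

Applying swap_endianness twice returns the original vector, so a peak reported at the wrong index can come from applying the conversion once too often or not at all.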

dime10 commented 5 months ago

> @dime10 You can rest easy: while the controlled phase gates seem less than optimal, the remaining bug was in my (confused) endianness conventions for outputs. I have fixed the issue. As far as I'm concerned, this ticket is closed by your current PR, waiting for review.

That's great to hear!

I would still like to make sure that the phase gates you are concerned about are supposed to be there. They might very well come from the QFT decomposition (we can see that it's implemented as Hadamards and ControlledPhaseGates), but I thought the Qrack device supports the QFT operation natively? If so, I wouldn't expect it to be decomposed.

> Naively, I expect that most compilers would combine these variational gates into a single gate, though Catalyst might have a slightly different strategy.

Peephole optimizations on quantum circuits (and other circuit optimization techniques) are actually not well built-out yet in Catalyst, although we do plan to build up a library of such optimizations over time. So this is not too surprising at the moment.
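As a rough illustration of the kind of peephole optimization discussed here, the following Python sketch fuses runs of adjacent phase gates that act on the same control and target by summing their angles, which is valid because successive phase shifts on the same wires compose additively. The tuple layout, the ADDITIVE_GATES set, and merge_phase_gates are hypothetical; this is not Catalyst's implementation or a pass it currently provides.

```python
import math

# Gates whose parameters add when applied back to back on the same wires
# (an assumption made for this sketch).
ADDITIVE_GATES = {"PhaseShift", "ControlledPhaseShift"}


def merge_phase_gates(gates):
    """Fuse adjacent (name, control, target, angle) gates with identical wires."""
    merged = []
    for name, control, target, angle in gates:
        previous = merged[-1] if merged else None
        if (
            previous is not None
            and name in ADDITIVE_GATES
            and previous[:3] == (name, control, target)
        ):
            # Same gate on the same wires: accumulate the angle.
            merged[-1] = (name, control, target, previous[3] + angle)
        else:
            merged.append((name, control, target, angle))
    return merged


theta = -2 * math.pi / 10
gates = (
    [("ControlledPhaseShift", 2, 1, theta)] * 16
    + [("ControlledPhaseShift", 3, 1, theta)] * 8
)
fused = merge_phase_gates(gates)
print(len(fused))
```

With the repeated gates from the log, the 24 gates in this example collapse to 2, one per control wire, with angles of 16 and 8 times the base angle.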

WrathfulSpatula commented 5 months ago

> I would still like to make sure that the phase gates you are concerned about are supposed to be there. They might very well come from the QFT decomposition (we can see that it's implemented as Hadamards and ControlledPhaseGates), but I thought the Qrack device supports the QFT operation natively? If so, I wouldn't expect it to be decomposed.

Qrack could execute the QFT natively, but I switched it over to a decomposition in the manifest to debug other issues with this and other PennyLane tutorials. Once I got to the bottom of all my bugs, I considered putting the native QFT back, but I figure it's actually better to let Catalyst decompose it into smaller gates with which the compiler can work. So, for now, the QFT is decomposed by Catalyst.