Open ajavadia opened 2 years ago
This issue is a perfect place to start the discussion on possible implementations of "lazy gate modifiers" that @jakelishman and I have been thinking about.
All comments are highly welcome.
Before describing possible implementations, here are a few things to keep in mind:
First, we would like the changes to be mostly transparent to the users, that is, we do not want any existing code to start breaking, yet we do want to automatically get the most benefit from whatever the lazy gates have to offer. Personally, I don't know how to perfectly achieve both at the same time.
Second, we must make sure that the changes do not introduce bugs. At first, it seemed that the easiest implementation would be to simply add extra fields (such as `num_control_qubits` and `is_inverted`) to the `Gate`, `Instruction` or `Operation` class; however, this idea now seems absolutely terrible, as most transpiler code would just silently ignore these fields, resulting in incorrect optimizations. In other words, we must make sure that every place in the transpiler that needs to handle lazy gate modifiers does so.
Third, we do not want lazy gate modifiers to be part of the transpiler output, so somewhere in the transpiler flow we must stop procrastinating (i.e. remove the lazy modifiers).
In what follows I will refer to the preliminary code in https://github.com/alexanderivrii/qiskit-terra/pull/30.
The proposed solution is to create a new class `LazyOp` that inherits from `Operation` and that stores `inverse`, `control` and `power` modifiers (note: the current code in the linked PR only handles control and inverse for now). One immediate point is that it probably has to be an `Operation` and not just a `Gate`, since we want to reason about lazy inverses of Cliffords (which are not `Gate`s), and since `Operation` is the basic object that can be appended to a quantum circuit. Another point is that the inverse, control, and power modifiers commute, i.e. `inverse-(control-U) = control-(inverse-U)`, so it is easy to store modifiers canonically, i.e. `inverse-control(2)-inverse-control(3)-U` is the same as `control(5)-U`. @jakelishman prefers a slightly more general implementation (like a list of modifiers). The `base_op` attribute tracks the "base gate", similarly to the `ControlledGate` implementation.
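To make the shape concrete, here is a minimal pure-Python sketch of the more general list-of-modifiers variant. The names `LazyOpSketch` and `Modifier` are hypothetical and this is not the PR's actual code; a plain string stands in for the base operation.

```python
# Hypothetical sketch of a LazyOp-like container that records modifiers
# instead of synthesizing them eagerly.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Modifier:
    kind: str          # "inverse", "control" or "power"
    value: int = 1     # number of control qubits, or the power exponent

@dataclass
class LazyOpSketch:
    base_op: object                      # the underlying gate/operation
    modifiers: list = field(default_factory=list)

    def lazy_inverse(self):
        # Record the modifier; do not compute any inverse definition here.
        return LazyOpSketch(self.base_op, self.modifiers + [Modifier("inverse")])

    def lazy_control(self, num_ctrl_qubits=1):
        # Record the modifier; synthesis is deferred to a transpiler pass.
        return LazyOpSketch(self.base_op,
                            self.modifiers + [Modifier("control", num_ctrl_qubits)])
```

For example, `LazyOpSketch("qft").lazy_control(2).lazy_inverse()` leaves the base op untouched and only accumulates `[control(2), inverse]`.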
A very important question is how to create circuits with lazy ops. In the preliminary code, I have added new methods `lazy_inverse` and `lazy_control` to `Operation` (and also `lazy_control` to `QuantumCircuit`). That is, any existing code should work exactly as before and nothing should break, yet there will be no benefit either. If we want more, then we should explicitly call `lazy_inverse` instead of `inverse` and `lazy_control` instead of `control`, and make sure to handle the resulting circuits with lazy gates.
There is also a new `UnrollLazy` transpiler pass that removes the lazy gate modifiers in exactly the same way that Qiskit already handles inverse and controlled gates/circuits. For instance, given a quantum circuit `qc`, we should get the same quantum circuit from `qc.control(2)` and from `UnrollLazy()(qc.lazy_control(2))`. The procedure is recursive (like in all other transpiler passes), first handling gates in the definitions of complex custom gates, then proceeding to the gates themselves. `UnrollLazy` does nothing if there are no lazy gates (in the recursive expansion). It is worth noting that an important difference between `ControlledGate` and `LazyOp` is that some expansion of a controlled gate happens immediately when the gate is constructed (e.g., `ControlledGate` uses the `Unroller` transpiler pass to remove recursive `ControlledGate`s), while `UnrollLazy` does this during transpilation.
I have not added `UnrollLazy` to any of the preset pass managers; in practice (since most of the transpiler passes don't know what to do with `LazyOp`s), `UnrollLazy` should be one of the first passes to run on a circuit with lazy ops.
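As a toy illustration of the recursive removal (plain strings and tuples stand in for Qiskit objects; none of these names exist in Qiskit): a "lazy" node is `("inverse", inner)` or `("control", n, inner)`, and anything else is treated as an already-concrete gate.

```python
# Toy sketch of recursive unrolling of lazy modifiers: resolve the inner
# (possibly lazy) operation first, then apply the "real" modifier to it.
def unroll_lazy(node):
    """Recursively strip lazy modifiers, innermost first."""
    if isinstance(node, tuple) and node[0] == "inverse":
        # Unroll the inner op, then apply a concrete inverse to the result.
        return "inv(" + unroll_lazy(node[1]) + ")"
    if isinstance(node, tuple) and node[0] == "control":
        # Unroll the inner op, then apply a concrete n-qubit control.
        return "ctrl[%d](" % node[1] + unroll_lazy(node[2]) + ")"
    return node  # no lazy modifiers anywhere: nothing to do
```

A concrete gate passes through unchanged, matching the "does nothing if there are no lazy gates" behavior described above.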
I have also added a preliminary optimization pass `OptimizeLazy` for circuits with lazy gates. My main motivating example (see code in `test/python/transpiler/test_lazy_passes`) is the controlled-QFT adder. The circuit is of the form `control-[QFT -- C -- inverse-QFT]`, where `C` is some given subcircuit. The optimization suggested by @ajavadia allows replacing it by `QFT -- control-C -- inverse-QFT`, which in practice leads to significantly fewer gates after transpilation. If we build the controlled-QFT adder above by calling `lazy_control` and `lazy_inverse`, the `OptimizeLazy` pass is able to optimize the circuit. (Note that `LazyOp` also has a `to_matrix` method, allowing us to construct `Operator`s from quantum circuits with lazy gates and to compare the results.) However, if we build the controlled-QFT adder as usual (using `inverse` and `control`), no optimization takes place.
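The rewrite rests on the observation that when the control is off, `QFT` and `inverse-QFT` cancel, so only `C` needs the control. A hypothetical sketch with op names as plain strings (where `"inv-X"` denotes the inverse of `"X"`):

```python
# Sketch of the conjugation rewrite: controlling U -- C -- U_inverse only
# requires controlling C, since U and U_inverse cancel when the control is off.
def control_conjugated_block(block, num_ctrl):
    """block is a list of op names; returns the controlled version."""
    first, *middle, last = block
    if last == "inv-" + first:
        # Control only the conjugated middle part.
        return [first] + ["ctrl[%d]-%s" % (num_ctrl, op) for op in middle] + [last]
    # Fallback: control every op (correct, since ctrl(UV) = ctrl(U)ctrl(V),
    # but generally more expensive after synthesis).
    return ["ctrl[%d]-%s" % (num_ctrl, op) for op in block]
```

On `["QFT", "C", "inv-QFT"]` this controls only `C`; on a non-conjugated block it falls back to controlling every op.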
Another point is that we now need better methods to detect whether a gate is equal to another gate, or is an inverse of another gate; in particular, that `LazyOp(G, num_control_qubits=0, inverted=True)` is an inverse of `G`. I have added some (bad) preliminary code in `qiskit.circuit.inverse`. We also need a better way to check whether two gates are equal.
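A minimal sketch of the kind of check needed, under an assumed encoding where a gate is a `(name, inverted)` pair, with a hand-maintained table of native inverse pairs (all names illustrative, not Qiskit API):

```python
# Illustrative subset of natively-inverse gate pairs (S and Sdg).
NATIVE_INVERSES = {("s", "sdg"), ("sdg", "s")}

def is_inverse_pair(a, b):
    """True if gate a is the inverse of gate b; gates are (name, inverted)."""
    name_a, inv_a = a
    name_b, inv_b = b
    if name_a == name_b:
        return inv_a != inv_b          # G vs lazy-inverse(G)
    if (name_a, name_b) in NATIVE_INVERSES:
        return inv_a == inv_b          # e.g. SGate vs SdgGate
    return False
```

Note that `("s", True)` and `("sdg", False)` are equal (both mean S-dagger), not inverses, and the sketch reflects that.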
Continuing the above discussion, as per @jakelishman's suggestion, I was also experimenting with what may go wrong if we replace all calls to `inverse` by `lazy_inverse` and all calls to `control` by `lazy_control`. (This significantly more hacky implementation is here: https://github.com/alexanderivrii/qiskit-terra/pull/31.)
The first thought is that if a gate implements its own `inverse` method (like the inverse of `SGate` is `SdgGate`, or the inverse of `PermutationGate` is another `PermutationGate`), then we probably want to use this `inverse` method instead of creating a lazy gate. However, in order to unroll lazy gates, we still need to be able to call the "real" inverse method.
On the positive side, the above example with a controlled-QFT adder is optimized and transpiled without explicitly changing the adder implementation.
On the negative side, the change leads to a huge number of failing Qiskit tests, so this is probably not the direction we can follow in practice. In any case, let me describe some of the failure causes, in no particular order.
A huge number of "controlled gate" tests fail because we now have a `LazyOp` instead of a `ControlledGate` / explicit subclasses of `ControlledGate` (such as `CCX`) sitting in the circuit. Some tests check that the `definition` of a `LazyOp` matches the expected definition; I have added a `definition` method to `LazyOp`, however in retrospect I think that `LazyOp` should not have a `definition` method. We also have tests of the form `self.assertEqual(gate.inverse().definition, gate_inverse.definition)`.
A huge number of visualization tests fail, since the output drawing contains a box that says "Lazy" instead of the expected drawing of the `CCX` gate. These can be fixed by calling `UnrollLazy` before drawing. Other places that similarly require `UnrollLazy` include the qpy code.
Things like `ZGate().control(3).c_if(cr, 1)` do not work, because lazy ops don't support conditionals. I have not thought through whether it's always safe to push `c_if` down to the `base_op`. But I believe this should be handled by @jakelishman's developments on dynamic circuits.
We have tests that check that an error is raised when one tries to compute the inverse of a quantum circuit with a measure or a conditional gate; the lazy `inverse` modifier does not trigger such errors (at the time the inverse is added).
`LazyOp`s do not have `label`s (I may have added these to reduce the number of failures, but I don't think we want this).
`LazyOp`s do not have `params`. Interestingly, `ControlledGate` has `params` that point to those of its base gate. I am not sure this should be the case for `LazyOp`s. But I am hoping that @jakelishman's work on decoupling params from the gate itself may take care of these issues.
The `gate.decompose()` method has to be changed to handle `LazyOp`s, but that's easy.
Many tests fail because it's harder to detect that `gate.inverse().control()` is the same as `gate.control().inverse()` when `gate.inverse()` is the gate's native inverse method, while `gate.control().inverse()` is implemented using modifiers. For instance, we need to reliably detect that `LazyOp(SGate(), num_controls, inverted=True)` is the same as `LazyOp(SdgGate(), num_controls, inverted=False)`.
Many tests in `test.python.algorithms` are failing at the moment because the circuit being built contains a lazy inverse (called from the `compute_uncompute` code), but that happens later in the flow (the blueprint circuits are built using `_build`), and I was not able to figure out when it is best to call the `UnrollLazy` transpiler pass. One solution is to create "real" inverses, but this may hide possible optimizations.
Thanks for the detailed write up @alexanderivrii ! A few thoughts upon first reading, but I'm excited to discuss further!
> Third, we do not want lazy gate modifiers to be part of the transpiler output, so somewhere in the transpiler flow we must stop procrastinating (i.e. remove the lazy modifiers).
I don't necessarily disagree, but I wonder if this is strictly true. Simulators can have efficient implementations of e.g. controlled gates which, right now, we handle with specific instruction types, but I don't immediately see why these new modifiers couldn't make it into transpiler output.
> The proposed solution is to create a new class `LazyOp` that inherits from `Operation` and that stores inverse, control and power modifiers.
Minor point on naming, "lazy" might be confusing to users, as it's only really "lazy" from the perspective of the compiler. That is, this isn't an operation that is applied lazily (as in, scheduled as-late-as-possible). The central difference I think from a user perspective is that it is a gate that's specified by a behavior rather than an implementation, so maybe something along the lines of abstract/high-level/behavioral operation?
> Another point is that the inverse, control, and power modifiers commute, i.e. `inverse-(control-U) = control-(inverse-U)`, so it's easy to store modifiers canonically, i.e. `inverse-control(2)-inverse-control(3)-U` is the same as `control(5)-U`.
This is a nice property to have, but I think the best place for this canonicalization is within the transpiler (as opposed to on the circuit). For one, this is analysis the transpiler will need to do anyway, but also, canonicalizing early prevents the user from reading back from the circuit whether they built a ctrl-power-inverse or a power-inverse-ctrl.
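For concreteness, a sketch of such a transpiler-side canonicalization over an assumed modifier encoding (`"inv"` or `("ctrl", n)`, both hypothetical): since inverse commutes with control and inverses cancel in pairs, the modifier list collapses to at most one inverse plus one combined control.

```python
# Sketch: canonicalize a modifier list using commutation and cancellation.
def canonicalize(modifiers):
    """Collapse a sequence of "inv" / ("ctrl", n) modifiers to canonical form."""
    # Controls on distinct new qubits just add up.
    n_ctrl = sum(m[1] for m in modifiers if isinstance(m, tuple) and m[0] == "ctrl")
    # Inverses cancel in pairs, so only the parity matters.
    inverted = sum(1 for m in modifiers if m == "inv") % 2
    canon = []
    if inverted:
        canon.append("inv")
    if n_ctrl:
        canon.append(("ctrl", n_ctrl))
    return canon
```

This reproduces the example from the original post: `inverse-control(2)-inverse-control(3)` canonicalizes to `control(5)`.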
> The `base_op` tracks the "base gate", similarly to the `ControlledGate` implementation.
Is the existing `ControlledGate` compatible with this proposal if its gate synthesis is handled in the transpiler rather than at construction, or would it have to be updated? Along the same lines, is there more you can say about the motivation to have a single class for all modifiers, as opposed to one class per modifier type? At a guess, I wouldn't expect many gates to have more than a handful of modifiers, so I would expect the performance to be comparable, and the latter seems a little more explicit to me.
> ... explicitly call `lazy_inverse` instead of `inverse` ...
Would the long-term plan be to keep both `inverse` and `lazy_inverse` indefinitely, or to remove the former in favor of the latter?
> There is also a new `UnrollLazy` transpiler pass ...
Is this something that could be handled by the high-level synthesis pass? I would think the overall structure (and the overall ability to dispatch to more than one synthesis method) would be similar for these.
> I have also added a preliminary optimization pass `OptimizeLazy` for circuits with lazy gates...
Nice! @ajavadia will be excited :). Is there a path to generalizing this? It seems like a general pattern for controlling any subcircuit where a gate-inverse pair wraps another gate, like `control-[FOO -- BAR -- inverse-FOO]` => `FOO -- control-BAR -- inverse-FOO`.
> Another point is that we now need better methods to detect if a gate is equal to another gate...
This is also a good point. So far, we have relied upon `Instruction.__eq__` to know if two `Instruction`s are equal (here meaning they have the same representation), and `Operation(...).{__eq__,equiv}` to know if two `Instruction`s (or subcircuits, ...) are equivalent (meaning they have the same effect). For identifying inverse pairs, for non-parameterized gates which can return their inverse type, it seems straightforward. For parameterized gates (like `U2Gate`), one option would be to have their `inverse` methods naturally return parameters, e.g. mod 2pi, as makes sense.
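A sketch of the mod-2pi comparison such an `inverse` convention would rely on (`angles_equivalent` is a made-up helper, not a Qiskit API): two angle parameters are treated as equal whenever they differ by a multiple of 2π.

```python
# Sketch: compare gate angle parameters modulo 2*pi, so that an inverse
# returning -theta matches one returning 2*pi - theta.
import math

def angles_equivalent(a, b, tol=1e-10):
    diff = (a - b) % (2 * math.pi)
    # The residue may land near 0 or near 2*pi for equivalent angles.
    return math.isclose(diff, 0.0, abs_tol=tol) or \
           math.isclose(diff, 2 * math.pi, abs_tol=tol)
```

For instance, `-0.5` and `2*pi - 0.5` are equivalent under this check, while `0.3` and `0.4` are not.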
> The first thought is that if a gate implements its own `inverse` method (like the inverse of `SGate` is `SdgGate` or the inverse of `PermutationGate` is another `PermutationGate`), then we probably want to use this `inverse` method instead of creating a lazy gate. However, in order to unroll lazy gates, we still need to be able to call the "real" inverse method.
Thanks for compiling the comprehensive list of things this would break :). Some of these I think we will need to resolve regardless, but I wouldn't rely on this too heavily to guide how to define the interface. That is, if tests are failing because they're expecting different types, or e.g. checking for a `.definition` property we're not planning to support, it may be that the tests need updating, or in the worst case we need to define a deprecation and migration path, but that may be worthwhile if the end state is better than the alternative.
These are all great questions/suggestions, @kdk. I see multiple ways to improve my current implementation. I am now also convinced that it would be better to (1) keep a list of modifiers and have a "canonicalization" optimization for `LazyOp`s; (2) merge `UnrollLazy` into `HighLevelSynthesis`. However, I don't have the answer to your main question of how exactly we get from where we are now to where we want to go, especially in view of @jakelishman's work on decoupling gates from parameters. What is the best format to continue the discussions: here? On Slack? Some form of RFC?
On the one hand, I very much like the idea that the high-level-synthesis pass should handle both abstract mathematical objects (e.g., `LinearFunction`s, `Clifford`s, `PermutationGate`s) and lazily controlled gates (e.g., the QFT circuit lazily controlled by 2 qubits), as this would allow the flexibility to choose the best synthesis method for a given object.
On the other hand, we have plans to extend high-level synthesis to support the coupling map (and more generally the `Target`), e.g. allowing architecture-aware synthesis algorithms to be applied for `LinearFunction`s or `Clifford`s.
Here is where things get confusing. Suppose we have a `LinearFunction` lazily controlled by 2 qubits. The whole object is not a linear function, but it can be synthesized by first synthesizing the "base" linear function using whichever linear-function synthesis algorithm, then adding control to the synthesized circuit. This is still fine, but suppose that we also want to support the coupling map...
I am thinking that we can have a high-level-synthesis option like `force_coupling_map`. If this is `False`, then high-level synthesis is not required to adhere to the connectivity; it merely makes a best effort to do so -- this would be useful before layout/routing. If this is `True`, then high-level synthesis is required to adhere to the connectivity, raising an error if it cannot do so -- this would be useful for collecting and resynthesizing after layout/routing (and at that point we are not likely to have lazily controlled gates in the circuits).
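A sketch of the intended semantics (`force_coupling_map` is a proposed option, not an existing Qiskit parameter, and `check_connectivity` is a made-up helper): best-effort mode reports whether the synthesized circuit conformed, while strict mode raises.

```python
# Sketch: best-effort vs. strict coupling-map adherence for a synthesis result.
def check_connectivity(two_qubit_pairs, coupling_map, force_coupling_map):
    """two_qubit_pairs: connections used by the synthesized circuit;
    coupling_map: set of allowed connections."""
    bad = [p for p in two_qubit_pairs if p not in coupling_map]
    if bad and force_coupling_map:
        # Strict mode: refusing is better than silently violating connectivity.
        raise ValueError("synthesis cannot adhere to the coupling map: %s" % bad)
    return not bad  # best-effort mode just reports whether it conformed
```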
Thoughts/suggestions/comments are welcome!
Say I build a circuit using the `power` and `control` modifiers on gates. It is good that the circuit drawers support these modifiers natively.

The following issues exist:

- I can't directly control a `quantum_info.random_unitary` `U`; I have to build a `UnitaryGate` out of it first, i.e. quantum_info objects are not first-class citizens in circuits.
- `PhaseGate(θ).control(2)` yields an `MCPhaseGate` under the hood (that's why the circuit drawer knows to draw it like that). I'm not sure we need this `MCPhaseGate` class, since `PhaseGate(θ).control(2)` already has well-defined semantics. The synthesis routine should know that this is a diagonal gate and thus decompose it appropriately. The drawer should also know that controlled phase gates can be drawn symmetrically.
- The decomposition gets greedily computed and stored in the gate itself when you modify a gate via power or control or inverse (https://github.com/Qiskit/qiskit-terra/blob/bdedeae59a5341b6ea4f399fe4a4f03c051e2e13/qiskit/circuit/library/standard_gates/p.py#L109). Instead we can keep them as powered- or controlled- or inverted- gates and make the rest of Qiskit work with that (e.g. quantum_info should be able to simulate this circuit at this high level without caring about the decomposition).
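The deferred alternative to greedy computation could look like this sketch (illustrative names only; the real `Gate.definition` machinery in Qiskit differs): construction stays cheap, and the decomposition is synthesized only on first access and then cached.

```python
# Sketch: defer the decomposition instead of computing it in the constructor.
from functools import cached_property

class ControlledSketch:
    def __init__(self, base_name, num_ctrl_qubits):
        self.base_name = base_name
        self.num_ctrl_qubits = num_ctrl_qubits
        # No synthesis here: construction stays cheap.

    @cached_property
    def definition(self):
        # Expensive synthesis happens only when the definition is requested,
        # and the result is cached for subsequent accesses.
        return "synth(ctrl[%d]-%s)" % (self.num_ctrl_qubits, self.base_name)
```

Consumers that never need the decomposition (drawers, high-level simulators) would then never trigger it.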
Similarly, all gate optimization routines should work the same way even if the base gates happen to be controlled. For example, `optimize_1q_decomposition` should work when we have ctrl-U and ctrl-V back-to-back (I did this in an old PR but never merged it).