optimize_for_target_gateset returns a non-optimal circuit

gauravgyawali commented 9 months ago

Using cirq.optimize_for_target_gateset(circuit, gateset=cirq.CZTargetGateset()) returns a circuit with more moments. First, it doesn't combine the multiple single qubit rotations into a single PhasedXZGate. Second, it distributes the operations of an optimal moment to make it non-optimal

How to reproduce the issue


circ = cirq.Circuit([cirq.Moment(
     cirq.X(cirq.LineQubit(1)),
     cirq.X(cirq.LineQubit(2)),
     cirq.X(cirq.LineQubit(3)),
     cirq.X(cirq.LineQubit(6)),
 ),
 cirq.Moment(
     cirq.H(cirq.LineQubit(0)),
     cirq.H(cirq.LineQubit(1)),
     cirq.H(cirq.LineQubit(2)),
     cirq.H(cirq.LineQubit(3)),
     cirq.H(cirq.LineQubit(4)),
     cirq.H(cirq.LineQubit(5)),
     cirq.H(cirq.LineQubit(6)),
 ),
 cirq.Moment(
     cirq.H(cirq.LineQubit(1)),
     cirq.H(cirq.LineQubit(3)),
     cirq.H(cirq.LineQubit(5)),
 ),
 cirq.Moment(
     cirq.CZ(cirq.LineQubit(0), cirq.LineQubit(1)),
     cirq.CZ(cirq.LineQubit(2), cirq.LineQubit(3)),
     cirq.CZ(cirq.LineQubit(4), cirq.LineQubit(5)),
 ),
 cirq.Moment(
     cirq.CZ(cirq.LineQubit(2), cirq.LineQubit(1)),
     cirq.CZ(cirq.LineQubit(4), cirq.LineQubit(3)),
     cirq.CZ(cirq.LineQubit(6), cirq.LineQubit(5)))])
circ
print(circ)
print("number of moments = ", len(circ),"\n")
circ_opt = cirq.optimize_for_target_gateset(circuit=circ,gateset=gateset)
print(circ_opt)
print("number of moments = ", len(circ_opt))

0: ───────H───────@───────
                  │
1: ───X───H───H───@───@───
                      │
2: ───X───H───────@───@───
                  │
3: ───X───H───H───@───@───
                      │
4: ───────H───────@───@───
                  │
5: ───────H───H───@───@───
                      │
6: ───X───H───────────@───
number of moments =  5 

0: ───PhXZ(a=-0.5,x=0.5,z=-1)────────────────────────────@────────────────────────────────────────────────────
                                                         │
1: ───PhXZ(a=-0.5,x=0.5,z=0)────PhXZ(a=0.5,x=-0.5,z=1)───@───@────────────────────────────────────────────────
                                                             │
2: ───PhXZ(a=-0.5,x=0.5,z=0)─────────────────────────────@───@────────────────────────────────────────────────
                                                         │
3: ───PhXZ(a=-0.5,x=0.5,z=0)────PhXZ(a=0.5,x=-0.5,z=1)───@───@────────────────────────────────────────────────
                                                             │
4: ───PhXZ(a=-0.5,x=0.5,z=-1)───@────────────────────────────@────────────────────────────────────────────────
                                │
5: ─────────────────────────────@─────────────────────────────────────────────────────────────────────────@───
                                                                                                          │
6: ──────────────────────────────────────────────────────────PhXZ(a=0,x=1,z=0)───PhXZ(a=0.5,x=-0.5,z=1)───@───
number of moments =  6

I find that using cirq.merge_single_qubit_moments_to_phxz(circ) first fixes the issue.

Cirq version 1.2.0.dev20230628175157

eliottrosenberg commented 9 months ago

+1 There should be some check that it isn't increasing the number of moments.

tanujkhattar commented 9 months ago

Explanation of what's happening in this example

So, the idea behind optimize_for_target_gateset transformer is to provide users with a way to bundle a bunch of transformers that they wish to run, based on the specific context, in order to optimize circuits for a specific target gateset. The CompilationGateset class provides a way for users to specify this group of transformers and the optimize_for_target_gateset method calls the transformers from this group one after the other.

In this specific case, the "group" transformers to run is encapsulated in the CZTargetGateset, and the list is as follows:

cirq.expand_composite - Expands composite gates acting on more than 2 qubits (this is not relevant for this issue and thus has not effect). After step-1, all operations in the circuit are expected to be 1/2 qubit gates.

cirq.merge_k_qubit_unitaries - Merges all "mergeable" sets of unitaries that act on 1 and 2 qubits. This step results in the following circuit for the specific use case:


  [ 0: ───H───────────@─── ]
0: ───[                   │    ]──────────────────────────────────────────────────────────────────────────────────────────────────────
  [ 1: ───X───H───H───@─── ]['_default_merged_k_qubit_unitaries']
  │
  │                                                                 [ 1: ───@─── ]
1: ───#2────────────────────────────────────────────────────────────────[       │    ]────────────────────────────────────────────────
                                                                    [ 2: ───@─── ]['_default_merged_k_qubit_unitaries']
                                                                    │
  [ 2: ───X───H───────@─── ]                                        │
2: ───[                   │    ]────────────────────────────────────────#2────────────────────────────────────────────────────────────
  [ 3: ───X───H───H───@─── ]['_default_merged_k_qubit_unitaries']
  │
  │                                                                 [ 3: ───@─── ]
3: ───#2────────────────────────────────────────────────────────────────[       │    ]────────────────────────────────────────────────
                                                                    [ 4: ───@─── ]['_default_merged_k_qubit_unitaries']
                                                                    │
  [ 4: ───H───────@─── ]                                            │
4: ───[               │    ]────────────────────────────────────────────#2────────────────────────────────────────────────────────────
  [ 5: ───H───H───@─── ]['_default_merged_k_qubit_unitaries']
  │
  │                                                                 [ 5: ───────────@─── ]
5: ───#2────────────────────────────────────────────────────────────────[               │    ]────────────────────────────────────────
                                                                    [ 6: ───X───H───@─── ]['_default_merged_k_qubit_unitaries']
                                                                    │
6: ─────────────────────────────────────────────────────────────────────#2────────────────────────────────────────────────────────────


As you can see, this is not the desired behavior for this use case because ideally, we'd like to merge only the moments with single qubit gates and leve alone the two qubit gate moments (that already contain CZs and thus don't need to be decomposed at all). 

Steps 1 & 2 above are part of "preprocessing" transformers for the `CZTargetGateset`. Once these two steps are executed, we know the circuit contains only 1/2q operations and each of these can potentially represent a connected component of 1/2q operations from the original circuit. Step-3 decomposes these 1/2q operaitons using a KAK based analytical decomposition into layers of 1 & 2q gates. 

3. In this step, we decompose each of the connected components into alternating layers of 1q/2q gates using an analytical decomposition. If the new optree obtained via an analytical decomposition is not strictly better (i.e. uses lesser number of 2q operations) then we just fallback to using the original optree (in this case, we just expand each merged connected component back to it's original form, but this leads to destruction of the moment structure because x & h on qubit 6 get merged with CZ on the last layer).  

After step-3, all operations should be converted to the target gateset but hopefully with a shorter depth (this is best effort, and clearly not happening here). We then run post-processing transformers to allow obvious cleanups like

4. `cirq.merge_single_qubit_moments_to_phxz` - to merge adjacent single moments containing only single qubit gates into a single moment. This does merge the 1st & 2nd moment in our case but the 3rd moment is no more a single qubit moment now so it doesn't work as expected. 

5. `cirq.drop_negligible_operations` and `cirq.drop_empty_moments` - to drop any redundant operations / moments from the circuit 

--------------------- 

## Discussion on potential workarounds to fix the behavior 

1. The right way to fix the behavior for the specific use case is to implement your own version of `cirq.CZTargetGateset` and specify a list of `preprocessing_transformers` and `postprocessing_transformers` which makes sense for your use case. For example, if you know you always want your 2q layers to be preserved but mainly want adjacent single qubit layers to be merged then just specify `cirq.merge_single_qubit_moments_to_phxz` as a preprocessing transformer and the output should be as expected. For example: 
```python
import cirq
from typing import List

class CustomCZTargetGateset(cirq.CZTargetGateset):
    @property
    def preprocess_transformers(self) -> List['cirq.TRANSFORMER']:
        """List of transformers which should be run before decomposing individual operations."""
        return [cirq.merge_single_qubit_moments_to_phxz]

    @property
    def postprocess_transformers(self) -> List['cirq.TRANSFORMER']:
        """List of transformers which should be run after decomposing individual operations."""
        return [
            cirq.drop_negligible_operations,
            cirq.drop_empty_moments,
        ]
cirq.optimize_for_target_gateset(circuit=circ,gateset=CustomCZTargetGateset())

Results in the following ouptut as expected.

0: ───PhXZ(a=-0.5,x=0.5,z=-1)───@───────
                                │
1: ───PhXZ(a=-1,x=1,z=0)────────@───@───
                                    │
2: ───PhXZ(a=-0.5,x=0.5,z=0)────@───@───
                                │
3: ───PhXZ(a=-1,x=1,z=0)────────@───@───
                                    │
4: ───PhXZ(a=-0.5,x=0.5,z=-1)───@───@───
                                │
5: ─────────────────────────────@───@───
                                    │
6: ───PhXZ(a=-0.5,x=0.5,z=0)────────@───

This should be the preferred approach because the goal of the entire Cirq transformers infrastructure was to provide more control to users for their specific compilation requirements since it's hard to capture all the corner cases as part of a generic compiler (preserve moment structure vs reduce depth etc.)

If you think there's no simple structure in your example and you'd like something more generic that works well for most cases, we can find a way to fix the behavior of the default CZTargetGateset maybe by adding a repetitions parameter. But I really don't think we should destroy the moment structure by default as we are doing in https://github.com/quantumlib/Cirq/pull/6426#discussion_r1468154727

@eliottrosenberg @gauravgyawali @NoureldinYosri Hope this provides more context on the intention behind the original design. Let me know your thoughts.

eliottrosenberg commented 9 months ago

What if we just changed cirq.CZTargetGateset to run merge_single_qubit_moments_to_phxz before merge_k_qubit_unitaries? Would that improve the resullt for this case?

gauravgyawali commented 9 months ago

I think it would be helpful from the perspective of device implementation to be able to optimize the number of moments by calling this function. The optimized moment structure can make a big difference in the quality of data. Perhaps, we can have a keyword that lets you turn off the default moment structure preservation?

NoureldinYosri commented 9 months ago

I think two parameters should be introduced max_num_passes and preserve_moment_structure: bool=True together they will make the method for flexiable to user needs.

ikd-sci commented 9 months ago

Somewhat related -- a simple circuit like this:

circuit = cirq.Circuit([
     cirq.CNOT(cirq.LineQubit(0), cirq.LineQubit(1)),
     cirq.CNOT(cirq.LineQubit(1), cirq.LineQubit(2))])
print(circuit)
gateset = cirq.CZTargetGateset(allow_partial_czs=False)
circuit2 = cirq.optimize_for_target_gateset(circuit, gateset=gateset)
print(circuit2)
print(circuit2.moments)

Returns cirq.CZ**-1.0 in the returned moments:

0: ───@───────
      │
1: ───X───@───
          │
2: ───────X───
0: ────────────────────────────@────────────────────────────────────────────────────────
                               │
1: ───PhXZ(a=0.5,x=-0.5,z=0)───@───PhXZ(a=0.5,x=0.5,z=0)────@───────────────────────────
                                                            │
2: ────────────────────────────────PhXZ(a=0.5,x=-0.5,z=0)───@───PhXZ(a=0.5,x=0.5,z=0)───
[cirq.Moment(
    cirq.PhasedXZGate(axis_phase_exponent=0.5, x_exponent=-0.5, z_exponent=0.0).on(cirq.LineQubit(1)),
), cirq.Moment(
    (cirq.CZ**-1.0).on(cirq.LineQubit(0), cirq.LineQubit(1)),
), cirq.Moment(
    cirq.PhasedXZGate(axis_phase_exponent=0.5, x_exponent=0.5, z_exponent=0.0).on(cirq.LineQubit(1)),
    cirq.PhasedXZGate(axis_phase_exponent=0.5, x_exponent=-0.5, z_exponent=0.0).on(cirq.LineQubit(2)),
), cirq.Moment(
    (cirq.CZ**-1.0).on(cirq.LineQubit(1), cirq.LineQubit(2)),
), cirq.Moment(
    cirq.PhasedXZGate(axis_phase_exponent=0.5, x_exponent=0.5, z_exponent=0.0).on(cirq.LineQubit(2)),
)]

NoureldinYosri commented 9 months ago

@ikd-sci ~~I don't think that is a CZ**-1.~~ The method decomposes CX(0, 1) as H(1) CZ(0, 1) H(1).

update: I was looking at the diagram rather than the moment. The decomposer should have used CZ rather than CZ**-1 so I will findout why it didn't.

NoureldinYosri commented 9 months ago

@tanujkhattar @eliottrosenberg @gauravgyawali I updated https://github.com/quantumlib/Cirq/pull/6426 to have an argument for preserving the moment structure. The desired result for the circuit in the issue is then obtained by calling the method with preserve_moment_structure=False and max_num_passes with a value >= 2 or None.

Alternatively we can discard the PR and implement a custom gateset as @tanujkhattar suggested. however I do think that at least the max_num_passes should be added since no matter what preprocessors and postprocessors we use, any sufficiently large/complex circuit will benifit from running the optimizer a couple of times

tanujkhattar commented 9 months ago

@NoureldinYosri IMO, it's better to add circuit = circuits.Circuit(op for op in circuit.all_operations()) as an optional postprocessing transformer than adding it as a flag preserve_moment_structure.

Regarding max_num_passes, that sounds like a reasonable addition with a default max_num_passes=1

eliottrosenberg commented 9 months ago

@ikd-sci I don't think that is a CZ**-1. The method decomposes CX(0, 1) as H(1) CZ(0, 1) H(1).

@NoureldinYosri I just tried Ilya's example, and indeed, it produced CZ**-1. Could you please look into it? Thanks.

NoureldinYosri commented 9 months ago

@eliottrosenberg @ikd-sci sorry, I was looking at the diagram rather than the moment. The decomposer should have used CZ rather than CZ**-1 so I will findout why it didn't.

NoureldinYosri commented 9 months ago

looking deeper into why it uses CZ**-1 rather than CZ. it looks like that initially it uses CZ as expected and produces the circuit

 Z**0.75(q(0))
 X**-0.25(q(1))
 X**0.5000000000000001(q(0))
 Y**-0.5(q(1))
 S**-1(q(0))
 Y**-0.5(q(0))
 CZ(q(0), q(1))
 S**-1(q(0))
 S**-1(q(1))
 Y**0.5(q(0))
 Y**0.5(q(1))
 Y**0.5(q(0))
 X**-0.24999999999999997(q(1))
 Z**-0.75(q(0))

but because we compress operations clean_operations=True it calls several other methods to compress the circuit. at one point it has the circuit (which still has CZ)

 PhX(-0.5)**0.5(q(1))
 PhX(0.375)(q(0))
 T**-1(q(1))
 CZ(q(0), q(1))
 PhX(-0.75)**0.5(q(1))
 PhX(-0.625)(q(0))
 Z**-0.75(q(1))

on which it calls cirq.transformers.eject_phased_paulis.eject_phased_paulis which result in

 PhX(-0.5)**0.5(q(1))
 T**-1(q(1))
 Z(q(1))
 CZ**-1.0(q(0), q(1))
 PhX(-0.75)**0.5(q(1))
 Z**-0.75(q(1))

I'm not sure whether this is a bug or a byproduct of the equivalence between CZ and CZ**-1. I filed https://github.com/quantumlib/Cirq/issues/6428 to track invistigating that.

NoureldinYosri commented 9 months ago

@gauravgyawali @ikd-sci @eliottrosenberg After the recent PRs, the desired behaviour for the circuit in the desription can be optained by

>>> gateset = cirq.CZTargetGateset(preserve_moment_structure=False)
>>> print(cirq.optimize_for_target_gateset(circuit=circ, gateset=gateset, max_num_passes=2))
0: ───PhXZ(a=0.5,x=-0.5,z=1)───@───────
                               │
1: ───PhXZ(a=-1,x=1,z=0)───────@───@───
                                   │
2: ───PhXZ(a=-0.5,x=0.5,z=0)───@───@───
                               │
3: ───PhXZ(a=-1,x=1,z=0)───────@───@───
                                   │
4: ───PhXZ(a=0.5,x=-0.5,z=1)───@───@───
                               │
5: ────────────────────────────@───@───
                                   │
6: ───PhXZ(a=-0.5,x=0.5,z=0)───────@───

Also the issue in https://github.com/quantumlib/Cirq/issues/6422#issuecomment-1913798978 is now fixed

>>> circuit = cirq.Circuit([
...      cirq.CNOT(cirq.LineQubit(0), cirq.LineQubit(1)),
...      cirq.CNOT(cirq.LineQubit(1), cirq.LineQubit(2))])
>>> print(circuit)
0: ───@───────
      │
1: ───X───@───
          │
2: ───────X───
>>> gateset = cirq.CZTargetGateset(allow_partial_czs=False)
>>> circuit2 = cirq.optimize_for_target_gateset(circuit, gateset=gateset)
cuit2.moments)
>>> print(circuit2)
0: ────────────────────────────@────────────────────────────────────────────────────────
                               │
1: ───PhXZ(a=0.5,x=-0.5,z=0)───@───PhXZ(a=0.5,x=0.5,z=0)────@───────────────────────────
                                                            │
2: ────────────────────────────────PhXZ(a=0.5,x=-0.5,z=0)───@───PhXZ(a=0.5,x=0.5,z=0)───
>>> print(circuit2.moments)
[cirq.Moment(
    cirq.PhasedXZGate(axis_phase_exponent=0.4999999999999998, x_exponent=-0.5, z_exponent=0.0).on(cirq.LineQubit(1)),
), cirq.Moment(
    cirq.CZ(cirq.LineQubit(0), cirq.LineQubit(1)),
), cirq.Moment(
    cirq.PhasedXZGate(axis_phase_exponent=0.4999999999999998, x_exponent=0.5, z_exponent=0.0).on(cirq.LineQubit(1)),
    cirq.PhasedXZGate(axis_phase_exponent=0.4999999999999998, x_exponent=-0.5, z_exponent=0.0).on(cirq.LineQubit(2)),
), cirq.Moment(
    cirq.CZ(cirq.LineQubit(1), cirq.LineQubit(2)),
), cirq.Moment(
    cirq.PhasedXZGate(axis_phase_exponent=0.4999999999999998, x_exponent=0.5, z_exponent=0.0).on(cirq.LineQubit(2)),
)]

eliottrosenberg commented 9 months ago

Thank you, Nour!!

quantumlib / Cirq

optimize_for_target_gateset returns a non-optimal circuit #6422

Explanation of what's happening in this example