Revise definitions for gate decomposition

wvlothuizen commented 4 years ago

At DCL, we decompose CZs into two single qubit flux pulses:

"cz q0 q2": ["sf_cz_se q0", "sf_cz_nw q2"],

These sf_cz gates get scheduled separately, and thus do not always coincide in time, which is a requirement. It would seem natural to add wait statements around the sf_cz gates, but that is not supported ("Error: custom instruction not found for 'wait'").

jvansomeren commented 4 years ago

Hi Wouter,

None of those proposals is a solution.

Introducing a wait also influences the scheduling of other gates, while the only objective is to keep those two together.

The only way to do this is to decompose after scheduling, i.e. the scheduler knows about the resources both instructions take and assumes both are scheduled in the same cycle.

Best,

Hans


jvansomeren commented 4 years ago

Hi Wouter,

More on this subject in addition to my previous mail:

In tests/test_wait.py you'll find examples of the use of wait that (should) work.

Instead of building all kinds of decomposing passes with built-in decomposition descriptions, we could make one general decomposer that is controlled by the platform configuration file, exactly as the two decomposers inside the mapper are (the "making real" and "making primitive" decomposers). See the new mapper documentation in enh/mapperdoc-294.

A general decomposer would take a suffix as a parameter and scan the circuit from start to end. For each gate in the input circuit/bundles, it would append that suffix to the gate name and try to create the resulting gate by looking it up in the configuration file. If successful, the result replaces the original, also when it is a decomposition; otherwise, the original is kept. With the suffix "_cz_decompose", the configuration rule would be:

"cz_cz_decompose q0 q2": ["sf_cz_se q0", "sf_cz_nw q2"]
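The suffix lookup described above could be sketched roughly as follows. This is a minimal illustration, not OpenQL code: the `RULES` table, the `decompose` function, and the representation of gates as plain strings are all assumptions, and only exact operand matches are handled (no `%0`/`%1` parameters).

```python
from typing import Dict, List

# Hypothetical rule table, shaped like a gate decomposition section of a
# platform configuration file. Keys are "<gate><suffix> <operands>".
RULES: Dict[str, List[str]] = {
    "cz_cz_decompose q0 q2": ["sf_cz_se q0", "sf_cz_nw q2"],
}


def decompose(circuit: List[str], rules: Dict[str, List[str]], suffix: str) -> List[str]:
    """Replace each gate that has a '<name><suffix>' rule; keep all others."""
    result: List[str] = []
    for gate in circuit:
        name, _, operands = gate.partition(" ")
        key = f"{name}{suffix} {operands}".strip()
        # Fall back to the original gate when no rule matches.
        result.extend(rules.get(key, [gate]))
    return result


circuit = ["x q0", "cz q0 q2", "measure q2"]
print(decompose(circuit, RULES, "_cz_decompose"))
```

Running the same function with a different suffix leaves the circuit untouched unless the configuration contains rules for that suffix, which is what would let one pass implementation serve several decomposition stages.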

Best,

Hans


wvlothuizen commented 4 years ago

Decomposing after scheduling could maybe help (once implemented), but then the scheduler wouldn't know the duration of the cz (unless you would also define a gate for cz, which sounds counterintuitive). Still, I think the current waits could do the job, if they could be used in gate decomposition.

jvansomeren commented 4 years ago

Hi Wouter,

I think I understand what you want to do with waits, something like:

cz2 %0 %1 : "wait %0 %1 0" , "cz1 %0", "cz1 %1", "wait %0 %1 0" ;

The 0 in the wait makes it a barrier that takes no time.

This works in the prescheduler, which only works with dependences, but it doesn't work with the rcscheduler, i.e. when resource constraints are taken into account. The resource constraint system takes the gates one by one, independently of each other. So, when one cz1 has been scheduled by allocating its resources, the next cz1 is scheduled in a next round, hopefully in the same processor cycle, in parallel to the first cz1. But when the resources required by that second cz1 are not available, it is delayed until they are, and the first cz1 is not rescheduled as a consequence. There is no knowledge that these two MUST be in parallel. The solution with waits above is necessary but not sufficient.

Every solution to this must have knowledge that the full composition of cz2 above must be scheduled as a whole, or not at all.
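The failure mode can be reproduced with a toy model. The sketch below is not the rcscheduler; `rc_schedule`, the gate tuples, and the resource names `flux_a`/`flux_b` are hypothetical. It only shows that a greedy, gate-by-gate placement with resource constraints can separate the two cz1's even when zero-duration barriers surround them.

```python
from typing import Dict, List, Optional, Tuple

# Each gate: (name, qubits, resource or None, duration in cycles).
Gate = Tuple[str, Tuple[str, ...], Optional[str], int]


def rc_schedule(gates: List[Gate]) -> List[Tuple[str, int]]:
    """Greedy scheduler: place gates one by one at the earliest legal cycle."""
    qubit_free: Dict[str, int] = {}     # dependence constraint per qubit
    resource_free: Dict[str, int] = {}  # resource constraint per resource
    placed: List[Tuple[str, int]] = []
    for name, qubits, resource, duration in gates:
        candidates = [qubit_free.get(q, 0) for q in qubits]
        if resource is not None:
            candidates.append(resource_free.get(resource, 0))
        start = max(candidates)
        for q in qubits:
            qubit_free[q] = start + duration
        if resource is not None:
            resource_free[resource] = start + duration
        placed.append((f"{name} {' '.join(qubits)}", start))
    return placed


GATES: List[Gate] = [
    ("other", ("q3",), "flux_b", 2),        # keeps flux_b busy in cycles 0-1
    ("barrier", ("q0", "q2"), None, 0),     # zero-duration wait
    ("cz1", ("q0",), "flux_a", 2),
    ("cz1", ("q2",), "flux_b", 2),          # must wait for flux_b
    ("barrier", ("q0", "q2"), None, 0),
]

schedule = rc_schedule(GATES)
for gate, cycle in schedule:
    print(cycle, gate)
```

In this model the first cz1 starts in cycle 0 and the second in cycle 2: the barriers align the qubits before and after, but nothing forces the two halves into the same cycle once a resource is busy.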

The one I proposed, decomposing cz2 into cz1's after the rcscheduler, is one way. Just schedule cz2 as a unit, as a primitive gate, indeed with knowledge of its duration. This matches a conceptual model of the hardware: there is a cz2 gate, and under the hood it is decomposed into smaller ones.

This solution follows the pattern of the current post decomposer in the cc_light compiler. It is a very simple extension; in fact, a similar one has already been implemented: see the option cz_mode. In manual mode, one programs with waits, etc. and takes full responsibility; this may not work in the rcscheduler, see above. In auto mode, the post decomposer automatically decomposes each cz into some subgates, steered by the resource constraints that indicate which qubits need to be detuned. A new mode could be implemented, similar to auto, that would decompose into a set of cz1's. A complication is how to know into which cz1's to decompose: when it is one for each qubit, that is simple, but that doesn't take detuning into account. So support equivalent to the decomposition above is easy to implement.
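As a rough illustration of decomposing after scheduling, the sketch below expands an already scheduled cz into its sub-gates while keeping the parent's start cycle. The names `POST_RULES` and `post_decompose` and the `(cycle, gate)` representation are assumptions made for this example; they are not taken from the cc_light post decomposer.

```python
from typing import Dict, List, Tuple

# Hypothetical post-scheduling rules: each sub-gate inherits the parent's cycle.
POST_RULES: Dict[str, List[str]] = {
    "cz q0 q2": ["sf_cz_se q0", "sf_cz_nw q2"],
}


def post_decompose(schedule: List[Tuple[int, str]],
                   rules: Dict[str, List[str]]) -> List[Tuple[int, str]]:
    """Expand scheduled gates in place, keeping the start cycle of the parent."""
    expanded: List[Tuple[int, str]] = []
    for cycle, gate in schedule:
        for sub_gate in rules.get(gate, [gate]):
            expanded.append((cycle, sub_gate))
    return expanded


# A schedule in which the scheduler treated cz as one primitive two-qubit gate.
scheduled = [(0, "x q0"), (1, "cz q0 q2"), (3, "measure q2")]
for cycle, gate in post_decompose(scheduled, POST_RULES):
    print(cycle, gate)
```

Because the expansion happens after resources were allocated for the cz as a whole, both flux pulses end up in the same cycle by construction.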

Another solution is to implement a new kind of composition rule, guaranteeing that all primitives of the composition are scheduled as a whole, in a fixed time relation, or not at all. See the back-end documentation of the CoSy compiler generation system for how this could be done. There are several options there, but all of them are intrusive in the current system. I won't go into further detail here unless there is a need for this solution, i.e. when the cz is not the only case and there is a general need for such a facility. In general-purpose processors, an example is post/pre-increment addressing in conjunction with indexing and load/store.

Best,

Hans


wvlothuizen commented 4 years ago

The necessity for support of barrier/wait within gate decomposition arises from the 'single qubit flux' gates that were introduced to practically allow 2 qubit gates with the CC-light (and which are a bit of a hack).

The CC backend could easily be changed to allow specification of 2 (3) codewords for a 2 (3) qubit gate, by changing JSON field 'static_codeword_override' from a scalar to a vector. One could then write:

        "cz_se_nw": {
            "duration": 80,
            "matrix": [ [0.0,1.0], [1.0,0.0], [1.0,0.0], [0.0,0.0] ],
            "type": "flux",
            "cc_light_instr": "cz",
            "cc": {
                "ref_signal": "two-qubit-flux", 
                "static_codeword_override": [2,4]  // codeword 2 on SE and 4 on NW
            }
        },

and

        "cz_se_nw_park": {
            "duration": 80,
            "matrix": [ [0.0,1.0], [1.0,0.0], [1.0,0.0], [0.0,0.0] ],
            "type": "flux",
            "cc_light_instr": "cz",
            "cc": {
                "ref_signal": "three-qubit-flux", 
                "static_codeword_override": [2,4,5]  // codeword 2 on SE, 4 on NW, 5 on park
            }
        },

in conjunction with:

        "cz q0 q2": ["cz_se_nw q0,q2"],
        "cz q2 q0": ["cz_se_nw q0,q2"],
        "cz q2 q3": ["cz_sw_ne_park q2,q3,q4"],
        "cz q3 q2": ["cz_sw_ne_park q2,q3,q4"],

To the scheduler, these are just 2- and 3-qubit gates; the expansion to codewords is done in the backend.
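A backend expansion along these lines might look like the sketch below. It assumes the vector form of 'static_codeword_override' proposed here and pairs operands with codewords by position; `INSTRUCTIONS` and `expand_codewords` are hypothetical names, not part of the CC backend.

```python
from typing import Dict, List, Tuple

# Hypothetical excerpt of the instruction table, using the proposed vector form.
INSTRUCTIONS: Dict[str, Dict] = {
    "cz_se_nw": {"cc": {"static_codeword_override": [2, 4]}},
    "cz_se_nw_park": {"cc": {"static_codeword_override": [2, 4, 5]}},
}


def expand_codewords(gate: str, qubits: List[str]) -> List[Tuple[str, int]]:
    """Pair each operand with the codeword at the same position in the vector."""
    codewords = INSTRUCTIONS[gate]["cc"]["static_codeword_override"]
    if len(codewords) != len(qubits):
        raise ValueError(
            f"{gate}: {len(qubits)} operands but {len(codewords)} codewords")
    return list(zip(qubits, codewords))


print(expand_codewords("cz_se_nw", ["q0", "q2"]))
print(expand_codewords("cz_se_nw_park", ["q2", "q3", "q4"]))
```

The length check is there because a scalar-to-vector change makes a mismatch between operand count and codeword count a new kind of configuration error.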

I still think that a wait/barrier should be available in gate decomposition though; it should very much act as a C macro, IMHO.

jvansomeren commented 4 years ago

@wvlothuizen wrote: "Still, I think the current waits could do the job, if they could be used in gate decomposition", and: "I still think that a wait/barrier should be available in gate decomposition though, it should very much act as a C macro IMHO".

It was argued that this doesn't help for cz expansion. So, what is the use case? I didn't try it, but wouldn't defining a 0-duration custom gate with 2 qubit arguments, named barrier2, be allowed/supported in the current gate decomposition?

QuTech-Delft / OpenQL

Revise definitions for gate decomposition #303