quantum-compiler / quartz

The Quartz Quantum Compiler
Apache License 2.0
76 stars 19 forks source link

Incorrect optimization of multi-parameter gates #88

Closed hushaohan closed 1 year ago

hushaohan commented 1 year ago

It's unclear if/how much parameterized gates are supported -- the README does state

We do not support parameterized gates currently.

But parameterized gates (e.g., u1, u2, and u3) are implemented. However, I'm observing incorrect optimization results. For example, below is a minimal experiment to reproduce an optimization error:

To generate ECC:

The input circuit qasm to be optimized:

OPENQASM 2.0;
include "qelib1.inc";
qreg q[1];
u2(0, pi) q[0];
u2(0, pi) q[0];

where the two u2 gates should cancel each other. But the "optimized" circuit as produced by quartz is:

include "qelib1.inc";
qreg q[1];
u2(pi*1.000000,pi*1.000000) q[0];

which is mathematically incorrect.

zikun-li commented 1 year ago

Thanks for reporting the issue! Could you please post your code so that we can better trace the bug?

hushaohan commented 1 year ago

Thanks for following up, @zikun-li!

To reproduce, below please find my test code:

#include "gen_ecc_set.h"
#include "quartz/tasograph/tasograph.h"

using namespace quartz;

int main() {
  int q = 1, m = 4, n = 2;
  std::vector<GateType> gate_set{GateType::add, GateType::u2};
  std::string ecc_name = "test_";
  std::string ecc_fpath = ecc_name + "complete_ECC_set.json";
  gen_ecc_set(gate_set, ecc_name, true, true, q, m, n);

  const std::string qasm = R""""(
OPENQASM 2.0;
include "qelib1.inc";
qreg q[1];
u2(0, pi) q[0];
u2(0, pi) q[0];
  )"""";

  gate_set.insert(gate_set.end(), {
    GateType::input_qubit, GateType::input_param
  });
  Context ctx(gate_set);
  auto graph = Graph::from_qasm_str(&ctx, qasm);

  auto optimized_graph = graph->optimize(
    &ctx, ecc_fpath, "trivial_test", /*print_message=*/ true
  );

  std::cout << "\ninput circuit:" << std::endl;
  graph->to_qasm(true, false);
  std::cout << "\noutput circuit:" << std::endl;
  optimized_graph->to_qasm(true, false);
  return 0;
}

which produces the following output for me:

*** ch(,q=t) = 102
BFS: 1 representative DAGs to search with 0 gates.
Considering a total of 103 DAGs split into 103 hash values...
Processed 103 hash values that had only 1 circuit sequence, now processing the remaining 0 ones with 2 or more circuit sequences...
Start checking equivalence with different hash...
Solver invoked 0 times to find 0 equivalences with different hash, including 0 out of 0 possible equivalences under phase shift.
0 equivalences found in 0.022850895998999476 seconds (solver invoked 0 times for 103 DAGs with 103 different hash values and 0 potential equivalences), output 103 equivalence classes.
Json saved in 0.007132362996344455 seconds.
BFS: 102 representative DAGs to search with 1 gates.
Considering a total of 487 DAGs split into 298 hash values...
Processed 109 hash values that had only 1 circuit sequence, now processing the remaining 189 ones with 2 or more circuit sequences...
Start checking equivalence with different hash...
Solver invoked 3 times to find 3 equivalences with different hash, including 0 out of 0 possible equivalences under phase shift.
189 equivalences found in 0.8926885240070987 seconds (solver invoked 189 times for 487 DAGs with 298 different hash values and 189 potential equivalences), output 298 equivalence classes.
Json saved in 0.036345964996144176 seconds.
Representative set saved in 0.005 seconds.
Before ECC simplification: num_total_dags = 487, num_equivalence_classes = 295, #transformations test = 384
test generated. Running Time (s): 1.365
Pruning Time (s): 0.023
Verification Time (s): 1.309
Num_total_dags = 70, num_equivalence_classes = 35
*** Number of transformations of test = 70
Number of xfers: 70
greedy_optimize(): Number of xfers that reduce cost: 0
greedy_optimize(): cost optimized from 2 to 2
[trivial_test] Best cost: 1.000000      candidate number: 1     after 0.000 seconds.
[trivial_test] Best cost: 1.000000      candidate number: 0     after 0.000 seconds.

input circuit:
OPENQASM 2.0;
include "qelib1.inc";
qreg q[1];
u2(pi*0.000000,pi*1.000000) q[0];
u2(pi*0.000000,pi*1.000000) q[0];

output circuit:
OPENQASM 2.0;
include "qelib1.inc";
qreg q[1];
u2(pi*1.000000,pi*1.000000) q[0];

Please let me know if you need me to also post the detailed steps for compiling and running the test code.

zikun-li commented 1 year ago

Thanks for pointing out the issue, @hushaohan! We have fixed this issue in the commit 5f1ef67. Now the output of the optimization should be:

output circuit:
OPENQASM 2.0;
include "qelib1.inc";
qreg q[1];
u2(pi*0.000000,pi*1.000000) q[0];
u2(pi*0.000000,pi*1.000000) q[0];

As you can see, even if the bug is fixed, currently Quartz cannot cancel out the two u2(pi*0.000000,pi*1.000000) q[0];. The reason is that Quartz only generate symbolic transformation rules without constant parameters. For example, there is a rule generated by Quartz:

u2(p0, p1) q[0]; u2(p2, p3) q[0]; = u2(p3, p1) q[0]; u2(p2, p0) q[0];

where p0, p1, p2, p3 are symbolic parameters that can take any real value. However, Quartz currently does not support rules that work only when the parameters take specific constant values (e.g. pi). So currently we cannot combine two u2(pi*0.000000,pi*1.000000) q[0]; into identity.

Enabling the generator to generate transformation rules that involves constants is in our plan.

xumingkuan commented 1 year ago

We have fixed this issue in the commit 5f1ef67.

Is the bug caused by an incorrect application of rotation merging to unsupported gates (rotation merging does not support u2)?

zikun-li commented 1 year ago

We are not calling rotation merging here. The reason of this bug is that we have a pass that eliminate rotation gates whose rotation angles are equal to 2kpi based on an assumption that they are equivalent to identity. However, this assumption is only true for rx, ry, rz, u1. This bug is caused by mistakenly eliminating u2(0, 0) q[0];.

xumingkuan commented 1 year ago

I see... u2(0, 0) not equivalent to identity is kind of counterintuitive and Qiskit already deprecated U2. u3(0, 0, 0) is equivalent to identity.

hushaohan commented 1 year ago

I do want to mention that a similar test on u1 seems to work:

#include "gen_ecc_set.h"
#include "quartz/tasograph/tasograph.h"

using namespace quartz;

int main() {
  int q = 1, m = 2, n = 2;
  std::vector<GateType> gate_set{GateType::add, GateType::u1};
  std::string ecc_name = "test_";
  std::string ecc_fpath = ecc_name + "complete_ECC_set.json";
  gen_ecc_set(gate_set, ecc_name, true, true, q, m, n);

  const std::string qasm = R""""(
OPENQASM 2.0;
include "qelib1.inc";
qreg q[1];
u1(1) q[0];
u1(2) q[0];
  )"""";

  gate_set.insert(gate_set.end(), {
    GateType::input_qubit, GateType::input_param
  });
  Context ctx(gate_set);
  auto graph = Graph::from_qasm_str(&ctx, qasm);

  auto optimized_graph = graph->optimize(
    &ctx, ecc_fpath, "trivial_test", /*print_message=*/ true
  );

  std::cout << "\ninput circuit:" << std::endl;
  graph->to_qasm(true, false);
  std::cout << "\noutput circuit:" << std::endl;
  optimized_graph->to_qasm(true, false);
  return 0;
}

gives me

...

input circuit:
OPENQASM 2.0;
include "qelib1.inc";
qreg q[1];
u1(pi*0.31831) q[0];
u1(pi*0.63662) q[0];

output circuit:
OPENQASM 2.0;
include "qelib1.inc";
qreg q[1];
u1(pi*0.95493) q[0];

fwiw


EDIT: I suppose this is the result of a pass of rotation merging, as opposed to transformation?

zikun-li commented 1 year ago

Actually this is the result of applying transformation. Because there is a rule that u1(p0) q[0]; u1(p1) q[0]; equals to p2 = add(p0, p1); u1(p2) q[0];. You can view it as u1(p0) q[0]; u1(p1) q[0]; = u1(p0 + p1) q[0];. And in our current optimize API rotation_merging is not invoked.

zikun-li commented 1 year ago

@xumingkuan u3 is added in the latest commit.

hushaohan commented 1 year ago

Actually this is the result of applying transformation. Because there is a rule that u1(p0) q[0]; u1(p1) q[0]; equals to p2 = add(p0, p1); u1(p2) q[0];. You can view it as u1(p0) q[0]; u1(p1) q[0]; = u1(p0 + p1) q[0];. And in our current optimize API rotation_merging is not invoked.

I see, thanks for the explanation!

It would be great if this would also work for u2 or u3, or in general any customized gates with more than 1 parameter.

xumingkuan commented 1 year ago

It would be great if this would also work for u2 or u3

For u2 and u3, the equivalences are not as obvious as u1, and I believe we do support the equivalences (except for the ones using pi/2 here for now): https://quantumcomputing.stackexchange.com/questions/13480/optimize-chains-of-single-qubit-u1-u2-u3-gates-by-combining-them-into-a-single

hushaohan commented 1 year ago

It would be great if this would also work for u2 or u3

For u2 and u3, the equivalences are not as obvious as u1, and I believe we do support the equivalences (except for the ones using pi/2 here for now): https://quantumcomputing.stackexchange.com/questions/13480/optimize-chains-of-single-qubit-u1-u2-u3-gates-by-combining-them-into-a-single

Thanks for the pointer.

My end goal is actually to use quartz to optimize circuits of other gatesets (e.g. containing the PhasedX single-qubit two-param gate, and the ZZPhase 2-qubit single-param gate, as defined at https://cqcl.github.io/tket/pytket/api/optype.html) by supplying their corresponding matrix implementations in src/quartz/gate and src/python/verifier/gates.py. Is it currently possible?

Judging but the snippets below,

https://github.com/quantum-compiler/quartz/blob/9f88caaa4327904532b9a3929336bda27a8fd761/src/quartz/generator/generator.cpp#L331

https://github.com/quantum-compiler/quartz/blob/9f88caaa4327904532b9a3929336bda27a8fd761/src/quartz/generator/generator.cpp#L466-L467

I'm assuming at least parameterized 2-qubit gates are not supported as far as my goal is concerned?

hushaohan commented 1 year ago

Actually this is the result of applying transformation. Because there is a rule that u1(p0) q[0]; u1(p1) q[0]; equals to p2 = add(p0, p1); u1(p2) q[0];. You can view it as u1(p0) q[0]; u1(p1) q[0]; = u1(p0 + p1) q[0];. And in our current optimize API rotation_merging is not invoked.

I see, thanks for the explanation!

It would be great if this would also work for u2 or u3, or in general any customized gates with more than 1 parameter.

Btw, are these rules automatically discovered or manually added in advance? Where/how are they stored? TIA!

xumingkuan commented 1 year ago

I'm assuming at least parameterized 2-qubit gates are not supported as far as my goal is concerned?

It is not supported for now simply because we haven't used them in any benchmarks/experiments, and it would be easy to support them. Feel free to (open PR to) add more gates in src/quartz/gate and we will take care of the changes in generator.cpp.

hushaohan commented 1 year ago

I'm assuming at least parameterized 2-qubit gates are not supported as far as my goal is concerned?

It is not supported for now simply because we haven't used them in any benchmarks/experiments, and it would be easy to support them. Feel free to (open PR to) add more gates in src/quartz/gate and we will take care of the changes in generator.cpp.

Sounds great, thanks!

Btw, quartz already has cu1 and cp, both are 2-qubit and single-param gates as well.

xumingkuan commented 1 year ago

Actually this is the result of applying transformation. Because there is a rule that u1(p0) q[0]; u1(p1) q[0]; equals to p2 = add(p0, p1); u1(p2) q[0];. You can view it as u1(p0) q[0]; u1(p1) q[0]; = u1(p0 + p1) q[0];. And in our current optimize API rotation_merging is not invoked.

I see, thanks for the explanation! It would be great if this would also work for u2 or u3, or in general any customized gates with more than 1 parameter.

Btw, are these rules automatically discovered or manually added in advance? Where/how are they stored? TIA!

These are automatically discovered by Quartz (despite probably being manually found in the Stackexchange link). They are generated and stored in the ecc_fpath file in your code snippet. The Json file is grouped by the ECCs and inside the ECCs are circuits; there is some metadata at the beginning of each circuit and the following are the gates. For example, ["u2", ["Q0"],["Q0", "P2", "P1"]] means taking input qubit 0, parameter 2, parameter 1, and output qubit 0. (We may change the format of the Json file as we develop.)

xumingkuan commented 1 year ago

Btw, quartz already has cu1 and cp, both are 2-qubit and single-param gates as well.

Good point! I guess no one has used them to generate an ECC set yet..

hushaohan commented 1 year ago

Btw, quartz already has cu1 and cp, both are 2-qubit and single-param gates as well.

Good point! I guess no one has used them to generate an ECC set yet..

May I ask if you guys have any recent plans on adding support for these?

Also an unrelated side question: for the graph->optimize(...) call in my earlier snippet, is there a way to enable openmp parallelization for the heavy-lifting part? From my playing with it, seems only a single core is busy crunching numbers with the rest idle :)

xumingkuan commented 1 year ago

Btw, quartz already has cu1 and cp, both are 2-qubit and single-param gates as well.

Good point! I guess no one has used them to generate an ECC set yet..

May I ask if you guys have any recent plans on adding support for these?

Sure, should be supported in my latest PR.

Also an unrelated side question: for the graph->optimize(...) call in my earlier snippet, is there a way to enable openmp parallelization for the heavy-lifting part? From my playing with it, seems only a single core is busy crunching numbers with the rest idle :)

It is indeed possible but non-trivial to make use of openmp. Because of the randomness of the search, what we were doing was simply running the same optimization multiple times to compute the best (or average) result, and this is naturally parallelizable.

I believe we're all busy with things like paper deadlines within the following month and we don't plan to support openmp recently. If you feel like openmp is useful, welcome to open PRs to contribute!