WrathfulSpatula commented 4 years ago

It's long been a personal goal to implement improvements to QUnit that would make it fully fundamentally efficient on a Clifford algebra, if no other gates are used. The system of singly-controlled phase/"inversion" gate buffers, basis transformation between |0>/|1> and |+>/|-> for Hadamard gate application, and the commutation rules between the two of these go a very long way toward this goal.

When an H gate is applied in QUnit, certain phase/inversion buffers, representing gates like CZ and CNOT, will be "flushed" and applied due to limitations like maintaining order of application of non-commutative buffers, and the difficulties in commuting over control bits, as opposed to target bits. I don't think we're far from fundamental equivalence to Gottesman-Knill "stabilizer" simulators, if we can avoid buffer flushes such as for commutation over control bits and non-commutativity of simultaneously buffered gates, (such as maybe with a strict ordering of gate buffer application).

WrathfulSpatula commented 4 years ago

I'm considering implementing stabilizer simulation as a Qrack "layer." I'm going by Aaranson's CHP paper and this Quantum Computing Stack Exchange question on converting between stabilizer and state vector representations.

It might be that a freshly initialized QUnit could easily and efficiently construct a state vector from a stabilizer array representation, based on QUnit's own optimization in terms of separated qubit substates. If I'm understanding correctly, stabilizer simulation can be applied for any sequence of Clifford gates from any given initialization state of the qubit register. Therefore, maybe we could have a stabilizer layer similar to the deprecated "QFusion," which binds to any underlying QInterfacePtr, except commonly on top of QUnit instead of underneath it. (Whenever we encounter a non-Clifford gate, we flush the stabilizer layer to QUnit and reset the stabilizer layer.)

WrathfulSpatula commented 4 years ago

I'm bumping this for my own priority. I'm debating whether the layer should go over or under QUnit, but that's not the primary concern, (and it should probably function with either layer order, regardless of which makes more sense).

We need a new route forward for optimization, and a stabilizer simulation that's convertible to state vector (or a sequence of gates to reconstruct in an arbitrary formalism) seems like a solid option. Even if the convertibility fails to be practical for general HPC use, I'd really like Qrack to support some form of stabilizer simulation even "off on its own," at this point, as this seems to be becoming accepted as a standard QC simulator framework feature, and I think it's clearly an important use case and formalism.

I don't want a separate API interface for stabilizer simulation, though. Even if the conversion is impractically slow and efficient, our stabilizer layer should transparently convert to state vector or QUnit formalism, to support the full QInterface API where exception from stabilizer simulation is necessary. While I am aware that research has been published on handling stabilizer formalism exceptions within stabilizer formalism, we'll first focus of conversion to another general formalism, and we might approach the other method of exception later.

My earlier comment on this issue was confused, about converting between formalisms. However, I'll figure it out now. With nothing but benchmark work on the horizon, with the current features of QUnit seeming to approach their limits without new mathematical design concepts, this is just a safe development priority to adopt where no others are obvious, for now.

WrathfulSpatula commented 4 years ago

There's a complexity theory nuance, here, and I'm going to have to put on my thinking cap on for it. Aaronson and Gottesman write in Section I of Improved Simulation of Stabilizer Circuits:

Section V proves that the problem of simulating stabilizer circuits is complete for the classical complexity class ⊕L. Informally, this means that any stabilizer circuit can be simulated using CNOT gates alone; the availability of Hadamard and phase gates provides at most a polynomial advantage. This result removes some of the mystery about the Gottesman-Knill theorem by showing that stabilizer circuits are unlikely to be capable even of universal classical computation.

Qrack::QUnit is already complete for a strict super set of ⊕L from purely the simulated quantum CNOT gate, though. In fact, I can trivially assert, for anyone familiar with QUnit and VM6502Q, that QUnit is complete for at least a strict superset of ⊕L with CCNOT, (i.e.: containing "efficient" classical computation generally, which any reader is free to easily check). However, QUnit is not generally efficient in the naive application of H with CNOT, (specifically at least an H on a qubit that then becomes a CNOT control bit).

This suggests that there's some transformation whereby QUnit can already basically efficiently simulate the Clifford set gates, probably by some machination of qubit basis. (I think the authors I quote might mean, directly or by inverted corollary, that I could therefore use QUnit to efficiently simulate a quantum computer simulating CHP, which is not immediately useful to us, but it's suggestive of working the other way around. In fact, see also: VM6502Q.) If I start with Section V of that paper, maybe I can find a basis transformation whereby QUnit can subsume stabilizer formalism efficiency entirely, it sounds like.

WrathfulSpatula commented 4 years ago

I'm brainstorming, but I've added the test_quantum_triviality test, which contains a set of gates which QUnit can simulate efficiently, not including the H gate. However, QUnit can simulate many small cases with H gates efficiently, such as acted before the targets of CNOTs only. I'm diagramming, and I think the only case of singly-controlled X and Z gates we don't cover efficiently boils down to solely H before CNOT on the "wire" of the control gate. If we tokenize this case, as we've done with the H gate itself, we might break the wall to Clifford gate efficiency. I attempted this with "Bell basis," but that work needs revamping.

(QUnit can even simulate large circuits with many H gates efficiently, so long as they avoid this specific case before an H gate is cancelled with a subsequent H gate on a CNOT control wire, as H gates have been "tokenized." Hopefully, this seems to be coming together, between Bell basis, Gottesman-Knill, and QUnit.)

WrathfulSpatula commented 4 years ago

The Bell basis work attempts to handle this one elementary inefficient case, but it's spiraling into complication for low returns, where there might be a simpler solution: I could implement a circuit optimizer for QUnit based around CNOT, CZ, and H.

Looking at the diagrams of efficient and inefficient "atomic" Clifford gate sequences for QUnit that I have scrawled out, it seems obvious that QUnit could avoid much representational entanglement if it could look ahead even just a gate or two in sequence, within the Clifford set, and transpile. General transpilation is a "hard" problem, but these cases I'm thinking of likely aren't.

The transpiler would be geared specifically around QUnit's proclivities for efficient and inefficient gate sequences, so it doesn't make sense to open this up as a general layer, yet, but it's two birds with one stone, if we can implement efficient stabilizer circuits with a QUnit local gate optimizer.

WrathfulSpatula commented 4 years ago

473 allows us to handle Bell basis states, in the developing H and singly-controlled "inversion" and phase buffer representation of QUnit. This renders a lot of the above moot. With the avenues that PR opens to us, we either get lucky, or we don't, with GHZ states: a common "control" qubit will not be flushed in acting CNOT on a Bell state, whereas the Bell "target" becoming a subsequent CNOT control will flush buffers.

I put "control" and "target" in quotes, because a Bell state prepared as H(0); CNOT(0, 1); is symmetric between target and control, hence there is no difference if either bit of the pair acts as subsequent control or target. The QUnit buffering system does not yet recognize this, but making it do so is the next priority.

I was hung up for months on fixing that edge case. There was forward motion in other areas of development, like QHybrid, but I thought I'd reached the culmination of what this buffering system in QUnit was going to get us. I see that the road ahead is open, now, and it certainly leads us in the direction of full Clifford set gate efficiency. If and when we become blocked again, I will divert to stabilizer tableau implementation, if necessary.

WrathfulSpatula commented 4 years ago

I'm reading simpler digests of stabilizer formalism, including in Nielson and Chuang and this pedagogical paper. It's helping me develop an intuition for "what stabilizer formalism really does and means," in a way that's simple to explain.

The state is composed of generators for each qubit in the state, the eigenvalues of which are the observables of the state under measurement. For example, a generator Z_i implies the positive (I think) eigenvalue of the Pauli Z operator for the i-th bit. To act a gate on this, we conjugate the generator with the gate. For example, H on the i-th bit takes Z_i -> H_i Z_i conj(H_i) = X_i, meaning that a Z basis eigenstate becomes an X basis eigenstate, (with sign).

Measurements from states generated by the Clifford set seem to always result in either a 100% deterministic or 50/50 probability result in computational basis. We might be able to leverage this fact in QUnit, since we can track which "sub-units" are acted on only by Clifford set gets easily, and this seems to incidentally gives us a performance hint as to the "separability problem." That is, if a sub-unit has evolved from gates entirely in the Clifford set, even if it becomes entangled in QUnit representation, we have a good hint that measurement might collapse the whole sub-state into separable eigenstates, like the Bell and GHZ states. I'm going to try to optimize measurement in QUnit this way, first.

WrathfulSpatula commented 4 years ago

50/50 probability in all nondeterministic measurement cases implies something like separability under transformation between |0>/|1> and |+>/|-> bases in QUnit, but that obviously doesn't quite work. However, it might work in "Fourier basis," under the QFT.

First, I will add the Clifford set sub-unit flag. Then, I will experiment with using this flag for an internal Fourier basis transformation.

WrathfulSpatula commented 4 years ago

Proactively, basically any case that maintains isClifford == true, but entangles a qubit, seems to set probability to 50/50 in the current basis, before further Clifford or non-Clifford operations. (This is somewhat optimistic, but only if QUnit has not fully covered the specifically non-Bell-basis separability that it can.) Subsequent singly-controlled Clifford gates (or H) can take a bit from the 50/50 state to |0>/|1> eigenstate, so we might proactively check this so long as the isClifford flag remains true.

WrathfulSpatula commented 4 years ago

It's pretty simple to argue that, so long as isClifford == true, (and correctly implemented,) in total absence of information (or, rather, equal chance of all available stabilizer states for the length of bits,) half of all possible stabilizer states have computational basis probability 0.5, a quarter have exactly 0, and a quarter have exactly 1. In QUnit, 0 and 1 probabilities in computational basis are definitely separable, (as is exactly |+>/|->, but this can be a costly check). Basically, the generators of every individual qubit are (+/-) X, Y, Z, and I. If the computational basis is Z, then I and Z generators will be separable while X and Y are 50/50, and vice versa if the computational basis is X.

WrathfulSpatula commented 4 years ago

The above worked out, per #475. Yes, Clifford basis non-separability in computation basis is always 50/50 between basis states.

I can't yet beat O(n) to O(n^2) on stabilizer subroutines with QUnit. However. #475 gives us all the necessary superstructure to use stabilizer sub-engines in QUnit. We'll probably be leaning on the n^2 side, with large numbers of probability checks, but I'm thinking that's just the fastest way forward to performance.

WrathfulSpatula commented 4 years ago

Based on what we've learned from QUnit and it's program of "Schmidt decomposition," I can say definitively that 100% probability for either |0> or |1> in computation basis requires a separable state. Amazingly, I think that if we track local +/- sign bits per generator, only updating these local sign bits in response to local gates, then we might beat Aaronson's tableau, in that commuting generators naturally stay separable and with definitive information about sign of the eigenvector. In cases where the distribution of sign is ambiguous between two generators in the same gate, either one can accept the sign difference. I'll be able to test this tonight.

WrathfulSpatula commented 4 years ago

I'm hacking this together, but I've had an opportunity to play with it more. In acting a CNOT gate, we could always flip the signs on both bits, but the above might work if we additionally require "no double sign-flips." Jury is still out, but tableau works as a fall back. We don't need even the full apparati of typical stabilizer simulators.

Again, flipping on either seems to be okay, when sign flip happens.

WrathfulSpatula commented 4 years ago

Okay, throw the earlier rules in the trash. However, the current logic table in the stabilizer branch seems to be right except for one case: we can't distinguish between CNOT on |+0> and |-0> with just "local sign." For the ramifications of this case, Y stabilizers must carry a phase factor of sign with a bit that distinguishes between real and imaginary. It shouldn't be complicated to fix; "local phase" flips are decomposed as their square roots to be shared by Y generator operations. (Probably, the entire group with the imaginary/real phase bit could be handled, as we guessed in originally defining the enum, but it might be redundant with special handling for these Y generator cases.

WrathfulSpatula commented 4 years ago

I'm misunderstanding the stabilizer method, but I have plenty of time to understand it. Disregard most of the implementation details above as confused, but we will have a stabilizer "QEngine" soon, (or fit to a different interface).

WrathfulSpatula commented 3 years ago

With #484 merged, we at least cover Clifford set gates efficiently. We haven't really ultimately managed to leverage this generally in combination with QUnit optimizations, but the QStabilizerHybrid -> QUnit -> QHybrid stack seems to give the general performance of QUnit -> QHybrid with the Clifford set special case performance of a conventional stabilizer simulator. While additional integration work continues, I think we can finally close this issue.

unitaryfund / qrack

Feature: Full Clifford algebra efficiency #367