unitaryfund / qrack

Comprehensive, GPU accelerated framework for developing universal virtual quantum processors
https://qrack.readthedocs.io/en/latest/
GNU Lesser General Public License v3.0
164 stars 36 forks source link

Review method inlining (header cleanup) #966

Closed WrathfulSpatula closed 2 years ago

WrathfulSpatula commented 2 years ago

Qrack doesn't have a clear design philosophy on choosing which methods and functions to inline, (though, for what it does have for positive choices, we still have Benn Bollay to thank, largely). We could likely set a standard relatively quickly by reviewing the code top-to-bottom, once more, to "eyeball" smaller and simpler method bodies for headers, and larger and more complicated ones for .cpp modules, depending on what we need to #include at the point of definition. (Basically, we don't want to be forced to include anything in a header that we could instead relegate to .cpp module body files.) This need not change any method signatures.

WrathfulSpatula commented 2 years ago

Optimizing with inline is mostly educated guesswork, as far as I have ever known. A hard educated guess to make is whether convenience gate definitions over Mtrx(), like H(), which is basically a 2x2 matrix definition and a Mtrx() call, might be a good case to inline. On the one hand, it could significantly improve performance, but it also bloats the QInterface header. It's an obviously important case to experiment on, but it's not easily predictable what this would actually cause and entail, I think. So, we leave that important case to the end.

First, I'm simply going to eyeball headers and cut overly complex method body definitions. In tandem, for the same simulator layer body files, I'll guess at good, small methods to inline. We're specifically not going to touch convenience matrix operator gates like H() or Pauli gates, til afterward.

WrathfulSpatula commented 2 years ago

The most or only obvious difference from inlining the "convenience API" (i.e., constant gate matrix shortcut wrappers on Mtrx()) is a slightly but significantly larger binary. Hence, we probably shouldn't inline the entire "convenience API." However, speaking as the author of Qrack, [H,X,Y,Z,S,T,IS,IT] (i.e., Pauli, Clifford, and T) is likely fairly commonly used for decomposition throughout the Qrack API, or most of that set. Perhaps T and its inverse could be removed, but I'm deciding at an executive level to inline only these gate definitions, for now. (This is guess work, but inline optimization is not hard-and-fast.)

WrathfulSpatula commented 2 years ago

By the same reasoning, we also include [CNOT, CY, CZ] (and "anti-"variants). I guess, base QStabilizer gates qualify for this argument, but they are wrapped in ParFor(), which might preclude this reasoning. However, stabilizer row operations actually shrink the binary, if inlined.

We might back out all of the above inline gate inclusions, but let's experiment with this. (We'll keep the stabilizer row operations inline, if that's the case.)

WrathfulSpatula commented 2 years ago

Sorry for the quick reversal, but inline gates do not make an obvious difference in speed, while they do increase the binary size. However, the stabilizer row operations definitely do reduce binary size, when inline.