Open tanujkhattar opened 4 months ago
> re: Adjoint: I consider this a pretty fundamental limitation of the existing t-counting protocol which subverts the decomposition hierarchy. Costs should depend on the costs of the callees.
Well, the existing t-counting protocol says either (a) "you provide me the cost yourself" or (b) "I'll decompose you and try to figure it out myself". In cirq land, the `_InverseCompositeBloq` (the equivalent of the `Adjoint` bloq) did not implement the `_t_complexity_` method, and therefore for an adjoint bloq we'd always decompose and figure out the cost instead of relying on the hardcoded values. If we simply delete the implementation of `Adjoint._t_complexity_`, then everything will work without needing to add a new `adjoint: bool` parameter, because the protocol will figure out the cost of the bloq using the cost of its callees.
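The two-step fallback described above can be sketched with toy classes (this is an illustrative sketch, not Qualtran's actual code; `callees()` is a hypothetical stand-in for the decomposition hook):

```python
from dataclasses import dataclass


@dataclass
class TComplexity:
    t: int = 0

    def __add__(self, other: "TComplexity") -> "TComplexity":
        return TComplexity(self.t + other.t)


class Bloq:
    def callees(self):
        """Hypothetical stand-in for a bloq's decomposition / call graph."""
        return []

    def t_complexity(self) -> TComplexity:
        hardcoded = getattr(self, "_t_complexity_", None)
        if hardcoded is not None:
            return hardcoded()  # (a) user-provided hardcoded cost
        # (b) no hardcoded cost: derive it from the callees.
        total = TComplexity()
        for callee in self.callees():
            total = total + callee.t_complexity()
        return total


class TGate(Bloq):
    def _t_complexity_(self) -> TComplexity:
        return TComplexity(t=1)


class Adjoint(Bloq):
    """Deliberately has no `_t_complexity_`: its cost always comes from the subbloq."""

    def __init__(self, subbloq: Bloq):
        self.subbloq = subbloq

    def callees(self):
        return [self.subbloq]
```

With `Adjoint._t_complexity_` absent, `Adjoint(TGate()).t_complexity()` falls through to step (b) and recurses into the callee, so the cost comes from the decomposition rather than a hardcoded value.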
The issue arises in the scenarios where we want all of the following:

1. A way for users to hardcode the cost of a `bloq` by overriding `_t_complexity_`.
2. A way for users to also hardcode the more complicated cases where the `bloq` is wrapped inside a meta bloq, for example `Adjoint(bloq)` and `Controlled(bloq)`. Right now (traditionally, in Cirq-FT) this case is handled by users explicitly overriding the `adjoint(self)` and `controlled(self)` methods to return custom bloqs in which the hardcoded formulas have the right context and continue to work. An example is the `And` bloq.
3. The problem with `MultiAnd` is that we want to treat it as a "non-special" bloq and not have users override the `adjoint` method, but we somehow also want to communicate to the protocol (`MultiAnd._t_complexity_`) the context that the bloq is wrapped inside a meta bloq.
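Point (2), the `And`-style pattern, can be illustrated with a toy sketch (hypothetical stand-in classes, not Qualtran's actual implementation; the 4-vs-0 T-counts are the well-known costs of the compute/uncompute `And` pair):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class And:
    # Flag baked into the bloq itself, so the hardcoded formula
    # always has the right context when the protocol calls it.
    uncompute: bool = False

    def _t_complexity_(self) -> int:
        # Forward And: 4 T gates; measurement-based uncompute: 0 T gates.
        return 0 if self.uncompute else 4

    def adjoint(self) -> "And":
        # Overriding adjoint() returns a *custom bloq* (the uncompute
        # variant) rather than a generic Adjoint wrapper.
        return And(uncompute=not self.uncompute)
```

Because `adjoint()` returns a concrete `And(uncompute=True)` instead of a generic wrapper, `_t_complexity_` never needs to be told about any surrounding meta bloq.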
Point (3) is new and needs more design. This has nothing to do with the t-complexity protocol subverting the decomposition hierarchy. It's an optional optimization in step (2) which can be avoided by deleting the implementation of `_t_complexity_`.
> It's to support the case where we have the call graph callees but not a full decomposition
That case is already supported after https://github.com/quantumlib/Qualtran/pull/740 and, again, is unrelated to point (3) above.
> It would be impossible to override `_t_complexity_` on `MultiAnd` and have it be correct
The design question that needs attention here is whether we want protocols like `_t_complexity_` to accept information about the meta bloq that wraps the bloq. In the current framework, the way to make `_t_complexity_` work is either:

a) Return a custom `MultiAnd(adjoint=True)` (similar to what we do for `And`) so users have the context about whether the bloq is adjoint or not when overriding `_t_complexity_`.

b) Return an `Adjoint(MultiAnd())`, but `Adjoint._t_complexity_` uses the decomposition and does not directly use the hardcoded `subbloq._t_complexity_`, because the t-complexity for the sub-bloq was defined for the forward case and not the adjoint case.
We earlier used to do (a) and now we do (b) for `MultiAnd`; both are correct and are currently well-supported approaches. The philosophy in this situation is that if a user needs to override a custom T-complexity formula for a specific bloq variant (like adjoint or controlled), they should return a new custom bloq by overriding `adjoint(self)` and `controlled(self)`, and the new custom bloq would have the right t-complexity formula (or other protocols) overridden.
It seems like in your original approach you wanted to do:

c) `_t_complexity_(self)` gets one or more arguments (e.g. `adjoint: bool`, `ctrl: CtrlSpec`, etc.) that provide the bloq with information about the meta bloq that calls it, and users can use this information to hardcode gate-count formulas of the bloq for that case. This is a new and slightly weird use case, and I think it's worth discussing the implications of this design in more detail.
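Option (c) would change the protocol's signature roughly as below. This is a hypothetical sketch of the proposal, not an existing API, and the `4 * (n - 1)` formula is only an illustrative stand-in for a `MultiAnd`-style ladder of `And` gates:

```python
from typing import Any, Optional


class MultiAnd:
    def __init__(self, n: int):
        self.n = n  # number of input bits being AND-ed together

    def _t_complexity_(self, adjoint: bool = False,
                       ctrl: Optional[Any] = None) -> int:
        # The meta-bloq context (adjoint / ctrl) is passed *into* the
        # hardcoded formula, instead of being baked into a variant bloq.
        # Illustrative costs: a ladder of (n - 1) And gates at 4 T each
        # forward, 0 T for the measurement-based adjoint.
        return 0 if adjoint else 4 * (self.n - 1)
```

The "weirdness" the comment refers to is visible here: the formula's author must anticipate every wrapping meta bloq the protocol might report, which is exactly the leaky abstraction objected to below.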
No, I don't think the bloq author has the authority to annotate a bloq with T counts unless they are direct callees. It's a leaky abstraction when you jump levels. I guess this is most similar to (b); but it begs the question why even have this method when we can never use it
> It's a leaky abstraction when you jump levels.

I agree. We should not have (c).
> when we can never use it

"Never use it" is a mischaracterization, I think. We (can) use the hardcoded values for all forward cases and non-trivial adjoints / controls, which form a large part of the codebase.
> why even have this method

There are two goals, I think:

1. Ease of access: users get `bloq.logical_counts()`, which just gives them the logical counts. The ease of access of the API is worth it, since it's so commonly used.
2. Letting bloq authors hardcode `logical_counts` directly. The primary advantage here is performance optimization, especially when the call graph is deep and/or expensive to construct. The secondary advantage is verification: that the analytical formula for T-counts matches the value obtained by recursively combining values from the decomposition / call graph. You'd have to guarantee that nothing in the bloq's call graph has, or will ever have, a custom adjoint or controlled implementation (that changes the T count).
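The verification advantage in goal (2) amounts to a consistency check between a hardcoded formula and the decomposition-derived count. A minimal toy sketch (hypothetical classes, not Qualtran's API):

```python
class Leaf:
    """A bloq whose T-count is a primitive, trusted value."""

    def __init__(self, t_count: int):
        self._t = t_count

    def t_count(self) -> int:
        return self._t

    def callees(self):
        return []


class Composite:
    """A bloq with a hardcoded analytical T-count and a decomposition."""

    def __init__(self, hardcoded_t: int, callees):
        self._t = hardcoded_t
        self._callees = callees

    def t_count(self) -> int:
        return self._t

    def callees(self):
        return self._callees


def t_count_from_callees(bloq) -> int:
    # Recompute the count by summing over the decomposition.
    return sum(c.t_count() for c in bloq.callees())


def verify_hardcoded(bloq) -> bool:
    # Does the analytical formula agree with the callee-derived value?
    return bloq.t_count() == t_count_from_callees(bloq)
```

As the comment notes, such a check only stays valid as long as nothing in the call graph grows a custom adjoint or controlled implementation that changes the count.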
Right now, the `TComplexity` protocol is broken as described in https://github.com/quantumlib/Qualtran/pull/732. This issue is to discuss the future of the `TComplexity` protocol, since there have been informal discussions and upgrades to either deprecate it in favour of the new generalized costs framework or make improvements to it and make it better supported.

First, I'd like to highlight that the `TComplexity` protocol, or an equivalent `LogicalCounts` protocol that captures the most commonly used logical counts people care about when thinking about fault-tolerant resource estimation, is extremely important, and we should have an easy way (read: a single function call) to obtain this value for any given `bloq`. The `TComplexity` / `LogicalCounts` class is also the input format used to invoke the Microsoft Azure resource estimator.

The current way to obtain this value without going via the t-complexity protocol is: a) compute `sigma` via `bloq.call_graph()`, b) call `t_counts_from_sigma(sigma)`, assuming `sigma` was computed without any fancy parameters (e.g. no truncation of the call graph using the `max_depth` parameter, etc.).

We need to have a single function call (e.g. `bloq.t_complexity()` or `bloq.logical_counts()`) that gives the desired counts. Inspecting the call graph should be done only when bloq users need to get into the weeds of how exactly the logical counts were computed. Having agreed on this premise, let's lay down some requirements (I'll refer to the protocol as `TComplexity` for now, but we should probably change it to `logical_counts` in the future):

a) The most important requirement is that it needs to be _fast_ for the general case. Expecting users to override `build_call_graph` should be the recommended way, but if users do not provide this decomposition, we should still be able to support fast computation of logical counts. The cirq-style `t_complexity` uses a global cache to store and retrieve the logical counts of bloqs. This has the advantage of reusing any structure the bloq decomposition has, and ends up being sufficient for a majority of cases. This is because, a lot of the time, the very high gate counts are present because we do phase estimation on a unitary `U`, which ends up repeating `U` a lot of times.

b) It should report more than just T-counts and should be sufficient as an "input" to our qec overhead methods defined in `bloqs/surface_code`. Some other useful logical counts are: (a) Toffoli counts, (b) qubit usage, (c) counts of (rotation, eps) pairs instead of just the number of rotations, (d) rotation depth (the Azure model uses this).

Design options going forward:

(a) Delegate computation of `LogicalCounts` to the bloq call graph -> bloq counts construction. The biggest bottleneck here is performance, I believe. For example: I implemented a naive version of this approach in https://github.com/quantumlib/Qualtran/pull/732, specifically `_t_counts_for_bloq` called from `_populate_flame_graph_data`, and it's VERY slow. The `phase_estimation_of_quantum_walk` notebook is a good example of bloqs for which I want to compute the T-complexity and plot flame graphs. I can do the former easily, but the latter can be done only for very small cases. The biggest reason for the slowdown is that I need the `sigma` for each node of the call graph and not just the root node, so every time I visit a node in the call graph, I do an `O(n)` computation to compute the `sigma`.

(b) We improve the cirq-style `t_complexity` method to not compute an explicit call graph and instead directly, implicitly, compute only the logical counts we care about. We've already demonstrated that it's pretty fast. There are some design challenges here; for example, `Adjoint(subbloq)` expects `subbloq` to accept an `adjoint` parameter when `subbloq` does not have a custom `adjoint` implemented, but not when `subbloq` provides a custom implementation of `adjoint`. This seems like a design smell and is confusing at best (e.g. if a bloq `A` decomposes into a bloq `B` where `B` has a custom adjoint but `A` does not, then `A._t_complexity_` should accept an `adjoint=True` but not pass on the flag to `B._t_complexity_`. `A` and `B` are the `MultiAnd` and `And` gates right now.)

cc @mpharrigan
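Requirement (b) above suggests a richer counts container. A hypothetical `LogicalCounts` sketch along those lines (field names and the aggregation rules are illustrative assumptions, not an existing Qualtran class):

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class LogicalCounts:
    t: int = 0
    toffoli: int = 0
    qubits: int = 0
    # (rotation kind, eps) -> count, instead of a single rotation tally.
    rotations: Dict[Tuple[str, float], int] = field(default_factory=dict)
    rotation_depth: int = 0  # the Azure model uses this

    def __add__(self, other: "LogicalCounts") -> "LogicalCounts":
        rot = dict(self.rotations)
        for key, count in other.rotations.items():
            rot[key] = rot.get(key, 0) + count
        return LogicalCounts(
            t=self.t + other.t,
            toffoli=self.toffoli + other.toffoli,
            # Modeling assumption: serial composition reuses qubits,
            # so qubit usage combines via max rather than sum.
            qubits=max(self.qubits, other.qubits),
            rotations=rot,
            rotation_depth=self.rotation_depth + other.rotation_depth,
        )
```

Making the container additive means the per-node aggregation in design option (a), or the implicit accumulation in option (b), reduces to summing these structs up the call graph.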