Closed boschmitt closed 5 months ago
As far as I understand this feature works by introducing swap gates. Since swap gates will have a fidelity, this will affect the overall error budget of the intended algorithm. While the swap gates are necessary to achieve the aims of the feature, I do not see an option to run in verbatim mode and that would be undesirable where close control of the errors is sought.
Hi @owen-oqc - yes, that's correct. It's worth noting that some other compiler passes (like multicontrol-decomposition
) could also introduce ancilla qubits, so it may be tricky to give the user full control over the manual placement of all qubits. Would you mind opening a new issue to start a discussion there?
We may have to do a near-term solution for 0.5 and a longer-term solution, but I think it merits having it's own issue to clarify the precise desires. For example - we could imagine these other related desires (below), and would like to give the community an opportunity to weigh in.
qubit-mapping
pass? (Presumably, yes.)Closing this as is for now. Further improvements will be covered as part of introducing a smarter qubit scheduling logic.
The following PRs are related to this mapping pass:
666
667
668
761
610
837
838
Background
The physical qubits in quantum devices might not be fully connected, which means that not every pair can participate in the same two-qubits operations. These connectivity restrictions are known as coupling constraints. In our compiler, the IR has no mechanism to enforce them, i.e., we can apply an operation between any pair of qubits. Therefore, our quantum programs typically cannot be directly executed on these restricted quantum devices. We refer to them as unmapped programs and define virtual (or logical) qubits and operation as those present in them. The task of finding a mapping of virtual operations to allowed physical ones is known as quantum circuit mapping. The completion of this task is not always possible without applying additional operators---typically, swaps.
We can model the connectivity requirements of an unmapped program using an undirected graph $G_{r} = (V, E_r)$, where $V$ is the set of virtual qubits $V = { v_0, v1, ..., v{n-1}}$ and $E_r$ is a set of qubit pairs ${v_i, v_j}$ required by the operations in the program. (Note that one-qubit operations can be safely ignored since the coupling constraints do not affect their mapping.) We can model the coupling constraints in a similar way. A undirected graph $G_p = (P, E_p)$ where $P = { p_0, p1, ..., p{m-1}}$ is the set of physical (device) qubits and an edge ${p_u, p_w} \in E_p$ means that a two-qubit operation can be executed using the two physical qubits $p_u$ and $p_w$.
Placement. The goal is to find a subgraph monomorphism $\pi : V \to P$ that respects ${v_i, v_j} \in E_r \implies {\pi(v_i), \pi(v_j)} \in E_p$. If such a morphism exists, then the mapping task requires only a relabeling of the virtual qubits to physical qubits, i.e., replace each virtual qubit $v_i$ in the unmapped program by a physical qubit $\pi(v_i)$. Otherwise, we must resort to routing.
Routing. Given an initial placement, routing transforms a program so that all two-qubit operations are on physically adjacent qubits. Thus, our goal becomes finding a way of mapping a program on a device architecture with low overhead, whether in the number of additional operation or depth.
Example: Take the following quake kernel
The graph that models the connectivity requirements, $G_r$, is:
We can map this kernel to the following device (left) without the need to add new operations:
However, we cannot do the same for the following one:
In this case, there was no placement that satisfied all connectivity requirements, so we choose one and now we need a to modify the circuit so that the operations between qubits
$1
and$2
(red edge) can be applied.Prototype
Device
This prototype adds a
Device
type (cudaq/Support/Device.h
) to hold information about the target device. Currently, it only has what is necessary for mapping:GraphCSR
(cudaq/ADT/GraphCSR.h
)cudaq/Support/Graph.h
) in each node (qubit).The
Device
class has a type to represent physical qubits:Device::Qubit
, which is just an unsigned integer. The class also has static methods to create mock devices:Device::path(...)
,Device::ring(...)
,Device::start(...)
andDevice::grid(...)
.Mapping pass
I will use an example to explain the inner works of the pass. Take the following quake kernel:
Suppose we want to map this kernel to a device with three qubits in a path topology, i.e.,
(p0) --- (p1) --- (p2)
, wherepn
denotes device (physical) qubit $p_n$. The mapper requires kernels to be represented as a directed acyclic graph (DAG), and thus only works with Quake’s value model, i.e., the model where operations use wires instead of qubit references. Hence, before mapping, we need to convert the kernel:The mapper starts by analyzing the kernel to make sure the preconditions for mapping are met. Currently the preconditions are:
quake.apply
operations. (limitation)During this analysis, the mapper also takes to opportunity to label each wire with its corresponding virtual qubit and collect the source operations for all qubits, i.e.,
quake.null_wire
operations. At the end of this analysis, we will have a vector with all source operations:and a mapping between each wire and its corresponding virtual qubit:
where
vn
denotes virtual qubit $v_n$.If the preconditions are met, the pass proceeds with placement and routing. This prototype focuses on the routing algorithm while implementing a simple “identity placement,” i.e., a virtual qubit $v_i$ will be placed on the physical qubit $p_i$. Hence, our initial placement is:
The mapper does routing by implementing the algorithm described in:
Routing starts with source operations collected during the analysis. These operations form a layer, called the font layer, which is a set of operations that have no unmapped predecessors. In the case of these source operations, the router only needs to iterate over the front layer while visiting all users of each operation. The processing of this front layer will create a new front layer containing all operations that have being visited the same number of times as their number of wire operands.
After processing the very first front layer, the algorithm proceeds to process the newly created front layer. Once again, it processes the front layer and map all operations that are compatible with the current placement, i.e., one-wire operations and two-wire operations using wires that correspond to qubits that are adjacently placed in the device. When an operation is successfully mapped, it is removed from the front layer and all its users are visited---those users that have no unmapped predecessors are added to the front layer. If the mapper cannot successfully map any operation in the front layer, then it adds a swap to the circuit and tries to map the front layer again. The routing process ends when the front layer is empty.
In our example, after processing the front layer with source operations, we have an updated front layer with only
quake.h
. Since it is a one-wire operation, we can directly map it and visit its users---which will update the front layer to only contain the firstquake.x
operation:Note that this
quake.x
uses two virtual qubits that are not mapped to adjacent device qubits, hence it fails to map. At this point, we need to modify the placement using swaps:The mapper choose to add a swap between
p0
andp1
. This will modify the placement to:With this change, it will be possible to map the operation that is in the front layer, its users will be visited and the front layer will be updated with the addition of the second
quake.x
operation. This operation uses two virtual qubits that are adjacently place, so it can be successfully mapped:After this the front layer will be empty, and the mapper is done. Here is the full debug information from the pass (using
debug-only=quantum-mapper
):TODO
quake.mx
,quake.my
,quake.mz
)quake.sink
)GraphCSR
(cudaq/ADT/GraphCSR.h
)cudaq/Support/Graph.h
)Device
(cudaq/Support/Device.h
)Placement
(cudaq/Support/Placment.h
)