Closed · TysonRayJones closed 5 days ago
Presently, QuEST's distributed unit tests are not run on Github CI, which would limit them to 2 nodes. They are instead manually performed on up to 16 nodes on our office workstations, or on more nodes on our supercomputers, before each release.
There are plans to employ a private machine for distributed testing, in order to use more nodes and longer timeouts, and to integrate it with Github CI.
But clearly, in addition to this, we must develop a new suite of distributed tests using larger density matrices, which can access the algorithmic regimes of 'fewer than one column per node'. It will likely be prohibitive to test all functions in this new bespoke test suite; perhaps those not tested must be accompanied by a formal proof of their correctness (like in the distributed manuscript currently in preparation).
This new suite of tests could be scheduled to run on the Oxford ARCUS supercomputers for one week, launched every ~3 months, or triggered manually before a QuEST release. The tests could be randomised, but each run with a unique random seed, to widen the coverage between runs.
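For concreteness, here is a minimal sketch of such a seeded randomised launch; the seeding and logging scheme is illustrative only, not existing QuEST test code:

```cpp
#include <cstdio>
#include <random>

int main() {
    // choose a fresh seed per launch so repeated runs widen coverage,
    // and log it so any failure can be reproduced exactly
    unsigned seed = std::random_device{}();
    std::mt19937 gen(seed);
    std::printf("randomised test run with seed %u\n", seed);

    // ... all random Quregs, target qubits and channel parameters
    //     would then be drawn from gen, e.g. ...
    std::uniform_int_distribution<int> targetQubit(0, 4);
    std::printf("example random target qubit: %d\n", targetQubit(gen));
}
```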
In v4, density-matrix Quregs are restricted (as per d179d836db1205abd3ac1abc23bdcdefb966cfc7) from being distributed such that a single column spans multiple nodes. This turns out to be a ridiculous and unperformant regime, and violates the preconditions of many algorithms.
Presently, the unit tests use only Quregs of a fixed size: 5 qubits (both for state-vectors and density matrices). This is for several reasons: the test runtime grows rapidly with #qubits, although 5 qubits is still lightning fast; and 5 qubits is the threshold above which the number of inputs to functions like `mixMultiQubitKrausMap` explodes intractably. Hence, systematic testing of the QuEST API for 6+ qubits is infeasible.
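To illustrate the explosion with a back-of-envelope count (mine, not from the QuEST docs): an n-qubit channel may require up to 4^n Kraus operators, each of dimension 2^n x 2^n, so even specifying a single map takes up to 16^n complex numbers:

```cpp
#include <cstdio>
#include <cstdint>

int main() {
    // an n-qubit channel needs at most 4^n Kraus operators, each 2^n x 2^n,
    // so one map alone is specified by up to 4^n * 4^n = 16^n complex numbers
    for (int n = 1; n <= 6; n++) {
        std::uint64_t numOps  = 1ULL << (2 * n);   // up to 4^n operators
        std::uint64_t opElems = 1ULL << (2 * n);   // (2^n)^2 elements each
        std::printf("n = %d: up to %llu complex parameters per map\n",
                    n, (unsigned long long)(numOps * opElems));
    }
}
```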
This imposes a limit on the number of nodes which can distribute our unit tests, since there can be no fewer than 1 amplitude per node. Hence, we cannot distribute 5-qubit state-vectors between more than 32 nodes. Furthermore, although we could in principle distribute 5-qubit density matrices between up to 1024 nodes, the current unit tests require the state-vectors and density matrices to contain the same number of qubits. This is mandatory in some functions like `calcFidelity`, but in others is just a result of software structure and code cleanliness.
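As a sanity check of these ceilings (the helper function here is illustrative, not part of the QuEST API):

```cpp
#include <cstdio>

// the finest possible distribution is 1 amplitude per node; an n-qubit
// state-vector has 2^n amplitudes, an n-qubit density matrix has 2^(2n)
int maxNodes(int numQubits, bool isDensityMatrix) {
    return 1 << (isDensityMatrix ? 2 * numQubits : numQubits);
}

int main() {
    std::printf("5-qubit state-vector:   %d nodes max\n", maxNodes(5, false)); // 32
    std::printf("5-qubit density matrix: %d nodes max\n", maxNodes(5, true));  // 1024
}
```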
This means our unit tests can be run on a maximum of 32 nodes - and this is a big problem for our density matrix tests. A 5-qubit density matrix distributed between 32 nodes contains exactly one column of amplitudes on every node. When distributed between fewer nodes, each node will contain more than one column. But since we cannot employ more than 32 nodes, we cannot access the simulation regime where a node contains fewer than one column (e.g. a half column, or a quarter).
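The arithmetic of this regime, for our fixed 5-qubit tests:

```cpp
#include <cstdio>

int main() {
    // a 5-qubit density matrix has 2^5 = 32 columns of 2^5 amplitudes each;
    // distributed over numNodes nodes, each node stores 32/numNodes columns
    int numColumns = 1 << 5;
    for (int numNodes = 8; numNodes <= 128; numNodes *= 2)
        std::printf("%3d nodes: %.2f columns per node\n",
                    numNodes, (double) numColumns / numNodes);
    // 64 nodes: 0.50 columns; 128 nodes: 0.25 - precisely the regimes
    // our 32-node ceiling prevents us from ever testing
}
```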
Although a somewhat esoteric deployment pattern in real simulations, this is a very, very important regime to test, since it can drastically change the nature of the distributed algorithm. In fact, issues #323, #324, #325 and #326 relate to algorithmic errors due to a lack of consideration of this regime - and they went undiscovered because the unit tests which would otherwise expose these problems cannot be run at >= 64 nodes!
Some functions, like `mixTwoQubitDepolarisingChannel`, have (when corrected) distinct algorithmic regimes for when each node contains 'fewer than a quarter column', 'fewer than a half column', and 'one or more columns'. So it is absolutely crucial we test all these regimes! Furthermore, the regime where a node contains fewer than a column often leads to more complicated algorithms. This has meant (as was the case in the above issues) that some QuEST distributed algorithms are erroneously conceived geometrically with a >= 1 column-per-node picture in mind. This dooms the algorithm to unknowingly fail in a distributed regime we cannot yet unit test.
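A caricature of such regime-dispatch (purely illustrative; this is not QuEST's actual implementation, and the function name is hypothetical):

```cpp
#include <cstdio>

// dispatch between distinct distributed algorithms, based on how much of
// one density-matrix column (2^numQubits amplitudes) each node stores
void applyChannelDistributed(int numQubits, long long ampsPerNode) {
    long long ampsPerColumn = 1LL << numQubits;

    if (ampsPerNode >= ampsPerColumn)
        std::printf("regime: one or more columns per node\n");
    else if (2 * ampsPerNode >= ampsPerColumn)
        std::printf("regime: at least a half column, fewer than one\n");
    else if (4 * ampsPerNode >= ampsPerColumn)
        std::printf("regime: at least a quarter column, fewer than a half\n");
    else
        std::printf("regime: fewer than a quarter column per node\n");
    // each branch generally needs a different communication pattern,
    // and hence each must be independently unit tested
}
```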
Presently, the unit tests consult a `NUM_QUBITS` pre-processor macro for the Qureg size. Indeed, this is easy to modify in-code, or to overwrite at compilation time.
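For instance, a guarded definition (the guard itself is an assumption; the real test header may define the macro differently) would permit a per-build override:

```cpp
// overridable at compile time, e.g. via -DNUM_QUBITS=8
#ifndef NUM_QUBITS
#define NUM_QUBITS 5
#endif

#include <cstdio>

int main() {
    std::printf("testing with Quregs of %d qubits\n", NUM_QUBITS);
}
```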
However, we cannot simply scale up all our systematic testing to higher qubit numbers, since the possible input space, and hence the runtime, explodes. It is tempting to propose randomised testing of some functions at higher qubit numbers. But as alluded to in #324, some functions have distinct algorithmic regimes which are very rare, activated only by very obscure circumstances; as #qubits increases, these become increasingly unlikely to be triggered in random testing. It is true that in such a case, the author of the function can recognise the distinct algorithmic regimes and ensure they are all covered by bespoke higher-qubit tests; but sometimes the author (like myself) may not realise the need for distinct regimes, and hence not create these bespoke tests (like in #326).

Correcting this limitation in our current distributed unit tests appears quite a challenge! I leave this post to describe only the problem, so that solution proposals may appear below.