Fix multiset ordering bug in perturbation module

Summary

This is a technical bug that I just discovered while trying to rerun the speed benchmarks for the perturbation theory algorithms paper that the perturbation module is based on. It turns out this bug has existed since this module was introduced (in version 0.3.0), however it was missed up until this point as:

It requires an example of sufficient complexity (>=10 perturbations), and
The speed benchmarks, which were the only example we had run with >=10 perturbations, were run using an old commit in which the bug hadn't yet been introduced.

The issue arises in the file perturbation/dyson_magnus.py, which contains a bunch of manipulation of lists of multisets. Many of the functions contain arguments that are assumed to be lists of multisets that are "ordered and complete". Here, "complete" means the list is closed under taking submultisets (i.e. for any multiset in the list, all sub multisets are in the list), and "ordered" means ordered according to a < b iff:

Either, len(a) < len(b), or
if len(a) == len(b), sortedlist(a) "<" sortedlist(b), where sortedlist(a) means a represented as a sorted list (with repeated entries), and "<" means lexicographic ordering of these lists. (Implicit in this is the assumption that the elements of a and b are themselves ordered, which is fine as we explicitly demand the multisets be composed of integers.)

This may seem kind of convoluted, but for technical reasons it is convenient to order in this way when constructing the perturbation theory computations. (More succinctly, the ordering within multisets of equal size is the "lexicographic" ordering of their sorted-list representation.)

The bug was ultimately introduced when we started using the external multiset package, as opposed to a Multiset class I had personally written. There is no problem with the multiset package itself, I had just made an error during the conversion in which I incorrectly implemented the above ordering criteria. The incorrect implementation was based on a string representation of the multiset, which resulted Multiset([10]) < Multiset([1]) == True. This led to these lists being incorrectly ordered when dealing with at least 10 perturbations, and ultimately led to malfunctioning of the rest of the code which assumes ordering as described above. The ordering, however, was fine so long as less than 10 perturbations were used, which is how this error went unnoticed for so long.

Details and comments

I have corrected the _sorted_multisets function in multiset_utils.py with the correct ordering. I've had to use a workaround in which a dummy class is defined that implements the __lt__ dunder method, and then the multisets are sorted using this as the key. This was suggested by @ihincks as the standard way to pass an ordering to the key argument of list.sort (that can't be represented as a single valued key function). I've also added a test that failed before this change, but passes now.

I'm currently labelling this as a draft as I want to take a bit of time to look through the code to see if any un-explicit ordering assumptions are being made. I think one example is line dyson_magnus.py line 838. It currently reads

            lmult_rule.append((np.array([1.0, 1.0]), np.array([[-1, term_idx], [term_idx, -1]])))

but I think it should be

            lmult_rule.append((np.array([1.0, 1.0]), np.array([[-1, term_idx], [perturbation_labels.index(term), -1]])))

This PR is ready for review.

Changes:

Corrects the multiset sorting function _sorted_multisets as described in the initial comment.
Modifies lines 837 and 838 in dyson_magnus.py along the lines as described in the initial comment. This ensures the correct index of the perturbation_labels is used in the lmult rule, and now only adds the rule if term is in perturbation_labels (otherwise it is actually unneeded).
Adds sorting tests for _sorted_multisets that would have caught the original bug.
Adds tests validating point 2 above:
- One checks correct exclusion of the unnecessary matrix multiplications.
- The other checks correct multiplication rule construction in perturbation_labels is not in "canonical" order (this is the only list of multisets that technically shouldn't be assumed to be in canonical order).
Adds an "integration" test that also would have caught the original bug. The test case test_dyson_analytic_case1_1d is one of the cheapest-to-compute test cases for solve_lmde_perturbation, and this PR adds test_dyson_analytic_case1_1d_relabeled, which reproduces this test but relabels the perturbations with indices 1 and 10, which is one case where the bug was expressed.

I've verified that all added tests fail if the changes in points 1 and 2 are reverted to what is currently in main.

qiskit-community / qiskit-dynamics

Fix multiset ordering bug in perturbation module #211

Summary

Details and comments