qiskit-community / qiskit-dynamics

Tools for building and solving models of quantum systems in Qiskit
https://qiskit-community.github.io/qiskit-dynamics/
Apache License 2.0
106 stars 60 forks source link

Fix multiset ordering bug in perturbation module #211

Closed DanPuzzuoli closed 1 year ago

DanPuzzuoli commented 1 year ago

Summary

This is a technical bug that I just discovered while trying to rerun the speed benchmarks for the perturbation theory algorithms paper that the perturbation module is based on. It turns out this bug has existed since this module was introduced (in version 0.3.0), however it was missed up until this point as:

  1. It requires an example of sufficient complexity (>=10 perturbations), and
  2. The speed benchmarks, which were the only example we had run with >=10 perturbations, were run using an old commit in which the bug hadn't yet been introduced.

The issue arises in the file perturbation/dyson_magnus.py, which contains a bunch of manipulation of lists of multisets. Many of the functions contain arguments that are assumed to be lists of multisets that are "ordered and complete". Here, "complete" means the list is closed under taking submultisets (i.e. for any multiset in the list, all sub multisets are in the list), and "ordered" means ordered according to a < b iff:

  1. Either, len(a) < len(b), or
  2. if len(a) == len(b), sortedlist(a) "<" sortedlist(b), where sortedlist(a) means a represented as a sorted list (with repeated entries), and "<" means lexicographic ordering of these lists. (Implicit in this is the assumption that the elements of a and b are themselves ordered, which is fine as we explicitly demand the multisets be composed of integers.)

This may seem kind of convoluted, but for technical reasons it is convenient to order in this way when constructing the perturbation theory computations. (More succinctly, the ordering within multisets of equal size is the "lexicographic" ordering of their sorted-list representation.)

The bug was ultimately introduced when we started using the external multiset package, as opposed to a Multiset class I had personally written. There is no problem with the multiset package itself, I had just made an error during the conversion in which I incorrectly implemented the above ordering criteria. The incorrect implementation was based on a string representation of the multiset, which resulted Multiset([10]) < Multiset([1]) == True. This led to these lists being incorrectly ordered when dealing with at least 10 perturbations, and ultimately led to malfunctioning of the rest of the code which assumes ordering as described above. The ordering, however, was fine so long as less than 10 perturbations were used, which is how this error went unnoticed for so long.

Details and comments

I have corrected the _sorted_multisets function in multiset_utils.py with the correct ordering. I've had to use a workaround in which a dummy class is defined that implements the __lt__ dunder method, and then the multisets are sorted using this as the key. This was suggested by @ihincks as the standard way to pass an ordering to the key argument of list.sort (that can't be represented as a single valued key function). I've also added a test that failed before this change, but passes now.

I'm currently labelling this as a draft as I want to take a bit of time to look through the code to see if any un-explicit ordering assumptions are being made. I think one example is line dyson_magnus.py line 838. It currently reads

            lmult_rule.append((np.array([1.0, 1.0]), np.array([[-1, term_idx], [term_idx, -1]])))

but I think it should be

            lmult_rule.append((np.array([1.0, 1.0]), np.array([[-1, term_idx], [perturbation_labels.index(term), -1]])))
DanPuzzuoli commented 1 year ago

This PR is ready for review.

Changes:

I've verified that all added tests fail if the changes in points 1 and 2 are reverted to what is currently in main.