Closed DanPuzzuoli closed 1 year ago
This PR is ready for review.
Changes:
_sorted_multisets
as described in the initial comment.dyson_magnus.py
along the lines as described in the initial comment. This ensures the correct index of the perturbation_labels
is used in the lmult rule, and now only adds the rule if term
is in perturbation_labels
(otherwise it is actually unneeded)._sorted_multisets
that would have caught the original bug.perturbation_labels
is not in "canonical" order (this is the only list of multisets that technically shouldn't be assumed to be in canonical order).test_dyson_analytic_case1_1d
is one of the cheapest-to-compute test cases for solve_lmde_perturbation
, and this PR adds test_dyson_analytic_case1_1d_relabeled
, which reproduces this test but relabels the perturbations with indices 1
and 10
, which is one case where the bug was expressed.I've verified that all added tests fail if the changes in points 1 and 2 are reverted to what is currently in main
.
Summary
This is a technical bug that I just discovered while trying to rerun the speed benchmarks for the perturbation theory algorithms paper that the perturbation module is based on. It turns out this bug has existed since this module was introduced (in version 0.3.0), however it was missed up until this point as:
The issue arises in the file
perturbation/dyson_magnus.py
, which contains a bunch of manipulation of lists of multisets. Many of the functions contain arguments that are assumed to be lists of multisets that are "ordered and complete". Here, "complete" means the list is closed under taking submultisets (i.e. for any multiset in the list, all sub multisets are in the list), and "ordered" means ordered according to a < b iff:a
represented as a sorted list (with repeated entries), and "<" means lexicographic ordering of these lists. (Implicit in this is the assumption that the elements ofa
andb
are themselves ordered, which is fine as we explicitly demand the multisets be composed of integers.)This may seem kind of convoluted, but for technical reasons it is convenient to order in this way when constructing the perturbation theory computations. (More succinctly, the ordering within multisets of equal size is the "lexicographic" ordering of their sorted-list representation.)
The bug was ultimately introduced when we started using the external
multiset
package, as opposed to aMultiset
class I had personally written. There is no problem with themultiset
package itself, I had just made an error during the conversion in which I incorrectly implemented the above ordering criteria. The incorrect implementation was based on a string representation of the multiset, which resultedMultiset([10]) < Multiset([1]) == True
. This led to these lists being incorrectly ordered when dealing with at least 10 perturbations, and ultimately led to malfunctioning of the rest of the code which assumes ordering as described above. The ordering, however, was fine so long as less than 10 perturbations were used, which is how this error went unnoticed for so long.Details and comments
I have corrected the
_sorted_multisets
function inmultiset_utils.py
with the correct ordering. I've had to use a workaround in which a dummy class is defined that implements the__lt__
dunder method, and then the multisets are sorted using this as the key. This was suggested by @ihincks as the standard way to pass an ordering to thekey
argument oflist.sort
(that can't be represented as a single valued key function). I've also added a test that failed before this change, but passes now.I'm currently labelling this as a draft as I want to take a bit of time to look through the code to see if any un-explicit ordering assumptions are being made. I think one example is line dyson_magnus.py line 838. It currently reads
but I think it should be