Bugfix stricter line label enforcement

coreyostrove commented 8 months ago

Part 1: Stricter Line Label Enforcement

Herein lies an excessively long explanation for a relatively simple proposed change. This PR addresses an (apparent) bug that a number of folks (myself included) have intermittently run into over the past year or two (this is the one I was referring to in the recent discussion thread for #369 here.

In a nutshell, occasionally when creating GST experiment designs it was observed that some circuits in the generated list would include line labels of the form (0, '*'), or the like. This typically went left undetected until after either some amount of confusion from experimental colleagues trying to run these experiment designs, or when we got a data set back and there were errors that arose due to mismatched circuits in the DataSet object. In a nutshell this was happening when some circuit among the experiment design constituents (typically fiducials) had the default '*' placeholder value for their line labels. Most commonly this circuit was the empty circuit which is often specified (perhaps sloppily) as Circuit([]) (which produces the aforementioned placeholder line label). In some circumstances this was detected and caught by the experiment design generation code, as in the below example:

import pygsti
from pygsti.modelpacks import smq1Q_XY
from pygsti.circuits import Circuit
from pygsti.baseobjs import Label
from pygsti.protocols import StandardGSTDesign
prep_fids = smq1Q_XY.prep_fiducials()
meas_fids = smq1Q_XY.meas_fiducials()
germs = smq1Q_XY.germs(lite = True)
target_model = smq1Q_XY.target_model('full TP')
prep_fids[0] = Circuit([])
test_design = StandardGSTDesign(target_model, prep_fids, meas_fids, germs, [1,2])

Which raises the error:

 ValueError: line labels must contain at least {0}

But not if we instead did

<insert previous block of code up to this point>
meas_fids[0] = Circuit([])
test_design = StandardGSTDesign(target_model, prep_fids, meas_fids, germs, [1,2])

Which does not catch this and instead creates circuits with the aforementioned mixed line labels. I'm sure there are other circumstances in which this occurs and isn't caught too.

This design considerations behind the proper use of the '*' label is, as far as I can tell, not discussed anywhere in the documentation aside from a couple of sentences in the Circuit.ipynb tutorial notebook. There it is stated:

In the case of 1 or 2 qubits, it can be more convenient to dispense with the line labels entirely and just equate gates with circuit layers and represent them with simple Python strings. If we initialize a Circuit without specifying the line labels (either by line_labels or by num_lines) and the layer labels don't contain any non-None line labels, then a Circuit is created which has a single special *''-line* which indicates that this circuit doesn't contain any explicit lines.... Using circuits with ``''-lines is fine (and is the default type of circuit for the "standard" 1- and 2-qubit modules located at pygsti.construction.std\*); one just needs to be careful not to combine these circuits with other non-'*'``-line circuits.

Circling back now to the contents of this PR, this adds an explicit check to the the __add__ method for Circuit objects to enforce the intended design consideration described in the tutorial. This check verifies that the two circuits being added have compatible line labeling, and raised a ValueError if they do not. I.e. if one circuit as a line label value of ('*',) and the other (0,) or some such like that. The error message includes a detailed description of how to address the problem, and a proper docstring has been added to this method in order to explain this in the documentation.

There is an exception to this that I've added for handling the special case of the global idle circuit. In this special convention the label Label(()) is interpreted as a global idle operation. To handle this case the add method checks whether the either of the circuits being added consists of solely global idles (circuits of this form have the default '*' line labeling). If so, and the other circuit has non-'*' line labeling then we inherit the line-labels from the other circuit. e.g.

Circuit([Label(())]) + Circuit([Label('Gxpi2',0)], line_labels= (0,)) -> Circuit([Label(())], Label('Gxpi2',0)], line_labels= (0,))

For what it is worth, I think part of the issue that led to this not being immediately identified is that I and some others misinterpreted the expected behavior of circuits with '*' line labels. In many contexts the '*' is used to signify a wildcard, and for whatever reason I had always (incorrectly) assumed that meant that circuits with these line labels were meant to (somehow) adapt their line labels to those of the circuits they were added to. In hindsight that probably doesn't make much sense of most circuits---or at very least the logic needed to handle the reconciliation when dealing with real gates would be a beast---but I suppose for the empty circuit where these are most likely to be encountered these days it wouldn't be so strange for that to have been the case.

Unsurprisingly this change broke a bunch of code where we had previously been sloppy about enforcing this compatibility. So also included in this PR are a number of bugfixes in the actual codebase and modifications to the unit tests.

Part 2: Faster Unit Tests in test_packages

As mentioned in part 1 of this PR, this change necessitated a number of modifications to the unit tests, particularly in test_packages, to address areas where we'd previously been sloppy about enforcing the proper behavior. Since I was already knee deep in the testing code I used this as an opportunity to also make a number of performance tweaks vis-a-vis #380, as well as a number of other changes meant to modernize the tests and address various warnings and deprecations. There are too many changes to exhaustively list, but some highlights include:

Switched many instances in which we were using legacy modelpacks to their modern counterparts.
Switched to smaller experiment design variants in many instances, including using the lite germ set and a minimally informationally complete set of preparation and measurement fiducials, and smaller gate sets where appropriate (which significantly reduces computational time).
Moved many instances in which the legacy json serialization was being used to use the modern json serialization pathway. Plus found and fixed a bug in the serialization code in the course of doing so.
Removed the test_autoexperimentdesign module. Most of these were redundant (i.e. testing the exact same things as what was in the experiment design tests in unit). For the couple non-redundant tests remaining I relocated these into other modules as appropriate.
Refactored some code in fiducialpairreduction.py to reduce the amount of code repetition.

@enielse, I also made a small change to some of the code in cloudcircuitconstruction.py that it would be good to have your eyes on as the expert on that part of the code base.

sserita commented 7 months ago

@coreyostrove I'm seeing recent activity here + (an old) request for my review. Is this ready to go out of draft phase/final review, or you just wanted comments on the in-progress PR?

coreyostrove commented 7 months ago

Feedback is definitely welcome at this stage, but this is still a work-in-progress on my end. I admittedly was under the (incorrect) impression when I first made this that being a draft PR meant this was only visible to me, and that it'd become generally visible (and that the reviewer notifications would go out) after I had published it.

I am coincidentally pretty close to having this ready so I'll be publishing it soon, but in any case this is definitely something for post-0.9.12.0 (in case that was implicitly on your mind) as I'd like to let this live on develop for a while. I'm anticipating this change might break some things that are not going to get caught by unit test that only some regular use by folks working off of develop will suss out.

coreyostrove commented 7 months ago

Thanks for the feedback on this PR so far. I've just marked this as ready for review. I'll note that in addition to the passing tests indicated here, I have also manually run the workflows for the test_packages module and for the notebook regression tests on the runners and those are all currently passing as well.

sandialabs / pyGSTi

Bugfix stricter line label enforcement #373