sandialabs / pyGSTi

A python implementation of Gate Set Tomography
http://www.pygsti.info
Apache License 2.0
132 stars 55 forks source link

IBMQExperiment and FreeformDesign/CombinedExperimentDesign quality-of-life updates #379

Open sserita opened 7 months ago

sserita commented 7 months ago

This PR makes several quality-of-life improvements for IBMQExperiment, including checkpointing for IBMQ experiments mentioned in #327, as well as FreeformDesign and CombinedExperimentDesign serialization.

IBMQExperiment updates:

  1. The IBMQExperiment class is changed from a dict to a full class that inherits from TreeNode, making it commensurate with serialization of objects such as ExperimentDesign and ProtocolData.
  2. With a more granular serialization strategy, it is now possible to checkpoint IBMQExperiment objects. The API follows GST checkpointing where checkpoint_path and disable_checkpointing can be passed to the constructor (checkpointing is on by default). If checkpoint_path is provided, then transpile(), submit(), or retrieve_results will update the on-disk ibmqexperiment when possible.
  3. If bson is installed, we can use the json_util hook to handle objects such as datetime.datetime which are present in submit_time_calibration_data and batch_results from the IBMQ server. Depending on the availability of bson, we can serialize to all text/json files, but fall back to pickle still.

FreeformDesign updates:

  1. The aux_info member of FreeformDesign is written as JSON instead of pickle. To do this, first we cast the Circuit keys as strings and then write the now-JSONable dictionary to file. We convert back to Circuits on deserialization.
  2. More controversially, I skip serializing all_circuits_needing_data for FreeformDesign. This field should match the keys of aux_info, and with the size of these designs, it saves quite a lot of space to not double save them (almost 1 GB for some of my SVB use cases!).

CombinedExperimentDesign updates:

  1. The constructor now takes an additional skip_writing_all_circuits flag. The intention is that in cases where all_circuits_needing_data is simply the union of subdesigns, this can be regenerated on the fly. This saves space on disk and cuts the write/read times for the largest thing being written (cutting my 1 GB down to ~500 MB for my SVB use cases!).

I've done my best to make it backwards-compatible; for example, we can still load old pickle-style IBMQExperiment directories into the new object. However, the IBMQExperiment workflow has to change a little to enable checkpointing - a minor pain point that I hope is well worth it for most users.

Remaining Tasks

After a round of feedback, here's the current list of additional tasks:

rileyjmurray commented 7 months ago

@sserita, I got a ping about reviewing this due to being a code owner. From what I can see this doesn't touch code that I have prior experience with. Maybe a change in my team in the CODEOWNERS file is in order?

coreyostrove commented 7 months ago

Great work, @sserita! I posted a couple of code specific comments above, but had a couple of broader comments to go along with them. I think the main thing that this PR is missing is some unit tests. As far as I can tell we currently have no coverage of the IBMQExperiment in any of our test modules. More and more users are reliant on this code so this would be a good opportunity to at least start to fill that gap by adding some tests for the new checkpointing functionality. Thoughts on this? We obviously don't want to have the runners making a bunch of API calls to IBM's servers as part of testing, but I took a look at the code and at first glance it looks like the transpile method doesn't make any external API calls, so this could be a good candidate for testing. I forget off-hand whether using the simulator backend runs the simulations locally or on their cloud, but if it is locally that might be a good option for extending coverage to things like the submit method.

Relatedly, while we have some coverage on serialization for experiment designs, it looks like presently we don't have have anything for FreeformDesign (we actually only have one test that touched FreeformDesign period, and that is related to a qubit label mapping method), so this could be a good thing to add. Ditto with the new CombinedExperimentDesign serialization option (for skipping all_circuits).

sserita commented 7 months ago

Thanks for the review @coreyostrove! I will go back and make these changes, and add a bunch of unit tests. However, testing IBMQExperiment is a bit of a problem - even the simulator hits the API. And I can make a test for transpile(), but that's really a relatively minor part of the code that could break in this class... I've been avoiding thinking about it because that sounds like non-trivial testing infra to build out, but this is probably one of the better times to consider it... 😅

Edit: Just found this: https://docs.quantum.ibm.com/api/qiskit/providers_fake_provider. IBMQ tests incoming 😄

sserita commented 5 months ago

FYI, this is being delayed until 0.9.13 so that I can also incorporate the new IBMQ session workflow, which will eventually become the default way to run on IBMQ.