sandialabs / pyGSTi

A Python implementation of Gate Set Tomography
http://www.pygsti.info
Apache License 2.0

Faster unit tests/notebooks #380

Closed sserita closed 3 months ago

sserita commented 7 months ago

**Is your feature request related to a problem? Please describe.**
The unit and notebook regression tests take too long.

**Describe the solution you'd like**
Faster unit tests. This could be achieved by using different inputs or options that don't affect coverage but do affect runtime. Simple examples for GST-related tests include using lower germ lengths, smaller gate sets, etc. This was already done for the test/test_packages/algorithms tests in #372, but we should do this across the whole codebase.

**Task list**
Below are task lists made by running `pytest` with `--durations 0` and taking the most expensive tests (a sketch of the command is given after the lists).

Everything in test/unit around 5s or above:

Everything in test/test_packages around/over a minute:

Everything in jupyter_notebooks around/over a minute:
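For reference, per-test durations like these come from pytest's built-in duration report; an invocation along the following lines (test paths illustrative) reproduces them:

```bash
# Report the duration of every test (0 = no cutoff), slowest first.
# Point pytest at whichever suite you want to profile.
pytest test/unit --durations 0
pytest test/test_packages --durations 0
```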

coreyostrove commented 7 months ago

Thanks for compiling this, @sserita! Regarding the contents of test_packages, I anticipate a few of these being addressed as part of #373. The changes on that branch broke some of the unit tests in that module, so since I was already in the code fixing them, I decided to streamline some of them at the same time.

sserita commented 7 months ago

Great! Whenever #373 is ready, I'll go through and check off whichever ones you've already gotten to.

coreyostrove commented 7 months ago

Question regarding the runtimes above: what system and pyGSTi version were these timings collected on? I.e., was this done locally on your system (presumably an Apple Silicon MacBook) or on the GitHub runners? I ask because I noticed some of the runtimes were significantly longer than I recalled seeing on my system (a Windows laptop with a Xeon W-10855M processor) following the updates in #372.

For example, looking at the second test in test/unit (test_gaugeopt.py::LGSTGaugeOptSPAMPenaltyTester::test_gaugeopt_no_target), this takes just 8.14s on my system, so ~16x faster (it is indeed still one of the longer tests, so that much is consistent). The M-series MacBooks are pretty spiffy, so while I wouldn't expect exactly the same behavior (I actually would've expected it to be faster than my system), 16x is a bigger gap than I would have expected.

I am running this on the bugfix-stricter-line-label-enforcement branch, but as far as I recall I haven't made any changes to the tests in question since #372.

sserita commented 7 months ago

The test/unit timings were collected on my MacBook Pro; the other two were run on Dynobird. I did run them with the `--dist loadscope` flag, so that could be a source of the timing difference if you are running them serially, since each test then gets more cores to itself.

Edit: I just ran this without `--dist loadscope` and got 4.6s for that test instead. Maybe I will rerun these without the distributed flag for more consistent timings and update the tasks.
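For concreteness, the two kinds of runs being compared look roughly like this (worker count illustrative; the distributed run assumes the pytest-xdist plugin is installed):

```bash
# Serial run: a single worker, so each test has the whole machine to itself
# (e.g. OpenMP-backed linear algebra can use all cores).
pytest test/unit --durations 0

# Distributed run: pytest-xdist spreads tests across 4 workers;
# --dist loadscope keeps tests from the same module/class on one worker.
pytest test/unit -n 4 --dist loadscope --durations 0
```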

coreyostrove commented 7 months ago

I was running these on a single thread, so I didn't have any dist flags set. But we do use that flag on the runners (and probably should, based on what I understand about it), so maybe I should too? If I recall correctly, I had previously decided that flag is more relevant for the multi-threaded notebook regressions than for the standard unit tests: the notebook regression tests really need all of the cells in a notebook to run on a single worker, whereas the standard unit tests should generally be more robust in this regard.

Edit: I just re-ran the tests on my machine with multiple threads and the `--dist loadscope` flag set, and it did not meaningfully change the timings. Maybe there is something weird going on with pytest-xdist on Macs?

sserita commented 7 months ago

Yea, agreed, something seems wonky - or maybe I was running something else in the background at the same time. Timings are updated - do those seem more reasonable across the board? These runs were without `--dist loadscope` and on Dynobird now.

coreyostrove commented 7 months ago

I just spot checked a few of the updated timings and these seem much more reasonable.

coreyostrove commented 7 months ago

One thing that might be worth looking into while trying to speed up the testing process is our current list of dependencies installed via the testing flag. A good chunk of the overall CI time goes to setting up the runners and installing packages, so removing any unneeded dependencies would help here.
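As a sketch of how one might audit this, assuming the test dependencies live in an extra named `testing` (the exact extra name and its contents are defined in pyGSTi's packaging metadata):

```bash
# Compare a minimal install against the full testing extra to see
# exactly which packages the extra pulls in ('testing' assumed above).
pip install -e . && pip freeze > base.txt
pip install -e ".[testing]" && pip freeze > with-testing.txt
diff base.txt with-testing.txt
```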

coreyostrove commented 5 months ago

With #373 merged in, I re-ran the unit tests to get updated timings and see which bottlenecks were still unresolved. Note that I cheated a bit and re-ran these on songthrush (dynobird and quail were in use), so these aren't a perfect 1-for-1 comparison, but they should hopefully be close enough. I ran the tests on a single thread (though some of the linear algebra was clearly utilizing OpenMP, based on htop); a sketch for pinning threads follows the lists below.

Everything in test/unit around 5s or above:

Everything in test/test_packages around/over 5s (note: the previous list was over a minute):

Everything in jupyter_notebooks around/over 45 seconds:
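For anyone reproducing these timings, pinning the thread counts makes single-threaded runs more comparable across machines (a sketch; which variables matter depends on the OpenMP/BLAS backend in use):

```bash
# Pin OpenMP and common BLAS backends to one thread so linear-algebra-heavy
# tests don't silently parallelize and skew the per-test timings.
OMP_NUM_THREADS=1 OPENBLAS_NUM_THREADS=1 MKL_NUM_THREADS=1 \
    pytest test/unit --durations 0
```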

I have started taking a crack at some of these on a new branch called `feature-faster-testing-stuff`, and will check things off as I make progress.

sserita commented 3 months ago

This is mostly completed with the merge of #403.