entropy-lab / entropy

BSD 3-Clause "New" or "Revised" License

[BUG][HIGH priority] InProcessParamStore commit (and probably QMM: open_qm) are very slow #323

Open xueyue-sherry-zhang opened 2 years ago

xueyue-sherry-zhang commented 2 years ago

Describe the bug: The cProfile report shows that committing the param_store takes time on the same order of magnitude as qm.simulate(). The commit time also grows as the history of the param_store gets longer.

Detailed description: I have three graphs to profile. Each contains ten parallel qubit spectroscopy nodes, and the QUA program and config are similar across all three. The major difference is the database structure:

Graph (i): runs on entropylab v0.1.2 and entropylab-qpudb v0.0.11; the database is qpu_db.
Graph (ii): runs on entropylab v0.15.3 with InProcessParamStore. The db file starts fresh, and after the graph runs its size is 444 KB.
Graph (iii): same as (ii), except that the db file already contains many commits and its size is 2.6 MB.

I used cProfile and gprof2dot to visualize the runtime of each function. The key parts look like the following:

Graph (i): (image)
Graph (ii): (image)
Graph (iii): (image)

I'm simulating the same QUA program on the same set of OPXs, so I expect qm.simulate() to take a similar amount of time in all three graphs. What I observe is that from (i) to (ii), the time qmm.open_qm() takes increases from 1/3 of qm.simulate() to about 5 times that, and quam.open_qm() adds an extra overhead of about 1.5 times qm.simulate(). The commit time goes from 2% of the simulation time in (i) to roughly the same as the simulation time in (ii). From (ii) to (iii), the commit time increases further, to more than 10 times the simulation time. In other words, the overhead of running the graph this way is huge and keeps growing.

To Reproduce: I've attached the folders containing the files needed to profile the graphs (test_graph.py in the Graph_i folder and cal_graph.py in the Graph_ii_iii folder): Graph_i.zip, Graph_ii_iii.zip. Graphs (ii) and (iii) share the same folder; to switch between them, change the database file on line 497 of module.py, quam = MyQuAMManager("database/test_quam.db"). The three screenshots above come from output.png (test.pstats) in the Graph_i folder, and from test_qspec_output_clean.png (test_qspec_output_clean.pstats) and test_qspec_output.png (test_qspec_output.pstats) in the Graph_ii_iii folder.
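For anyone regenerating the .pstats files, here is a minimal sketch of the profiling step using only the standard library. run_graph is a hypothetical stand-in for whatever actually runs the graph (it is not part of the attached scripts), and the helper names profile_to_file and top_cumulative are made up for this illustration.

```python
import cProfile
import io
import pstats
import time


def run_graph():
    """Hypothetical stand-in for the code that runs the graph."""
    time.sleep(0.01)
    return sum(i * i for i in range(10_000))


def profile_to_file(func, stats_path):
    """Run func under cProfile and dump the raw stats to stats_path."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func()
    finally:
        profiler.disable()
    profiler.dump_stats(stats_path)
    return result


def top_cumulative(stats_path, limit=10):
    """Return a text report of the entries with the largest cumulative time."""
    buffer = io.StringIO()
    stats = pstats.Stats(stats_path, stream=buffer)
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


if __name__ == "__main__":
    profile_to_file(run_graph, "test.pstats")
    print(top_cumulative("test.pstats"))
    # The call graph image can then be rendered from the shell, typically with:
    #   gprof2dot -f pstats test.pstats | dot -Tpng -o output.png
```

Sorting by cumulative time makes it easy to compare the share taken by the commit, open_qm and simulate calls across the three graphs.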

Expected behavior: qm.simulate() takes the majority of the time when running all three graphs, i.e. there is not much overhead in using entropy to run the experiments.


Additional context: This is high priority because it determines whether we will use the newer version of entropy. With the current overhead, it is not realistic to use InProcessParamStore. We also note that each commit rewrites the entire history in the db file as JSON, which could make it very inefficient.
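To illustrate why rewriting the whole history on every commit would scale badly, here is a toy model. It is not entropy's actual code: the function names and the parameter dictionary are hypothetical, and it only compares the bytes serialized per commit when the full history is dumped each time against an append-only scheme.

```python
import json


def commit_rewrite_all(history, params):
    """Toy model: store the commit, then serialize the entire history again."""
    history.append(dict(params))
    return len(json.dumps(history))  # characters written for this commit


def commit_append_only(history, params):
    """Toy model: store the commit and serialize only the new entry."""
    history.append(dict(params))
    return len(json.dumps(history[-1]))


def bytes_written(commit, n_commits, params):
    """Return the amount written by each of n_commits successive commits."""
    history = []
    return [commit(history, params) for _ in range(n_commits)]


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    params = {"q%d_freq" % i: 5.0e9 + i for i in range(10)}
    rewrite = bytes_written(commit_rewrite_all, 50, params)
    append = bytes_written(commit_append_only, 50, params)
    print("rewrite-all, first vs last commit:", rewrite[0], rewrite[-1])
    print("append-only, first vs last commit:", append[0], append[-1])
```

In this model the per-commit cost of the rewrite-all scheme grows with the number of stored commits, so the total cost over a long session grows quadratically, which would be consistent with the slowdown seen between graphs (ii) and (iii).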