Closed philipstarkey closed 3 years ago
As an example of the performance improvements, I compiled an old experiment from Monash. Compilation time improved from about 0.37s to 0.27s (27% decrease) as measured by time.time()
statements (excluding imports).
Obviously disabling hg info is a pain, so I recommend testing on top of #72. You should see similar performance increases (ignoring the first compilation that produces the VCS cache, see details in #72).
I did some timing tests with this PR on my machine and some of the results are below (Click the arrows to expand). Interestingly it looks like the time-critical sections of the code are different for us than for the Monash sequence, so this PR doesn't speed up the total compilation time much. That said, make_timeseries()
did speed up by ~15%, expand_timeseries()
sped up by ~40%, and generate_clock()
sped up by ~25%. Since it seems that these speedups can make a noticeable difference for other users, I think it would be worth merging this PR. I haven't had a chance to look at the edits in detail yet though so I'll do that before merging.
For reference, I did all of these tests using PR #72 as suggested above. The results were measured with time.perf_counter()
and printed to console. The functions collect_change_times()
, generate_clock()
, and generate_code()
were each called only once. The functions make_timeseries()
and expand_timeseries()
are each called in their own for
loops, and the total duration of each of those for
loops was printed. Note that some of these functions are called by other functions, so the durations of some are included in the durations of others. The total compilation time was printed from runmanager.batch_compiler.BatchProcessor.compile()
, but the rest of the print()
statements were in labscript.py
.
Note: For my own reference, these are the timings for compiling a 575 ms 11-stage Raman cooling sequence with pre-release ramp/hold.
Just finished going through all the changes in detail and everything looks good! After reading through the code I'm convinced that all of the created arrays have the same data, datatype, and shape as before. As a sanity check, I also spent some time looking at a sequence compiled with/without the changes in this PR. The waveforms were identical everywhere I looked, even when zooming in quite far. In short, it looks to me like this PR increase performance without any changes in behavior, as desired. I think it's ready to be merged.
I've tested this against a very simple compilation on my test rig and have also found that it doesn't really speed up the overall compilation. It doesn't break anything either though, so it is likely fine. If desired, I could test on a more complicated experiment in the lab next week, but I don't think that will be necessary. I'd say go ahead and merge.
Agreed, thanks for taking a look at this
Makes sense that shorter experiments would not see the performance improvements given they mostly related to array length calculation and manipulation.
It's also difficult to compare absolute compilation times since we all have different machines. My tests were done on an i7-8700K CPU (my dev machine at home). That's generally more powerful than laptops or whatever our university IT department would provide. Writing to h5 file may also dominate if the PC doesn't have an SSD. Not sure what you're testing on but it's possible the Monash sequence would take ~1s to compile on your machines vs ~.3s on mine.
Or maybe it's something else.
Does suggest maybe we should create a standardised really complex example script we could all use for benchmarking!
Thanks for testing!
These small changes seem to improve compilation time significantly. Tests (done with saving of hg info disabled) indicate compilation time for a complex experiment is decreased somewhere between 25-33%.