Improve shot cycle time in BLACS

philipstarkey commented 7 years ago

Original report (archived issue) by Philip Starkey (Bitbucket: pstarkey, GitHub: pstarkey).

There are probably many places we can improve the cycle time of a shot in BLACS.

One that I have found is the error state checking that occurs during the loop which monitors for the end of the experiment shot. On each iteration, the loop checks whether any devices have restarted mid-shot and aborts if so. This check still occurs after the end of the shot has been detected (to ensure that a device restart was not missed between the last check and the end of the experiment).

However, the check for device restarts currently scales poorly with the number of devices in use in the experiment. It iterates (see code) over each device tab and checks its state. The state check must run in the main thread, which introduces the overhead of posting an event back to the main thread and waiting for the Qt event loop to process it, and this happens once for each device in use.

We could instead request the state for all device tabs in one go, thus only posting a single event back to the main thread.

I would suggest adding a new method to the queue manager:

#!python
    @inmain_decorator(wait_for_return=True)
    def get_many_device_error_states(self, devices):
        # Runs in the main thread: reads every tab's error message in one posted event
        return [device.error_message for device in devices.values()]

Then the check in the loop (see code link above) can become:

#!python
                        for error_state_message in self.get_many_device_error_states(devices_in_use):
                            if error_state_message:
                                restarted = True
                                break
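
To make the scaling argument concrete, here is a minimal, self-contained sketch that counts round trips to a stand-in "main thread". It does not use BLACS or Qt: `MainThreadStub`, `DeviceTabStub`, `any_restarted_per_device`, `any_restarted_batched` and `count_round_trips` are all hypothetical names invented for this illustration, and a plain `queue.Queue` plays the role of the Qt event loop.

```python
import queue
import threading


class MainThreadStub:
    """Hypothetical stand-in for the Qt main thread: runs posted callables one at a time."""

    def __init__(self):
        self._events = queue.Queue()
        self.events_processed = 0
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            func, reply = self._events.get()
            if func is None:  # sentinel posted by stop()
                break
            self.events_processed += 1
            reply.put(func())

    def call_and_wait(self, func):
        """Post func to the stub thread and block until its result comes back."""
        reply = queue.Queue(maxsize=1)
        self._events.put((func, reply))
        return reply.get()

    def stop(self):
        self._events.put((None, None))
        self._thread.join()


class DeviceTabStub:
    """Hypothetical device tab holding only an error message."""

    def __init__(self, error_message=""):
        self.error_message = error_message


def any_restarted_per_device(main, devices):
    # One round trip to the stub main thread per device (stops at the first error).
    for device in devices.values():
        if main.call_and_wait(lambda d=device: d.error_message):
            return True
    return False


def any_restarted_batched(main, devices):
    # A single round trip that collects every device's error message.
    messages = main.call_and_wait(
        lambda: [d.error_message for d in devices.values()]
    )
    return any(messages)


def count_round_trips(check, devices):
    """Run a check against a fresh stub and report (restarted, events posted)."""
    main = MainThreadStub()
    try:
        restarted = check(main, devices)
    finally:
        main.stop()
    return restarted, main.events_processed
```

With ten error-free stub devices, the per-device check posts ten events and the batched check posts one. Whether that difference is measurable in a real BLACS session is a separate question that only timing or profiling can settle.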

philipstarkey commented 7 years ago

Original comment by Philip Starkey (Bitbucket: pstarkey, GitHub: pstarkey).

Edited issue description

philipstarkey commented 7 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).

I'm a little skeptical that there is much contention for the main thread or latency in accessing it. I'm testing with just one DummyPseudoclock device, but I wrapped that check in a for i in range(50): loop and noticed no difference. Profiling will answer the question, though. At some point I'll profile the experiment loop for an actual experiment and we can see where most of the time is wasted and whether further optimisation is worth pursuing.

Ah, actually, the PulseBlaster, for example, does frequent status checks during a run (every ~100 ms, it looks like), which may be creating contention for the main thread. If so, the fix might be to have the PulseBlaster do 1 second's worth of checks per call to the WorkerProcess rather than one check every 100 ms. The PineBlaster does this, for example: its worker process does a blocking readline() on the serial connection, waiting for a 'done' signal, and this times out after 1 second, at which point the worker function returns. Each call to the worker process therefore takes 1 second, so there's not much going on in the GUI process because of it. This could be a better model for the PulseBlaster.
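
A rough sketch of what "1 second's worth of checks per call" could look like inside a worker function is below. It is not the PulseBlaster or PineBlaster worker code: `check_status_for` and its parameters are hypothetical, and `poll_status` is assumed to be any callable that returns True once the run has finished.

```python
import time


def check_status_for(poll_status, duration=1.0, interval=0.1,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll repeatedly inside a single worker call.

    Returns (finished, number_of_checks). The call returns early as soon as
    poll_status() is true, and otherwise gives up once another sleep would
    overrun the time budget, so the GUI process is contacted about once per
    `duration` rather than once per `interval`.
    """
    deadline = clock() + duration
    checks = 0
    while True:
        checks += 1
        if poll_status():
            return True, checks
        if clock() + interval > deadline:
            return False, checks
        sleep(interval)
```

The `sleep` and `clock` arguments exist only so the loop can be exercised with a fake clock. The trade-off is that an abort request has to wait for the current call to return, so the budget should stay short.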

But, Qt is surprisingly fast, and our use of threads is mostly cooperative rather than preemptive. So I'd see what profiling says before bothering to act.
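
One way to start on the profiling is a small helper around the standard library's cProfile, sketched below. `profile_call` and `fake_shot_cycle` are hypothetical names. `fake_shot_cycle` is only a placeholder for whatever part of the experiment loop gets measured. Note that cProfile only records the thread it is enabled in, so the helper would need to be called from the thread that runs the loop being measured.

```python
import cProfile
import io
import pstats
import time


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the slowest call chains appear first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


def fake_shot_cycle():
    # Hypothetical placeholder for one pass through the experiment loop.
    time.sleep(0.01)
    return "done"


if __name__ == "__main__":
    result, report = profile_call(fake_shot_cycle)
    print(report)
```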

labscript-suite-temp / blacs

Improve shot cycle time in BLACS #15