labscript-suite / labscript

The 𝗹𝗮𝗯𝘀𝗰𝗿𝗶𝗽𝘁 library provides a translation from expressive Python code to low-level hardware instructions.
http://labscriptsuite.org
BSD 2-Clause "Simplified" License

Smarter dealing with errors, so not every error causes a stop. #32

Open philipstarkey opened 7 years ago

philipstarkey commented 7 years ago

Original report (archived issue) by Ian B. Spielman (Bitbucket: Ian Spielman, GitHub: ispielma).


Currently BLACS stops on all error conditions. This is bad behavior. Our system requires that it be running constantly to stay in a stable "warm" configuration. So BLACS should not stop submitting shots unless an "end of the world" bad event has occurred.

There are different degrees of badness. For example, in some cases a camera will miss an image for some reason. For this type of error, BLACS should simply ignore the error (or perhaps retry the shot).

BLACS should switch to a "safe" script if too many errors accumulate.
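
A minimal sketch of the behavior requested above, not existing BLACS code. The names `Severity` and `ErrorPolicy`, the returned action strings, and the choice to count consecutive (rather than total) errors are all assumptions made for illustration:

```python
from enum import Enum


class Severity(Enum):
    """Hypothetical severity levels; BLACS does not define these."""
    RECOVERABLE = 1  # e.g. a camera missed an image
    FATAL = 2        # an "end of the world" event


class ErrorPolicy:
    """Decide what the shot queue should do after each shot (hypothetical helper)."""

    def __init__(self, max_consecutive_errors=3):
        # Assumed threshold for "too many errors"; counts consecutive failures.
        self.max_consecutive_errors = max_consecutive_errors
        self.consecutive_errors = 0

    def on_success(self):
        # A successful shot resets the error count.
        self.consecutive_errors = 0
        return "continue"

    def on_error(self, severity):
        # Fatal errors always stop the queue.
        if severity is Severity.FATAL:
            return "stop"
        self.consecutive_errors += 1
        # Too many recoverable errors in a row: fall back to the "safe" script.
        if self.consecutive_errors >= self.max_consecutive_errors:
            return "safe_script"
        return "retry"
```

With the default threshold, two recoverable errors in a row each return `"retry"` and the third returns `"safe_script"`, so the experiment keeps cycling instead of stopping.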

philipstarkey commented 7 years ago

Original comment by Chris Billington (Bitbucket: cbillington, GitHub: chrisjbillington).


It's very hard to tell a very bad error from a not-so-bad one unless the programmer anticipates each one and flags it as such. Still, we could add an option like "if encountering an error, restart the tab in question and retry the shot", followed by something like "if there are still errors, go into repeat mode with a default shot". The default shot would likely be obtained by asking runmanager to submit a shot with all the default values, once the "default values" feature is implemented there.

There could be increasingly aggressive recovery attempts each time one fails - first restart the offending tab. Then restart all tabs. Then do the "reset of hardware" functionality you mentioned in another feature request. Then even if there are errors during transition_to_static, keep running anyway to keep the experiment cycling. If there are persistent errors during transition_to_buffered, and restarting all tabs and doing hardware resets doesn't fix it, then there is no recourse left and BLACS will have to stop.
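
The escalation ladder described above could be sketched roughly as follows. This is an illustration only: `run_with_escalation`, `run_shot`, and the `recovery_steps` list are hypothetical names, not BLACS functions, and the sketch does not model the different treatment of `transition_to_static` and `transition_to_buffered` errors.

```python
import logging

logger = logging.getLogger("recovery_sketch")


def run_with_escalation(run_shot, recovery_steps):
    """Try run_shot(); after each failure apply the next recovery step and retry.

    run_shot: callable that raises an exception on failure (hypothetical).
    recovery_steps: ordered list of (name, callable) pairs, mildest first.
    Returns (succeeded, errors), where errors lists every exception seen.
    """
    errors = []
    # One initial attempt, then one retry after each recovery step.
    for name, recover in [("none", None)] + list(recovery_steps):
        if recover is not None:
            logger.warning("Attempting recovery step: %s", name)
            recover()
        try:
            run_shot()
            return True, errors
        except Exception as exc:
            # Log every failure so it can be reported even if a retry succeeds.
            logger.error("Shot failed (after recovery step %r): %s", name, exc)
            errors.append(exc)
    # Every recovery step was tried and the shot still fails.
    return False, errors
```

A caller might pass steps such as `("restart tab", ...)`, `("restart all tabs", ...)`, and `("reset hardware", ...)`, and stop the queue only when the function returns `False`. The sketch assumes the recovery callables themselves do not raise.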

All the errors would have to be logged, and the GUI would have to prominently display that something went wrong earlier, even if recovery was possible.

philipstarkey commented 7 years ago

Original comment by Philip Starkey (Bitbucket: pstarkey, GitHub: philipstarkey).


Just want to flag a couple of things:
