AG-Schumann / Doberman

Distributed supervisory control, data acquisition, and monitoring software for small- to medium-size experiments
MIT License

Managing pipeline problems #214

Open adambrown1 opened 1 year ago

adambrown1 commented 1 year ago

I think we should add a way to handle problems that occur when pipelines are controlling critical systems. For example, if a pipeline controls the LN2 refill and no new level value arrives for a long time, we should probably stop filling to avoid disaster (and maybe issue an alarm). One idea would be to add the following functionality (but maybe someone has better ideas):

  1. An error state for pipelines. Amongst other things, the pipeline goes into this state if a node raises an exception when executing
  2. An optional timeout setting for source nodes, after which data is considered stale -> this puts the pipeline in an error state
  3. A default state for control nodes. They get put in this state whenever the pipeline is in error (and maybe also when the pipeline is stopped)
  4. A pipeline-controlled alarm node, to issue an alarm depending on a pipeline variable or whenever the pipeline is in error (configurable). This would also be useful for example to issue an alarm when the refilling has stopped working because we have run out of LN2, which the pipeline also detects. Could also have alarms when LN2 valves have been open for an unreasonable amount of time.
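
A minimal sketch of how points 1 to 3 could fit together. Everything here is hypothetical and only for illustration: `PipelineState`, `SourceNode`, `ControlNode`, `Pipeline`, and the `timeout_s` and `default_output` options are made-up names, not Doberman's existing classes or config keys. The sketch also assumes that the pipeline leaves the error state on its own once fresh data arrives, which is a design choice that would need discussion.

```python
import time
from enum import Enum


class PipelineState(Enum):
    ACTIVE = "active"
    ERROR = "error"
    STOPPED = "stopped"


class SourceNode:
    """Hypothetical source node that remembers when it last got a value."""

    def __init__(self, name, timeout_s, clock=time.monotonic):
        self.name = name
        self.timeout_s = timeout_s  # assumed config option (point 2)
        self.clock = clock
        self.value = None
        self.last_update = None

    def update(self, value):
        self.value = value
        self.last_update = self.clock()

    def is_stale(self):
        # No value yet counts as stale, so control nodes stay in default.
        if self.last_update is None:
            return True
        return self.clock() - self.last_update > self.timeout_s


class ControlNode:
    """Hypothetical control node with a configurable default output (point 3)."""

    def __init__(self, name, default_output):
        self.name = name
        self.default_output = default_output
        self.output = default_output

    def set_output(self, value):
        self.output = value

    def go_to_default(self):
        self.output = self.default_output


class Pipeline:
    def __init__(self, sources, controls, logic):
        self.sources = sources    # dict: name -> SourceNode
        self.controls = controls  # dict: name -> ControlNode
        self.logic = logic        # callable: source values -> control outputs
        self.state = PipelineState.ACTIVE

    def _fail_safe(self, state):
        self.state = state
        for control in self.controls.values():
            control.go_to_default()

    def cycle(self):
        if self.state is PipelineState.STOPPED:
            return
        # Point 2: stale input data puts the pipeline in the error state.
        if any(s.is_stale() for s in self.sources.values()):
            self._fail_safe(PipelineState.ERROR)
            return
        # Point 1: an exception in a node also puts it in the error state.
        try:
            outputs = self.logic({n: s.value for n, s in self.sources.items()})
        except Exception:
            self._fail_safe(PipelineState.ERROR)
            return
        self.state = PipelineState.ACTIVE
        for name, value in outputs.items():
            self.controls[name].set_output(value)

    def stop(self):
        # Point 3: control nodes also go to default when the pipeline stops.
        self._fail_safe(PipelineState.STOPPED)
```

For the LN2 example, the fill valve would be a `ControlNode` with `default_output=0` (closed), so a missing level reading or a crashing node closes the valve instead of leaving it in its last state.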

jarongrigat commented 1 year ago

I think it's a nice idea to put a fail-safe option in control nodes for when they don't receive new data for too long. For the alarms, I'm not so sure. A pipeline goes 'stale' when it doesn't receive new data from one of the input sensors -> put an alarm on the sensor. You want to know if you ran out of nitrogen -> put an alarm on the scale. We can discuss offline.

adambrown1 commented 1 year ago

The fail-safe option should also activate when a node raises an exception; then I agree that your solution solves points 1 & 2. It should also be used when the pipeline is stopped -> solves point 3.

On point 4, this is fair enough for the dewar level, but what about the alarm when an LN2 valve has been open for too long?
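
To make the valve case concrete, here is a rough sketch of the check a pipeline-side alarm would need. `OpenTooLongMonitor` and `max_open_s` are hypothetical names, not part of Doberman, and how the result would be turned into an actual alarm is left open.

```python
import time


class OpenTooLongMonitor:
    """Hypothetical helper: flags a valve that stays open longer than allowed."""

    def __init__(self, max_open_s, clock=time.monotonic):
        self.max_open_s = max_open_s  # assumed config option
        self.clock = clock
        self.opened_at = None

    def check(self, valve_open):
        """Return True if the valve has been continuously open for too long."""
        if not valve_open:
            # Closing the valve resets the timer.
            self.opened_at = None
            return False
        if self.opened_at is None:
            self.opened_at = self.clock()
        return self.clock() - self.opened_at > self.max_open_s
```

The point is that the open duration is a quantity only the pipeline knows, since it depends on the history of the control output and not on a single sensor reading.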