ISISComputingGroup / IBEX

Top level repository for IBEX stories

[EPIC]: alerting on block ranges #4870

Open Tom-Willemsen opened 5 years ago

Tom-Willemsen commented 5 years ago

As an instrument scientist I would like to be able to set up my own customized alerts for blocks being out of range.

Note: Some of the server logic for this may have been implemented in https://github.com/ISISComputingGroup/IBEX/issues/3949 , although that ticket is not fully complete at the time of writing.

This ticket is to provide an interface in the GUI, probably similar to the existing run control interface, so that scientists can configure this themselves without needing to do manual caput commands from a terminal. We should also consider whether, like run control, these settings can be set both as part of the configuration and then temporarily overridden.

Requested by MERLIN specifically, as they want to be able to configure alerts for their collimator without needing to ask us to reconfigure Nagios. It was, apparently, also a feature of SECI.

See also https://github.com/ISISComputingGroup/IBEX/issues/3811

EPICS FOR

Acceptance

FreddieAkeroyd commented 4 years ago

The alert mechanism would work in very much the same way as run control, and the underlying structure is already in place from #3949. It is currently possible to trigger alerts when values go out of range, but this requires setting various variables in the background. This ticket adds a GUI to make alerts easier to configure:

  1. Add a “view alert settings” option to the GUI, which would look very similar to the current “view run control settings”. Like run control, alerts have low/high/enable settings.
  2. Add an extra entry to the “edit/configure block” section of editing a configuration. This would look very similar to the existing run control section of this dialog but apply to alerts.
  3. Add a separate dialog box to allow the email addresses and phone numbers of who to notify to be entered. It is probably best to have a list of names with an “enable” check box so that details do not need to be re-entered, just a box checked or unchecked. We may need to make this dialog box visible in manager mode only; I am not sure how often users are added. To comply with GDPR we may have to make sure we do not retain details longer than necessary, so if users can be added then we may need to make them expire, e.g. when you add a name an optional expiry date can be given. The users' details could be stored in a new MySQL table on the instrument. Currently, however, they would need to be written to PVs for the alerts to work, so they could just be autosaved.
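The enable check box and optional expiry date in item 3 could be sketched roughly as below. This is only an illustration: `AlertContact`, `active_contacts` and `purge_expired` are hypothetical names, not existing IBEX code, and it assumes that a contact with no expiry date never expires.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class AlertContact:
    """Hypothetical record for one person who can be notified."""
    name: str
    email: str
    enabled: bool = True            # the "enable" check box
    expires: Optional[date] = None  # optional expiry date; None = no expiry


def _not_expired(contact: AlertContact, today: date) -> bool:
    # Assumption: details remain valid up to and including the expiry date.
    return contact.expires is None or today <= contact.expires


def active_contacts(contacts: List[AlertContact], today: date) -> List[AlertContact]:
    """Contacts that should currently receive alerts."""
    return [c for c in contacts if c.enabled and _not_expired(c, today)]


def purge_expired(contacts: List[AlertContact], today: date) -> List[AlertContact]:
    """Drop expired contacts entirely so their details are not retained."""
    return [c for c in contacts if _not_expired(c, today)]
```

Keeping disabled contacts in the list (but purging expired ones) matches the idea that details should not need re-entering, while still not being kept longer than necessary.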

I am proposing we add a couple of extra options to the alert/run control logic. These would apply to both run control and alerts:

  • number of seconds a value must be out of range before the out of range alert/waiting is triggered
  • number of seconds a value must be back in range before the in range alert/running is triggered

These values would help even out fluctuations, but also make it easy to add alerts like “instrument has been in state YYY for ZZZ seconds” without us needing to add a special case. We would add well known special cases, like instrument has been waiting for YYY seconds, to the page where phone numbers/emails are configured.
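As a rough sketch of how the two proposed delays might behave, the class below only changes state once a value has stayed out of range (or back in range) for the configured number of seconds. `RangeAlert` and its field names are hypothetical and are not taken from the existing run control or #3949 implementation; timestamps are passed in explicitly to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RangeAlert:
    """Hypothetical range check with separate out-of-range/in-range delays."""
    low: float
    high: float
    out_delay: float = 0.0  # seconds out of range before the alert triggers
    in_delay: float = 0.0   # seconds back in range before the alert clears
    alerting: bool = False
    _pending_since: Optional[float] = None  # when the pending change began

    def update(self, value: float, now: float) -> Optional[str]:
        """Feed a new reading; return an event name on a state change."""
        in_range = self.low <= value <= self.high
        # A change is pending if we are alerting but back in range,
        # or not alerting but out of range.
        wants_change = in_range == self.alerting
        if not wants_change:
            self._pending_since = None  # fluctuation ended, restart the timer
            return None
        if self._pending_since is None:
            self._pending_since = now
        delay = self.in_delay if self.alerting else self.out_delay
        if now - self._pending_since >= delay:
            self.alerting = not self.alerting
            self._pending_since = None
            return "out_of_range" if self.alerting else "in_range"
        return None
```

With `out_delay=5`, a value that dips out of range for four seconds and then recovers produces no event, which is the smoothing effect described above. A "has been in state YYY for ZZZ seconds" alert is then just a range check on the state with `out_delay` set to ZZZ.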

When an alert is triggered, or things come back into range, this will be recorded on the server side. It may also be useful to log the fact that an alert has been triggered (but not to whom) somewhere on the client side, for example for when somebody is testing the mechanism.

ChrisM-S commented 4 years ago

It might be possible to associate the user alert information with a (single?) fixed location in experiment details. When an experiment changes and these are reloaded for the next group, this would then also change and the users would also automatically be refreshed. Is it essential for alerts to be otherwise linked with a specific user?

FreddieAkeroyd commented 4 years ago

I am now not sure how much users will want to have alerts. In the first instance we are going to support scientists only, to mirror SECI. As there will be no on-site users in September, alerts to users may be even less needed. Putting the details on the user page and clearing them when the RB number changes is certainly one option.