Sensetif / sensetif-app-plugin

The Grafana Application Plugin for the Sensetif platform.
Apache License 2.0

Alarms #78

Open niclash opened 11 months ago

niclash commented 11 months ago

This is an umbrella issue for "everything Alarms".

Description of Alarms

Alarms are the monitoring solution for streaming data.

For each arriving data sample (a TsDatapoint on Topics.TIMESERIES), 0..n condition scripts can be defined, each identified by a ScriptUri. A script may return an "alarm" condition, which is forwarded to the Topics.ALARMS topic.

For the condition script to create an alarm condition, it must return an object with the following required members:

    let condition = {
        type: 'alarm',
        state: 'on',
        name: "SomeAlarmName",
        tsvalue: ts.value,
        message: 'Value exceeding 100.'
    };

ts.value is the value that arrived in the timeseries being monitored.

The condition script MAY also provide a properties map, which is assigned to the Alarm.parameters field and can contain arbitrary data. Typical uses are class, category, location, role, geo, city, address and much more.
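As a rough sketch, a condition script that also supplies the optional map could look like the following. The wrapper function `evaluate`, the threshold of 100, the `'off'` state for a cleared condition, the key name `properties` and the sample property values are assumptions for illustration. Only the required member names come from the description above.

```javascript
// Hypothetical wrapper: how Sensetif invokes a condition script is not shown
// in this issue, so the logic lives in a plain function that takes the sample.
function evaluate(ts) {
    const exceeded = ts.value > 100;
    return {
        type: 'alarm',
        state: exceeded ? 'on' : 'off',   // 'off' is an assumed counterpart to 'on'
        name: 'SomeAlarmName',
        tsvalue: ts.value,
        message: exceeded ? 'Value exceeding 100.' : 'Value back within limit.',
        // Optional map (key name assumed); ends up in Alarm.parameters.
        properties: { category: 'electric', location: 'sunnana' }
    };
}

const condition = evaluate({ value: 120 });
```

Keeping the properties as plain strings makes them easy to match later with a ParameterFilter.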

AlarmStateHandler

An alarm is a finite state machine with 7 states, 7 events (a.k.a. transitions) and 7 triggers.

States

| State | Description |
| --- | --- |
| normal | Indicates no problem; the alarm is operating normally. |
| activated | The condition has been triggered. |
| deactivated | The condition has gone away, but the alarm has not been acknowledged. |
| reactivated | The condition has been triggered again without an acknowledgement. That is, the alarm was in the deactivated state and was activated again. |
| acknowledged | A user has acknowledged that the alarm has been observed. When the alarm condition goes away, the state returns to normal instead of deactivated. |
| blocked | A user has blocked the alarm from triggering events. This is used to temporarily disable alarms until the problem has been fixed. |
| disabled | A user has permanently disabled the alarm. This puts its priority below normal, whereas blocked keeps the alarm at a higher priority than normal and deactivated. |

Events

| Event | Description |
| --- | --- |
| activation | The alarm condition has become true while the alarm was in the normal state. |
| deactivation | The alarm condition is no longer true, and the alarm was in either the activated or acknowledged state. |
| acknowledgment | A user has acknowledged the alarm condition. This can only happen if the alarm was in the activated, deactivated or reactivated state. |
| block | A user has blocked the alarm. This event is not generated if the alarm was already in the blocked or disabled state. |
| unblock | A user has unblocked the alarm while it was in the blocked state. The new state is always normal. |
| disable | A user has disabled the alarm. |
| enable | A user has enabled the alarm while it was in the disabled state. The new state is always normal. |

Triggers

Triggers are handled internally in Sensetif, but are mentioned here for completeness.

| Trigger | Description |
| --- | --- |
| activate | Alarm condition is present. Will only trigger if the alarm is in the normal or deactivated state. |
| deactivate | Alarm condition is no longer present. Will only trigger if the alarm is in the activated or acknowledged state. |
| acknowledge | User confirms that the alarm has been observed. Will only trigger if the alarm is in the activated or deactivated state. |
| block | Temporarily blocks the alarm from any more events. Will trigger unless the alarm is in the blocked or disabled state. |
| unblock | Removes the block on the alarm and puts it back in the normal state. Will only trigger if the alarm is in the blocked state. |
| disable | Disables the alarm completely. Will trigger unless the alarm is already in the disabled state. |
| enable | Enables the alarm. Will only trigger if the alarm is in the disabled state. |
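The trigger rules above can be read as a transition table. The sketch below is one possible reading, not the actual AlarmStateHandler: the function name `nextState` is hypothetical, and the target state of acknowledge from deactivated (assumed to be normal) is not stated in the tables. It follows the trigger table literally, so reactivated only reacts to block and disable here.

```javascript
const STATES = ['normal', 'activated', 'deactivated', 'reactivated',
                'acknowledged', 'blocked', 'disabled'];

// Triggers that are only valid from specific states: trigger -> { from: to }.
const TRANSITIONS = {
    activate:    { normal: 'activated', deactivated: 'reactivated' },
    deactivate:  { activated: 'deactivated', acknowledged: 'normal' },
    // Assumption: acknowledging an already deactivated alarm returns it to normal.
    acknowledge: { activated: 'acknowledged', deactivated: 'normal' },
    unblock:     { blocked: 'normal' },
    enable:      { disabled: 'normal' }
};

// Returns the new state, or null when the trigger does not fire in this state.
function nextState(state, trigger) {
    if (!STATES.includes(state)) {
        throw new Error('Unknown state: ' + state);
    }
    if (trigger === 'block') {
        return (state === 'blocked' || state === 'disabled') ? null : 'blocked';
    }
    if (trigger === 'disable') {
        return state === 'disabled' ? null : 'disabled';
    }
    const table = TRANSITIONS[trigger];
    if (!table) {
        throw new Error('Unknown trigger: ' + trigger);
    }
    return table[state] ?? null;
}
```

Returning null for a trigger that does not apply keeps the caller free to decide whether to ignore it or log it.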

Alarm Filters

{
  "type": "timerange",
  "name": "range1",
  "start": "00:00:00",
  "end": "23:59:59.999999999",
  "projects": null
}

ParameterFilter

True if the value of alarm.parameters[{name}] matches the regExp.

allowWhenMissing determines the result when the alarm has no parameter with that name. If set to true, this alarm filter is true when the alarm doesn't have the parameter.

{
  "type": "parameter",
  "name": "parameters1",
  "parameter": "location",
  "regExp": "sunnana",
  "allowWhenMissing": false
}
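A minimal sketch of how such a filter could be evaluated is shown below. The function name `parameterFilterMatches` is hypothetical, and whether Sensetif requires the regExp to match the whole value or only part of it is not stated; this sketch assumes a partial match.

```javascript
// `filter` has the JSON shape shown above; `alarm.parameters` is the alarm's
// parameter map (see the properties map returned by the condition script).
function parameterFilterMatches(filter, alarm) {
    const params = alarm.parameters || {};
    if (!Object.prototype.hasOwnProperty.call(params, filter.parameter)) {
        // No such parameter on the alarm: allowWhenMissing decides the outcome.
        return filter.allowWhenMissing === true;
    }
    // Assumption: a partial match is enough (RegExp.test, not a full match).
    return new RegExp(filter.regExp).test(String(params[filter.parameter]));
}

const locationFilter = {
    type: 'parameter',
    name: 'parameters1',
    parameter: 'location',
    regExp: 'sunnana',
    allowWhenMissing: false
};
```

With this filter, an alarm whose location parameter contains "sunnana" passes, and an alarm without a location parameter does not.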

ScriptFilter

The script of a ScriptFilter must return true to indicate that the AlarmFilterGroup should be used.

{
  "type": "script",
  "name": "script1",
  "scriptUri": "js://testproject/myscript?category=electric",
  "scriptExecutor": null
}

AlwaysFilter

Will always return true.

{
  "type": "always",
  "name": "always"
}

AlarmFilterGroup

An AlarmFilterGroup defines a set of mail recipients for alarm events, together with the filters that determine which alarm events those recipients receive.

All filters must be true for the group to receive a given alarm event.

{
  "group": {
    "name": "testgroup",
    "enabled": true,
    "recipients": [
      "niclas@hedhman.org",
      "niclas@bali.io"
    ]
  },
  "filters": [
    {
      "type": "parameter",
      "name": "electric",
      "parameter": "category",
      "regExp": "electricity",
      "allowWhenMissing": false
    },
    {
      "type": "parameter",
      "name": "region",
      "parameter": "region",
      "regExp": "Skåne",
      "allowWhenMissing": false
    }
  ]
}
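The "all filters must be true" rule could be sketched as follows. The function `groupReceives` and the `evaluators` lookup are hypothetical names, the two evaluators are simplified stand-ins for the real filter implementations, and treating a disabled group as never receiving events is an assumption.

```javascript
// Returns true when every filter in the group accepts the alarm.
function groupReceives(groupDef, alarm, evaluators) {
    if (!groupDef.group.enabled) {
        return false; // assumption: a disabled group receives nothing
    }
    return groupDef.filters.every((filter) => {
        const evaluate = evaluators[filter.type];
        if (!evaluate) {
            throw new Error('No evaluator for filter type: ' + filter.type);
        }
        return evaluate(filter, alarm);
    });
}

// Simplified stand-in evaluators, keyed by the filter "type" field.
const evaluators = {
    always: () => true,
    parameter: (f, alarm) => {
        const value = (alarm.parameters || {})[f.parameter];
        return value === undefined
            ? f.allowWhenMissing
            : new RegExp(f.regExp).test(value);
    }
};

const testgroup = {
    group: { name: 'testgroup', enabled: true, recipients: ['niclas@hedhman.org'] },
    filters: [
        { type: 'parameter', name: 'electric', parameter: 'category', regExp: 'electricity', allowWhenMissing: false },
        { type: 'parameter', name: 'region', parameter: 'region', regExp: 'Skåne', allowWhenMissing: false }
    ]
};
```

An alarm with category "electricity" and region "Skåne" reaches the recipients; changing either parameter makes one filter false and the group is skipped.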

Alarm Storage

Alarms are stored in the Cassandra KeyValues table.

| Field | Value | Description |
| --- | --- | --- |
| type | "alarms" | All rows have type set to "alarm". |
| name | condition.name | The name is taken from the condition.name field in the object returned by the condition script (see at the top). |
| value | json | The serialized AlarmImpl object. |

Example of the value field:

{
  "organization": 12,
  "name": "TestAlarm1",
  "alarmClass": "A",
  "condition": false,
  "description": "Test description",
  "counter": 0,
  "state": {
    "name": "normal",
    "description": "Normal state indicates everything is fine.",
    "creationTime": "2023-11-24T11:36:56.107203597Z"
  },
  "labels": [
    "electric",
    "building",
    "unscheduled"
  ],
  "parameters": {
    "high-limit": "50",
    "hysteresis": "5",
    "low-limit": "20"
  }
}

AlarmCommands

Commands can be sent to the alarm system via Pulsar Topics.ALARM_COMMANDS.

The MessageKey must be "1:{orgId}:{user}", where {user} is the logged-in user that issued the command.

The Value on that Pulsar Topic is a serialized AlarmCommand value, with the following possible commands:

| Command | Meaning | Example |
| --- | --- | --- |
| acknowledge | Acknowledge the alarm. | `{ "command": "acknowledge", "args": { "alarm": "MyAlarmName" } }` |
| block | Block the alarm. | `{ "command": "block", "args": { "alarm": "MyAlarmName" } }` |
| unblock | Unblock the alarm. | `{ "command": "unblock", "args": { "alarm": "MyAlarmName" } }` |
| disable | Disable the alarm. | `{ "command": "disable", "args": { "alarm": "MyAlarmName" } }` |
| enable | Enable the alarm. | `{ "command": "enable", "args": { "alarm": "MyAlarmName" } }` |
| setclass | Set the alarm class, one of A, B, C or D. | `{ "command": "setclass", "args": { "alarm": "MyAlarmName", "class": "C" } }` |
| setdescription | Set a human-readable description of the alarm. | `{ "command": "setdescription", "args": { "alarm": "MyAlarmName", "description": "Some new description" } }` |
| addlabel | Add a label to the alarm. | `{ "command": "addlabel", "args": { "alarm": "MyAlarmName", "label": "research" } }` |
| removelabel | Remove an existing label from the alarm. | `{ "command": "removelabel", "args": { "alarm": "MyAlarmName", "label": "research" } }` |
| setparameter | Add a new parameter, or set an existing parameter. | `{ "command": "setparameter", "args": { "alarm": "MyAlarmName", "parameter": { "name": "category", "value": "plumbing"} } }` |
| removeparameter | Remove an existing parameter from the alarm. | `{ "command": "removeparameter", "args": { "alarm": "MyAlarmName", "parameter": "category" } }` |
| updatealarmfiltergroup | Add or edit an AlarmFilterGroup. | `{ "command": "updatealarmfiltergroup", "args": { "name": "MyAlarmFilterGroup", "group": { "name": "niclas", "enabled": true, "recipients": [ "niclas@hedhman.org", "niclas@bali.io" ] }, "filters": [ { "type": "always", "name": "always" } ] } }` |
| removealarmfiltergroup | Remove an AlarmFilterGroup. | `{ "command": "removealarmfiltergroup", "args": { "name": "MyAlarmFilterGroup" } }` |
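A small helper for assembling the key and value of such a message might look like the sketch below. The function name `buildAlarmCommand`, the organization id 12 and the user name are illustrative assumptions, and the value is assumed to be serialized as JSON, as in the examples. Actually publishing to Topics.ALARM_COMMANDS with a Pulsar client is left out.

```javascript
// Builds the MessageKey and serialized AlarmCommand value for one command.
function buildAlarmCommand(orgId, user, command, args) {
    return {
        key: `1:${orgId}:${user}`,               // "1:{orgId}:{user}"
        value: JSON.stringify({ command, args }) // assumed JSON serialization
    };
}

const ack = buildAlarmCommand(12, 'niclas', 'acknowledge', { alarm: 'MyAlarmName' });
const setClass = buildAlarmCommand(12, 'niclas', 'setclass', { alarm: 'MyAlarmName', class: 'C' });
```

Building the key in one place avoids subtle mistakes such as a missing "1:" prefix or swapped organization and user fields.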