OpenNMS / opennms-pagerduty-plugin

OpenNMS <-> PagerDuty

`event_type` should have a value of `resolve` when an outage clears (nodeRegainedService) #5

Closed GanimanSwift closed 4 years ago

GanimanSwift commented 4 years ago

We began testing the PagerDuty plugin and noticed that `event_type` appears to be incorrect when a nodeRegainedService event occurs. The payload is sent with `trigger` instead of `resolve`:

{
  "client": "OpenNMS",
  "client_url": "https://nms03.internal.opennms.com/opennms/alarm/detail.htm?id=864433",
  "description": "\n            The Reboot-Required outage on interface 172.20.42.28 has been\n            cleared. Service is restored.\n        ",
  "event_type": "trigger",
  "incident_key": "uei.opennms.org/nodes/nodeRegainedService::530:172.20.42.28:Reboot-Required",
  "service_key": "5e84cb63be324b8387f08ad49735f935"
}

j-white commented 4 years ago

If a nodeRegainedService is generated and the original nodeLostService has already been cleared/deleted, then a new alarm will be created with type=2 and severity=NORMAL.

We should also consider the alarm type when setting the PD event type: https://github.com/OpenNMS/opennms-pagerduty-plugin/blob/v0.1.2/plugin/src/main/java/org/opennms/integrations/pagerduty/PagerDutyForwarder.java#L201

However, we'll need to guard against null values: https://issues.opennms.org/browse/NMS-12923
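
A minimal sketch of that idea, not the plugin's actual code: the class `EventTypeSketch`, the `Severity` enum, and the method `eventTypeFor` are hypothetical stand-ins for whatever the forwarder and the OpenNMS integration API really expose. It assumes the alarm type arrives as a nullable `Integer` (2 meaning a resolution alarm, as described above) and that a cleared severity should also map to `resolve`.

```java
// Hypothetical stand-in types; the real plugin uses the OpenNMS integration API instead.
class EventTypeSketch {

    // Assumed subset of severities, for illustration only.
    enum Severity { CLEARED, NORMAL, WARNING, MINOR, MAJOR, CRITICAL }

    // Alarm type 2 marks a resolution alarm (e.g. one created by nodeRegainedService).
    static final int RESOLUTION_ALARM_TYPE = 2;

    /**
     * Picks the PagerDuty event_type for an alarm.
     * Both arguments may be null, so each is checked before use.
     */
    static String eventTypeFor(Integer alarmType, Severity severity) {
        // Null-safe check: a missing type is simply treated as "not a resolution".
        if (alarmType != null && alarmType == RESOLUTION_ALARM_TYPE) {
            return "resolve";
        }
        // Enum comparison with == is null-safe as well.
        if (severity == Severity.CLEARED) {
            return "resolve";
        }
        return "trigger";
    }
}
```

With this logic, the alarm from the report (type=2, severity=NORMAL) would yield `resolve`, while an alarm with no type and a non-cleared severity still yields `trigger`.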

j-white commented 4 years ago

Fixed with https://github.com/OpenNMS/opennms-pagerduty-plugin/commit/454ab5c5c2a3dc61ecca6b91625cc29e87de6b1b

The plugin needs to run against a system that includes the patch for NMS-12923 to function properly.