Uninett / Argus

Argus is an alert aggregator for monitoring systems
GNU General Public License v3.0

Support delayed notifications #166

Open lunkwill42 opened 4 years ago

lunkwill42 commented 4 years ago

Argus needs at least two types of delayed notifications (this is definitely post-1.0).

It should be possible to configure a subscription that queues all notifications generated in a timeslot and sends them in bulk, either

  1. when the next timeslot begins,
  2. or at a specific time of the day/week.

In addition, this type of subscription needs an option to drop notifications related to any incident that was closed/resolved before the notification is actually sent (i.e. the "Tell me everything that went wrong last night while I was sleeping, but only the things that are still issues" function).
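A minimal sketch of the "only the things that are still issues" digest, in plain Python rather than Argus code. The names `Incident`, `QueuedNotification` and `build_digest` are hypothetical and only illustrate the filtering rule described above: everything queued up to the send time is collected, and notifications for incidents that were closed before that moment are dropped.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Incident:
    # Hypothetical stand-in for an Argus incident.
    description: str
    end_time: Optional[datetime] = None  # None means the incident is still open

    def is_open_at(self, moment: datetime) -> bool:
        return self.end_time is None or self.end_time > moment


@dataclass
class QueuedNotification:
    incident: Incident
    queued_at: datetime


def build_digest(
    queue: List[QueuedNotification],
    send_time: datetime,
    only_open: bool = True,
) -> List[QueuedNotification]:
    """Return the notifications to send in bulk at send_time."""
    # Only notifications queued up to the send time belong in this digest.
    due = [n for n in queue if n.queued_at <= send_time]
    if only_open:
        # Drop anything whose incident was closed before the digest goes out.
        due = [n for n in due if n.incident.is_open_at(send_time)]
    return due
```

The `send_time` argument covers both triggers from the list above: the caller passes either the start of the next timeslot or the configured time of day/week.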

hmpf commented 4 years ago

Idea: NotificationQueue model, with foreign key to Event. (Potential problem with stateless incidents, they only have start events.)

class NotificationQueue(models.Model):
    event = models.ForeignKey(Event, .., related_name='sent', ..)
    received = models.DateTimeField(default=now, ..)
    profiles = models.ManyToManyField(NotificationProfile, .., related_name="queues")

The queue will only see adds and deletes. On handover to the notification sender, it checks whether the incident is closed; if so, the entry is deleted from the queue immediately.

class NotificationStatus(models.Model):
    event = models.OneToOneField(Event, .., related_name='sent', ..)
    received = models.DateTimeField(blank=True, null=True, ..)
    sent = models.DateTimeField(blank=True, null=True, ..)
    profiles = models.ManyToManyField(NotificationProfile, .., related_name="queues")

After a notification is handed over to the notification sender, NotificationStatus is updated and the entry in NotificationQueue is deleted. (In one transaction.)

Both are done in a separate process.
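The handover step could be sketched roughly as follows. This is an in-memory illustration, not Django code: `QueueEntry`, `StatusEntry` and `hand_over` are hypothetical stand-ins for the two models and the separate process, and the `incident_open` flag is an assumed shortcut for looking up the incident's state.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, List, Optional


@dataclass
class QueueEntry:
    # Stand-in for a NotificationQueue row.
    event_id: int
    incident_open: bool
    received: datetime


@dataclass
class StatusEntry:
    # Stand-in for a NotificationStatus row.
    event_id: int
    received: datetime
    sent: Optional[datetime] = None


def hand_over(
    queue: List[QueueEntry],
    statuses: Dict[int, StatusEntry],
    send: Callable[[int], None],
    now: datetime,
) -> None:
    """Process every queued entry once: drop closed incidents, send the rest."""
    for entry in list(queue):  # iterate over a copy, since entries are deleted
        if not entry.incident_open:
            queue.remove(entry)  # closed before sending: no notification
            continue
        send(entry.event_id)
        # With a real database, these two writes would share one transaction.
        statuses[entry.event_id] = StatusEntry(entry.event_id, entry.received, sent=now)
        queue.remove(entry)
```

In a Django implementation the status update and the queue delete would presumably be wrapped in `transaction.atomic()`, so that a crash between the two cannot leave an entry both sent and still queued.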

katsel commented 3 years ago

From today's CNaaS-sync-meeting:

katsel commented 3 years ago

May also touch on #209. After "pausing" a notification profile, the user may want a digest of what happened.

Also, I am not sure how common recurring incidents are (where all attributes are equal except the timestamp), but we may want to collapse these incidents into one, just adding a number and time frame.

So instead of

9:00 Incident X
9:07 Incident X
9:10 Incident X
...

user gets

Incident X happened 20 times between 9 and 12.
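A rough sketch of how such collapsing might work, assuming incidents can be reduced to `(timestamp, description)` pairs and that an equal description stands in for "all attributes are equal except the timestamp". The helpers `collapse` and `format_line` are hypothetical names, not part of Argus.

```python
from datetime import datetime


def collapse(incidents):
    """Group (timestamp, description) pairs by description.

    Returns (description, count, first, last) tuples, ordered by the first
    appearance of each description.
    """
    groups = {}  # dicts keep insertion order, so first appearance is preserved
    for timestamp, description in incidents:
        if description in groups:
            count, first, last = groups[description]
            groups[description] = (
                count + 1,
                min(first, timestamp),
                max(last, timestamp),
            )
        else:
            groups[description] = (1, timestamp, timestamp)
    return [(d, c, f, l) for d, (c, f, l) in groups.items()]


def format_line(description, count, first, last):
    """Render one digest line; single occurrences keep the plain format."""
    if count == 1:
        return f"{first:%H:%M} {description}"
    return (
        f"{description} happened {count} times "
        f"between {first:%H:%M} and {last:%H:%M}."
    )
```

Incidents that occur only once keep their original one-line form, so the digest only changes shape where there is actual repetition.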

lunkwill42 commented 2 years ago

This is relevant as a sub-part of #121

hmpf commented 2 years ago

Depends on #359