sensu / sensu-go

Simple. Scalable. Multi-cloud monitoring.
https://sensu.io
MIT License

Feature Request: Drop duplicate events #2686

Open roganartu opened 5 years ago

roganartu commented 5 years ago

It would be useful if it were possible to send two identical events to Sensu and have it process only one of them, dropping the other.

Expected Behavior

1. An identical event is submitted to Sensu via the API twice
2. Sensu receives both events
3. Sensu processes one as normal and drops the other

Current Behavior

Sensu currently sends both events through the pipeline, so everything downstream (filtering, handling, alerting) is duplicated.

Possible Solution

Maybe add a deduplication key to the Event object. PagerDuty has a similar notion with its dedup_key: https://v2.developer.pagerduty.com/docs/events-api-v2#alert-de-duplication

PagerDuty's dedup_key is slightly different, though, and might better be called a grouping_key, since that is closer to what it actually does. I'm not proposing adding a key to the Event object that incoming events are grouped by (that already exists: https://github.com/sensu/sensu-go/blob/master/backend/eventd/eventd.go#L143); I'm suggesting a true deduplication key, where two events sharing a key are considered identical by Sensu for this purpose.
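For illustration only, here is a rough sketch of what such a field could look like if it were added to the Event type. The field name `DedupKey`/`dedup_key` and its placement are assumptions for this proposal, not anything that exists in sensu-go today:

```go
// Hypothetical sketch only: sensu-go's Event type has no such field today.
type Event struct {
	// ... existing Event fields (Entity, Check, Metrics, Timestamp, ...) ...

	// DedupKey, when non-empty, would mark this event as identical to any
	// other event carrying the same key; the backend would process the first
	// such event and silently drop the rest. This is separate from the
	// existing grouping in backend/eventd, which keys on entity and check.
	DedupKey string `json:"dedup_key,omitempty"`
}
```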

Context

We are running a service that ingests logs from many sources, processes them, and occasionally emits Sensu events based on their content. Making this HA without duplicating events to Sensu requires implementing some kind of quorum mechanism so that only one node ever emits Sensu events. Given an identical log entry, these nodes all emit identical event data, so they could instead derive a deduplication key from the log data (e.g. a hash of the raw log contents) and rely on Sensu's quorum instead of implementing their own.
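As a minimal sketch of how such a producer-side key could be derived (the helper name `dedupKeyFor` and the sample log line are made up; the only assumption is that every node sees byte-identical raw log lines):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// dedupKeyFor derives a deterministic deduplication key from a raw log line,
// so every HA node that ingests the same line produces the same key.
func dedupKeyFor(rawLog []byte) string {
	sum := sha256.Sum256(rawLog)
	return hex.EncodeToString(sum[:])
}

func main() {
	line := []byte("<189>Jan  1 00:00:01 sw01 LINK-3-UPDOWN: Interface Ethernet1/1, changed state to down")
	// Identical on every node that saw this line, so the backend could drop all but one event.
	fmt.Println(dedupKeyFor(line))
}
```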

This kind of functionality is also useful for non-distributed error-recovery use cases. For example, imagine a script that processes some data and emits Sensu events, and that crashes at some point between submitting a Sensu event and permanently marking the data as processed. It is far simpler from a user's perspective for that script to simply reprocess and resubmit the same event than to build in some (probably fragile) logic that checks the Sensu API for an existing event before every submission.

calebhailey commented 4 years ago

@roganartu sorry for the long radio silence here. I hope you're well! Is this still a relevant issue for you? It's certainly an interesting one!

When I read "deduplication" I immediately think of how to solve this via our filtering system, either via an existing capability, or by adding some built-in deduplication capabilities.

Have you considered using an occurrence filter for potentially duplicate events? I ask because I wonder to what degree a "duplicate" event is indistinguishable from two very-similar-but-actually-unique events sent in close succession to one another; these seem like at least related challenges. Rather than dedup the events, would configuring Sensu to only alert on the first (or Nth) occurrence of an event, and then again every N minutes thereafter, help you achieve the desired outcome? I.e. "only alert me the first time I see an event, and then don't alert me again for another 30 minutes." Thoughts?
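For reference, a rough sketch of the kind of occurrence-based filter being suggested here, built with the corev2 types; the filter name is made up, and the "then hourly" expression is one common pattern rather than the only option (the exact expression depends on the check's interval):

```go
package main

import (
	"fmt"

	corev2 "github.com/sensu/sensu-go/api/core/v2"
)

func main() {
	// Allow an event through the pipeline on its first occurrence, and then
	// roughly once per hour thereafter (assuming the check interval is in
	// seconds); everything else is filtered out before handlers run.
	filter := &corev2.EventFilter{
		ObjectMeta: corev2.ObjectMeta{
			Name:      "first_occurrence_then_hourly", // illustrative name
			Namespace: "default",
		},
		Action: "allow",
		Expressions: []string{
			"event.check.occurrences == 1 || event.check.occurrences % (3600 / event.check.interval) == 0",
		},
	}

	fmt.Println(filter.ObjectMeta.Name, filter.Expressions)
}
```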

If that's not a good alternative to deduplication as you've described above, I wonder if you could comment on whether duplicate events would also include a timestamp or if you'd consider two otherwise matching events without a timestamp as duplicates (the latter resulting in the backend setting unique timestamps upon receipt)? Please advise.

Cheers!

roganartu commented 4 years ago

The service that prompted this request has since changed to send events via ES: the service I mentioned in the Context section now sends some data to Elasticsearch, and an active check queries ES and submits Sensu events accordingly. This was done because of the need for component support that we've discussed before, but it coincidentally also provided the desired deduplication via Logstash ingestion: https://www.elastic.co/blog/logstash-lessons-handling-duplicates

To clarify, when I used the term duplicate above I meant identical down to every single property of the event, including timestamps. My vision when I proposed this was that only events submitted with a non-empty dedup_key field would be considered for dedup, and the given value would simply be compared against an LRU cache of seen dedup keys, with no other backend intelligence applied. This is only possible when the external service submitting events can derive some unique hash (for a concrete example, this was originally for the first iteration of our network device syslog ingester, so the hash would trivially have been the SHA of the raw syslog message the event was derived from).
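A minimal sketch of the backend-side check being described, assuming a simple bounded cache (a real implementation would presumably want a proper LRU or TTL cache, and would need to think about behaviour across backend restarts and cluster members; the names here are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

// seenKeys is a bounded cache of recently seen dedup keys. Eviction here is
// FIFO for brevity; the idea described above assumes a proper LRU.
type seenKeys struct {
	mu       sync.Mutex
	capacity int
	order    []string
	seen     map[string]struct{}
}

func newSeenKeys(capacity int) *seenKeys {
	return &seenKeys{capacity: capacity, seen: make(map[string]struct{})}
}

// shouldDrop reports whether an event carrying this dedup key has already
// been seen (and should therefore be dropped), recording the key otherwise.
// Events with an empty key are never deduplicated, matching the behaviour
// described above.
func (s *seenKeys) shouldDrop(dedupKey string) bool {
	if dedupKey == "" {
		return false
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.seen[dedupKey]; ok {
		return true
	}
	s.seen[dedupKey] = struct{}{}
	s.order = append(s.order, dedupKey)
	if len(s.order) > s.capacity {
		oldest := s.order[0]
		s.order = s.order[1:]
		delete(s.seen, oldest)
	}
	return false
}

func main() {
	cache := newSeenKeys(10000)
	key := "sha256-of-the-raw-syslog-message" // e.g. derived as in the earlier sketch
	fmt.Println(cache.shouldDrop(key))        // false: first submission is processed
	fmt.Println(cache.shouldDrop(key))        // true: duplicate is dropped
	fmt.Println(cache.shouldDrop(""))         // false: no dedup key, never deduplicated
}
```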

I accept that this is a rather advanced feature and may not make sense to implement any time soon, especially since we no longer have a need for it.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.