Open andrewthad opened 3 years ago
My company is looking into ways to make our pipeline more efficient and "compression" via same-message aggregation is a major tool.
Being able to preserve and view aggregated message counts makes log data in ES more accurate.
As Stephen Brown points out, ES has a metadata field for correct computation of pre-aggregated data, _doc_count.
If we could aggregate messages upstream, mark them with their repetition amounts via ECS, and convert those to _doc_count values on ingest, we'd get to compress our cake and eat it too.
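To make the idea concrete, here is a rough sketch of what that flow might look like. It assumes the field ends up being called event.count (only a suggestion in this thread, not an existing ECS field), and the helper names aggregate and to_es_doc are made up for illustration. In a real deployment the second step would more likely live in an ingest pipeline than in Python; this only shows the shape of the data.

```python
from collections import Counter


def aggregate(messages):
    """Collapse identical messages into one document each, with a repeat count."""
    counts = Counter(messages)
    # "event.count" is the field name suggested in this thread,
    # not a field that exists in ECS today.
    return [{"message": msg, "event": {"count": n}} for msg, n in counts.items()]


def to_es_doc(doc):
    """Copy the hypothetical event.count into Elasticsearch's _doc_count field."""
    out = dict(doc)
    # _doc_count must be a positive integer; Counter never yields zero here.
    out["_doc_count"] = doc["event"]["count"]
    return out


if __name__ == "__main__":
    raw = ["link down", "link up", "link down"]
    for d in aggregate(raw):
        print(to_es_doc(d))
```

With _doc_count set this way, bucket aggregations should report the original number of messages rather than the number of stored documents.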
As for naming, the event.* field set seems appropriate. Probably just event.count would do.
Summary
Add fields for counting repeated or related events. This is not a concrete proposal. I'm just dumping information here in the hope that, over time, others may come up with other examples, and a pattern may show itself.
Motivation:
In several firewalls, proxies, and load balancers that I've worked with (different vendors too), there is a notion of "how many times did event X happen?" Here are a few examples:
CEF:0|A10| ... cnt=3327 src=192.0.2.33 dst=192.0.2.5 act=drop
I believe that in this case, cnt is the number of ICMP packets from the same source to the same destination.
The countips field means "Number of the IPS logs associated with the session", which is a little bit different. Maybe it should not use the same field as the others. It's also got countweb and countapp and several other count fields. So, these are not really repeat counts like they are for PA and A10.
To my recollection, the notion of suppressing repeats and providing a counter of how many times the same thing happened shows up in log aggregation software like rsyslog (open source) and LogRhythm (paid). It's been a while since I've worked with either of those tools though, so I cannot provide an example, and I could be mistaken.
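For what it's worth, pulling the repeat count out of the A10 CEF example above could look something like the sketch below. It only handles the simple case where extension values contain no spaces or escaped characters, which real CEF does allow, and the function names are made up for illustration. Defaulting to 1 when cnt is absent is my assumption, so that un-aggregated events still count once.

```python
import re


def cef_extension_fields(extension):
    """Parse simple space-separated key=value pairs from a CEF extension string.

    Simplification: assumes values contain no spaces or escaped '=' characters.
    """
    return dict(re.findall(r"(\w+)=(\S+)", extension))


def repeat_count(extension, default=1):
    """Return the cnt value as an int, or the default if cnt is missing."""
    fields = cef_extension_fields(extension)
    return int(fields.get("cnt", default))


if __name__ == "__main__":
    ext = "cnt=3327 src=192.0.2.33 dst=192.0.2.5 act=drop"
    print(repeat_count(ext))
```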