Open faxm0dem opened 1 year ago
30s worth of those logs might be huge. What happens in this case is that syslog-ng stores all messages that relate to the same key() in a single context.
You could probably get away with smaller timeouts.
Some additional information: this might not be a very good fit for grouping-by(), as it forms the "groups" in memory. All messages that share the same key within the 30s window are stored in memory until a trigger criterion is hit, and that never happens if a continuous stream of messages keeps pushing the timeout out.
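For illustration, a minimal sketch of the configuration shape being discussed, with only a key and an inactivity timeout and no trigger(); the key and the aggregated value are assumptions, not taken from the issue:

```
# Illustrative only: grouping-by() with a key and an inactivity timeout but
# no trigger(). Every message sharing the same ${HOST} is appended to one
# in-memory context, and the 30s timeout restarts on each new message, so a
# continuous stream keeps the context open and growing without bound.
parser p_group_demo {
    grouping-by(
        key("${HOST}")
        aggregate(
            value("MESSAGE" "$(context-length) messages from ${HOST}")
        )
        timeout(30)
    );
};
```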
grouping-by() was originally designed to process events that span multiple messages but are more limited in time, e.g. multiple messages from an SMTP transaction that we would like to turn into a single message.
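As a hedged sketch of that original design (the trigger pattern is an assumption, and ${QUEUE_ID} presumes an earlier parser has extracted the transaction id):

```
# Sketch: collect the messages of one SMTP transaction and emit a single
# summary when the closing message arrives. trigger() fires the aggregation
# immediately, so the context stays short-lived and small.
parser p_smtp_transaction {
    grouping-by(
        key("${QUEUE_ID}")
        trigger(message("removed"))   # assumed closing-message pattern
        aggregate(
            value("MESSAGE" "transaction ${QUEUE_ID}: $(context-length) messages")
        )
        timeout(60)                   # safety net if the trigger never arrives
    );
};
```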
An easy solution here would be to add a max-messages / max-time / max-context-size limit, which would trigger the aggregation and close the context, thereby freeing its memory.
Would that help your use-case? How often do you want to emit these aggregations?
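To make the proposal concrete, this is roughly what such a limit could look like in configuration; note that max-messages() is only the option being proposed in this thread, it does not exist in syslog-ng 4.1:

```
# HYPOTHETICAL: max-messages() is the proposed limit, NOT an existing
# syslog-ng option. The idea: once a context holds N messages, fire the
# aggregation and close the context, bounding memory use per key.
parser p_group_bounded {
    grouping-by(
        key("${HOST}")
        aggregate(
            value("MESSAGE" "$(context-length) messages from ${HOST}")
        )
        timeout(30)
        max-messages(10000)   # proposed, not yet implemented
    );
};
```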
The way we solve this in Riemann, for instance, is to store not the whole context but only the running result of a (partial) consolidation function: min/max/count/moving average. I understand this would involve quite a rewrite. I think what you propose would suit this use-case well: a max-memory-size or max-context-size. The downside would be that we get more output, but then I guess we can chain the output through another grouping-by(), since min/max/count compose cleanly across partial aggregates (avg does too, provided the partial stages carry sum and count rather than the average itself).
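A hedged sketch of that chaining idea, using counts, which compose by summation; the PARTIAL_COUNT name, keys, and timeouts are illustrative, not from the issue:

```
# Stage 1: close a context after 30s of inactivity and emit one partial
# record per key, carrying the per-window message count.
parser p_stage1 {
    grouping-by(
        key("${HOST}")
        aggregate(
            value("PARTIAL_COUNT" "$(context-length)")
            value("MESSAGE" "partial: $(context-length) messages from ${HOST}")
        )
        timeout(30)
    );
};

# Stage 2: correlate the partial records over a longer horizon. This stage
# only ever holds the small partial records, not the raw messages.
# $(context-values ${PARTIAL_COUNT}) expands to the list of per-window
# counts; summing that list yields the total for the whole horizon.
parser p_stage2 {
    grouping-by(
        key("${HOST}")
        aggregate(
            value("MESSAGE" "windows for ${HOST}: $(context-values ${PARTIAL_COUNT})")
        )
        timeout(300)
    );
};
```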
Version of syslog-ng
4.1.1
Platform
RHEL8
Issue
Failure
We're trying to use grouping-by() for a new use-case: simply aggregating a certain number of logfiles. When running, syslog-ng eats up all RAM and gets OOM-killed by Linux.
Steps to reproduce
Configuration
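The actual configuration was attached to the issue and is not reproduced here; below is an illustrative sketch of the class of setup described (a wildcard file source feeding grouping-by()), in which every path, key, and timeout is an assumption:

```
# NOT the reporter's configuration; an assumed sketch of the setup class.
source s_files {
    wildcard-file(
        base-dir("/var/log/myapp")        # assumed path
        filename-pattern("*.log")
    );
};

parser p_aggregate {
    grouping-by(
        key("${PROGRAM}")                 # assumed key
        aggregate(
            value("MESSAGE" "$(context-length) messages from ${PROGRAM}")
        )
        timeout(30)
    );
};

log {
    source(s_files);
    parser(p_aggregate);
    destination { file("/var/log/aggregated.log"); };  # assumed destination
};
```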
Input file
Output logs