With high every/expire values, the aggregator waits far too long before producing metrics.
This is because the random splay it adds is based on the bucket size (the "every" interval).
For example, this rule could take up to 600 seconds before producing metrics:
aggregate ^foo\..+\.bar$
    every 300 seconds
    expire after 301 seconds
    compute sum write to foo.bar
    send to main
    stop
    ;
In most setups, a few seconds of splay should be enough to avoid the "thundering herd of expirations" problem.
A simple solution would be to make the maximum splay configurable.
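
To make the arithmetic concrete, here is a rough Python model of the delay. It is not carbon-c-relay code: it assumes the splay is a whole number of seconds drawn uniformly from [0, every), which matches the 600-second figure above but is my reading of the behaviour, not something checked against the source. The names worst_case_delay, pick_splay and splay_cap are hypothetical and only illustrate what a configurable limit could look like.

```python
import random
from typing import Optional


def pick_splay(every: int, splay_cap: Optional[int] = None) -> int:
    """Pick a random splay in whole seconds.

    Assumption: without a cap the splay is uniform in [0, every).
    splay_cap is a hypothetical setting that limits the range.
    """
    limit = every if splay_cap is None else min(every, splay_cap)
    return random.randrange(limit) if limit > 0 else 0


def worst_case_delay(every: int, expire: int,
                     splay_cap: Optional[int] = None) -> int:
    """Longest wait before a bucket is emitted: expire plus the largest splay."""
    limit = every if splay_cap is None else min(every, splay_cap)
    return expire + max(limit - 1, 0)


if __name__ == "__main__":
    # Values from the rule above: every 300 seconds, expire after 301 seconds.
    print(worst_case_delay(300, 301))               # 301 + 299 = 600
    print(worst_case_delay(300, 301, splay_cap=5))  # 301 + 4 = 305
```

Under these assumptions, capping the splay at a few seconds brings the worst case down from 600 to about 305 seconds while still spreading expirations out.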