Rate-limiting of log messages

nyh commented 8 years ago

In some cases, once a certain problem occurs, it can occur thousands of times in quick succession and flood the log files with identical (or at least similar) messages. The "solution" of lowering these messages' log level to "trace" is not a real solution, and can hide a very real problem.

The better solution to the shower of messages isn't to hide them but to rate-limit them. Let's add a simple log rate-limiting object which, when used, allows only a given number of messages to be printed per second; further messages are silenced and counted.

For example:

    WARN [shard 0] component - Some message
    WARN [shard 0] component - Some message
    WARN [shard 0] component - Some message
    WARN [shard 0] component - Some message
    WARN [shard 0] component - Some message
    WARN [shard 0] component - Some message (and 12345 similar messages skipped)

The per-second limit should be shared by all messages, from any shard, which use the same rate-limiting object.
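To make the idea concrete, here is a rough sketch of what such an object could look like. It is not the patch mentioned below and not an existing Seastar API: the class name `log_rate_limiter`, the method `try_acquire()`, the fixed one-second window and the use of a `std::mutex` to share state between threads are all assumptions made for illustration (C++17). In this sketch the count of silenced messages is handed back with the next message that is allowed through, so the caller can append the "(and N similar messages skipped)" suffix.

```cpp
#include <chrono>
#include <cstdint>
#include <mutex>
#include <optional>

// Hypothetical helper, not part of Seastar. One instance is meant to be
// shared by every call site (and every thread) that should share a limit.
class log_rate_limiter {
public:
    using clock = std::chrono::steady_clock;

    explicit log_rate_limiter(unsigned max_per_second)
        : _max_per_second(max_per_second) {}

    // Returns std::nullopt if the message must be silenced. Otherwise
    // returns how many messages were silenced since the last one that was
    // printed, so the caller can report that count alongside the message.
    // The time point is a parameter only to make the logic easy to test.
    std::optional<uint64_t> try_acquire(clock::time_point now = clock::now()) {
        std::lock_guard<std::mutex> guard(_mutex);
        // Start a fresh one-second window once the current one has expired.
        if (now - _window_start >= std::chrono::seconds(1)) {
            _window_start = now;
            _printed_in_window = 0;
        }
        // Budget for this window is used up: count the message and drop it.
        if (_printed_in_window >= _max_per_second) {
            ++_skipped;
            return std::nullopt;
        }
        ++_printed_in_window;
        uint64_t skipped = _skipped;
        _skipped = 0;
        return skipped;
    }

private:
    std::mutex _mutex;
    unsigned _max_per_second;
    clock::time_point _window_start{};
    unsigned _printed_in_window = 0;
    uint64_t _skipped = 0;
};
```

A call site would check the result before logging, print the message only when a value is returned, and add the skipped-count suffix when that value is non-zero. A mutex is used here only to keep the sketch short; a real implementation would have to pick a cross-shard sharing mechanism that fits Seastar's threading model.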

Six months ago I sent a patch with an implementation (to the scylladb mailing list, where the logger lived at that time). It should be rebased, and Avi's suggestions implemented:

https://groups.google.com/d/msg/scylladb-dev/M8k0BQXSFRk/VgDRO_FKEQAJ
https://groups.google.com/d/msg/scylladb-dev/M8k0BQXSFRk/22XhlPFKEQAJ

hakuch commented 7 years ago

If you don't have time to focus on this, @nyh, I could probably add it to my todo list.

tgrabiec commented 7 years ago

xemul commented 2 years ago

2214821836e3977cd4011fa43c2ceb8c82b0f0e5

scylladb / seastar

Rate-limiting of log messages #167