rs / zerolog

Zero Allocation JSON Logger
MIT License
10.4k stars 567 forks source link

Sampled logs used in multiple call point the sampling bucket #543

Open MUlan2004 opened 1 year ago

MUlan2004 commented 1 year ago
sampled := log.Sample(&zerolog.BasicSampler{N: 100})

sampled.Info().Msg("call point1")
...
sampled.Info().Msg("call point2")

In this example both call points will share the sampling bucket and for example call point1 is a lot more verbose, there is high probability call point2 never gets logged as it has less probability of getting sampled. One approach is to declare a separate logger for each call point but this gets unwieldy, especially when we can't declare them close to the call point, for we don't want to instantiate the logger inside the loop as sampler will get reset on each loop iteration.

another approach make sampler call point aware, and use runtime.Caller() to determine where it's being called from and instantiate separate bucket for separate call points, however runtime.Caller() is pretty expensive and we don't want to call it on each log ?

last approach is to use some sort of go:generate magic that instantiate unique vars for each call point, however it's pretty tricky to get this one right

So my question, has zerolog community come across this problem and how did you solve this ?

betamos commented 1 year ago

Ran into this issue as well.

there is high probability call point2 never gets logged as it has less probability of getting sampled

In my case, I have status updates that need to be sampled, and other events that don't need any sampling. Status updates will saturate the sampler so I never see an important event that comes right after (in my case - a close event).

One approach is to declare a separate logger for each call point but this gets unwieldy

This is a functional workaround, but a library author can't know what sampling rate is appropriate in the application.

For reference, zap uses the {level, message} tuple as a sampling key. This allows performant event-dependent sampling.

Unfortunately, zerolog's samplers can only access the level, not the message:

type Sampler interface {
    Sample(lvl Level)
}

Whether this limitation is fundamental or coincidental depends on the library architecture.

abhinav commented 10 months ago

Maintainers, will you accept a change that adds an optional upgrade to Sampler? Something like the following:

We'll add a new interface

type MessageSampler interface {
    Sampler

    SampleMessage(lvl Level, msg string) bool
}

If Logger.Sampler(..) is called with a type satisfying this interface, then Logger will plumb it down to Event, and that will query it when it has the message available.

joeycumines commented 10 months ago

There are a bunch of ways to skin this proverbial cat. It's an interesting challenge, with two parts:

  1. What to rate limit on / how to determine it
  2. How to apply the rate limit

It may not align with zerolog's "it's as fast as possible w/ as few allocations as possible" approach, but I personally tend to use a tuple (well, struct) like:

    // WARNING: Omits PC because it may differ for the same code location in
    // cases where inlining occurs, which is not desirable for rate limiting.
    callerForRateLimiting struct {
        Function string
        File     string
        Entry    uintptr
        Line     int
    }

^ requires knowledge of the parent callers, so would likely need to be baked into the core library. Using a tuple of level + message (like zap) would probably be a sufficient, easier to maintain (and faster) alternative.

For 2, I use an exhaustive (read: slower than alternatives, fairly complex) implementation, that tracks each discrete event (per "category" - whatever is implemented by 1.), within sliding windows, and applies 1-n rate limits within that window, where the allowed rates get progressively lower.

Taking that approach lets me only rate limit when necessary, and lets me log warning(s) if rate limits come into effect. It's quite possible not suitable for all use cases, though. Just adding my 2c - I don't generally use zerolog, myself, though I do like it overall.

It's an over-engineered, toy module, but here's what I use for 2: joeycumines/go-catrate (see godoc)

Details on what I mean by configuring multiple rate limits: https://github.com/joeycumines/go-catrate/blob/69bac4bc3b6943827fe8ac6858080b5900d7e9e3/rates.go#L18