vectordotdev / vector

A high-performance observability data pipeline.
https://vector.dev
Mozilla Public License 2.0
17.85k stars 1.58k forks source link

Consistently rate limit internal logs #14424

Open jszwedko opened 2 years ago

jszwedko commented 2 years ago

Follow-up to https://github.com/vectordotdev/vector/issues/13185

While implementing https://github.com/vectordotdev/vector/issues/13185 it was clear there is still some ambiguities about how we do rate limiting of internal logs: which levels should have which rate limits.

Current state:

InternalEvent:

log():

We think this would be better as:

InternalEvent:

log():

Open questions:

Suggested behavior of the log() function:

spencergilbert commented 2 years ago
  • Should the rate limit be disabled if log verbosity is DEBUG or higher?

If it's globally applied, would it be easier to just set the LOG_LIMIT_SEC=0?

  • Should we deprecate rate_limit_secs for VRL's log and just use the configured log rate limit (defaulting to 10)

The log() function is regularly used to expose issues with parsing in remap transforms. I feel like having that separately controlled vs Vector's events is valuable.

jszwedko commented 2 years ago
  • Should the rate limit be disabled if log verbosity is DEBUG or higher?

If it's globally applied, would it be easier to just set the LOG_LIMIT_SEC=0?

That's my leaning too: that we just encourage people to disable the rate limit, in addition to increasing verbosity, when they want to see more logs.

  • Should we deprecate rate_limit_secs for VRL's log and just use the configured log rate limit (defaulting to 10)

The log() function is regularly used to expose issues with parsing in remap transforms. I feel like having that separately controlled vs Vector's events is valuable.

Yeah, that's a good point and argument for keeping the parameter 👍

spencergilbert commented 2 years ago

Realistically I don't think we need to keep log()'s default as 1 for compatibility, it shouldn't be a big deal changing that - but it's also not really painful to keep it as-is.

arshiyasolei commented 2 years ago

Currently our InternalEvents do default to 10 when the rate limit flag is set to true at callsite. Regarding the log function:

If new CLI log rate limit option is set, use this as the default instead of 1

One way of implementing this is handling it within the tracing-limit crate, but that will increase code coupling.