apple / foundationdb

FoundationDB - the open source, distributed, transactional key-value store
https://apple.github.io/foundationdb/
Apache License 2.0

Document Severity-Levels #1516

Open mpilman opened 5 years ago

mpilman commented 5 years ago

The severity-levels are declared here:

https://github.com/apple/foundationdb/blob/648fc8ec7c3c9f55e6c552f7b281db3c4acc40b4/flow/Trace.h#L45

However, there doesn't seem to be any explanation of what they actually mean. I usually like to define them in terms of production reporting. For example, this is how I usually think about them:

SevError: Should never happen, or something happened that might impact the availability of the cluster. Needs immediate attention (for example, page people).
SevWarnAlways: Something bad happened that might need human attention (for example, a failed disk), but the cluster should be able to survive for another 12 hours or so. Create a ticket.
SevWarn: Something happened that might cause the cluster to run suboptimally, but it won't be actionable in the short term.
SevInfo: Everything else that is useful to have in production.
SevDebug: Everything that might be useful for testing but shouldn't be logged in production.

I assume that other people think about these levels differently.

I think at the very least there should be a clear definition, as a comment, of how these levels should be used. Otherwise contributors won't be able to use these Traces in a consistent way.
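To make the request concrete, here is a rough sketch of what such a comment could look like, using the meanings proposed above. This is not the actual content of flow/Trace.h. The numeric values are assumptions (only 40 for SevError and 10 for SevInfo are hinted at elsewhere in this thread) and should be checked against the linked commit. `suggestedAction` is a hypothetical helper added only to illustrate the production-reporting idea.

```cpp
#include <string>

// Hypothetical documented severity enum. Numeric values are assumed, not
// copied from flow/Trace.h; verify them against the real header.
enum Severity {
    // Useful for testing, but should not be logged in production.
    SevDebug = 5,
    // Everything else that is useful to have in production.
    SevInfo = 10,
    // The cluster may not be running optimally, but nothing is actionable
    // in the short term.
    SevWarn = 20,
    // Something bad happened that may need human attention (for example,
    // a failed disk), but the cluster should survive for hours.
    SevWarnAlways = 30,
    // Should never happen, or cluster availability may be impacted.
    SevError = 40
};

// Hypothetical helper: maps a severity to the operational response
// proposed in this issue (page, ticket, or nothing).
inline std::string suggestedAction(Severity sev) {
    switch (sev) {
    case SevError:
        return "page";
    case SevWarnAlways:
        return "ticket";
    default:
        return "none";
    }
}
```

With definitions like these, a contributor choosing a severity for a new trace event only has to ask which operational response the event should trigger.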

ajbeamon commented 5 years ago

As you probably remember, there is some discussion here that talks about what the severity levels currently mean:

https://forums.foundationdb.org/t/why-is-a-sev-40-if-ilistener-accept-throws-an-error/663

I agree that these should be more formally documented. Roughly, they are currently something like:

SevError - a condition that leads to the running process terminating (or maybe just the role terminating, I think there's a small amount of variability in how these are used)
SevWarnAlways - a non-fatal condition for the process that is nonetheless useful to be aware of (i.e. this severity is one that could be monitored)
SevWarn/SevInfo - two priorities for events that we don't expect to be monitored but that are interesting for production
SevDebug - events that don't need to be logged in prod

mpilman commented 5 years ago

I remember the discussion. I still don't agree with you here but this is not the point ;)

I would just like to have a comment in Trace.h that documents this. A forum thread about the severity of listeners running out of file descriptors is probably not the right place to document this.

ajbeamon commented 5 years ago

Sure, mainly I'm adding this information here for the benefit of whoever does the documenting.

TSP-wengle commented 2 years ago

Hi guys, by default FDB only prints logs at level SevInfo and above. How do I enable logs at level SevDebug?

sfc-gh-mpilman commented 2 years ago

You can pass --knob-min-trace-severity=1 or set this knob through a network option (for client-side logging). We currently don't officially support changing the log level, because setting it to anything larger than 10 will cause weird problems (this is, IMHO, a bug). Setting it to something smaller than 10 shouldn't cause issues, apart from a small performance regression due to more logging.
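As a rough illustration of how a minimum-severity threshold behaves, here is a minimal sketch. It is not FDB's implementation. `TraceConfig` and `shouldLog` are hypothetical names, the default of 10 is taken from the comment above, and the inclusive comparison is an assumption.

```cpp
// Hypothetical stand-in for the knob set via --knob-min-trace-severity.
struct TraceConfig {
    int minTraceSeverity = 10; // assumed default, per the comment above
};

// Returns true if an event with the given numeric severity would be written.
// Assumes events at exactly the threshold are kept.
inline bool shouldLog(const TraceConfig& cfg, int severity) {
    return severity >= cfg.minTraceSeverity;
}
```

Under these assumptions, an event with a severity below 10 is dropped with the default configuration and kept once the threshold is lowered to 1, at the cost of more log volume.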