paritytech / substrate

Substrate: The platform for blockchain innovators
Apache License 2.0
8.39k stars 2.65k forks source link

Logging improvements #14760

Open nazar-pc opened 1 year ago

nazar-pc commented 1 year ago

This is something that bothered me for a long time. Defaulting logs to local time is confusing when debugging issues happening on machines all around the world.

This PR primarily introduces --use-utc-log-time CLI argument that reverts time format to what tracing-subscriber uses by default. I didn't want to change the default, but arguably UTC should have been the default all along.

While at it I also added support for NO_COLOR, though better approach might be to remove --disable-log-color altogether and use supports-color that automatically takes this environment variable into account (see https://github.com/tokio-rs/tracing/issues/2388 and https://github.com/tokio-rs/tracing/issues/2214).

Fixes https://github.com/paritytech/substrate/issues/11734

koute commented 1 year ago

Can you paste an example log before and after?

I don't remember, did we include the timezone in the timestamps? I think rather than adding an optional argument that almost no one will use (so it won't help with debugging in general, and it can even make it worse because now you won't know if someone's using local time or UTC) it'd probably be better to just have the timezone in the logs, and then you'd have the best of both worlds - people would see their local timezone by default as it is now, but you'd be able to easily transform the logs into UTC.

Also, I don't think we want to box the timer and use chrono for the timestamps. IIRC I introduced the FastLocalTime for a reason, and this can actually be a significant CPU hog if trace logs are enabled.

nazar-pc commented 1 year ago

Before:

2023-08-14 12:47:24 Substrate Node    
2023-08-14 12:47:24 ✌️  version 4.0.0-dev-ea403836c0f    

After:

2023-08-14T09:47:24.260478Z Substrate Node    
2023-08-14T09:47:24.260820Z ✌️  version 4.0.0-dev-ea403836c0f    

I think rather than adding an optional argument that almost no one will use (so it won't help with debugging in general, and it can even make it worse because now you won't know if someone's using local time or UTC) it'd probably be better to just have the timezone in the logs, and then you'd have the best of both worlds - people would see their local timezone by default as it is now, but you'd be able to easily transform the logs into UTC.

That is the point, I don't care what their timezone is as long as logs are always in UTC. And I certainly don't want to write scripts that extract timezones from logs and convert them all back to UTC so logs from different places and apps are comparable.

The plan was to make this flag default to true in out node software and we will likely even hide it an argument altogether, so that it is not customizable and is always UTC (just like other pieces of software users run alongside it).

Also in our case node is not just binary, but also a library, which when combined with other libraries makes logs look confusing with different timezones interleaved. We really want just UTC everywhere all the time.

Also, I don't think we want to box the timer and use chrono for the timestamps. IIRC I introduced the FastLocalTime for a reason, and this can actually be a significant CPU hog if trace logs are enabled.

Interesting, do you have a benchmark for this? It is hard to see how a single trait object would cause massive issues, but I guess everything is possible.

koute commented 1 year ago

That is the point, I don't care what their timezone is as long as logs are always in UTC. And I certainly don't want to write scripts that extract timezones from logs and convert them all back to UTC so logs from different places and apps are comparable.

The plan was to make this flag default to true in out node software and we will likely even hide it an argument altogether, so that it is not customizable and is always UTC (just like other pieces of software users run alongside it).

Also in our case node is not just binary, but also a library, which when combined with other libraries makes logs look confusing with different timezones interleaved. We really want just UTC everywhere all the time.

If we're going to fix the issue of "it's hard to correlate logs together because of different timestamp timezones" I'd rather we go all the way and fix it for everyone, not just for your particular use case. And having a default implicit timezone, whether it's local time or UTC, is not helpful if everyone doesn't use the same timezone.

I'd be fine with adding this flag to force UTC timezones on the condition that we just change the timestamp format to always include the timezone regardless of how the logger is configured. So then it'll be always unambiguous which timezone is used, while simultaneously letting you go UTC-only if you want.

Interesting, do you have a benchmark for this? It is hard to see how a single trait object would cause massive issues, but I guess everything is possible.

It was a long time ago so I don't remember the details, but I vaguely remember that it made a difference. (Not necessarily due to boxing but mostly due to formatting overhead, I think? But again, I don't remember the details.)

You should be able to modify FastLocalTime and add a always_use_utc parameter to it or something like that.

nazar-pc commented 1 year ago

If we're going to fix the issue of "it's hard to correlate logs together because of different timestamp timezones" I'd rather we go all the way and fix it for everyone, not just for your particular use case. And having a default implicit timezone, whether it's local time or UTC, is not helpful if everyone doesn't use the same timezone.

I just need it to be the same format as default tracing-subscriber. If that is the case for UTC, but we add timezone to local time - I can introduce the change. If logs from Substrate would add +00:00 at the end instead of Z, it would be better than right now, but still a bit ugly when interleaved with other logs.

I think the high-level issue is that logging is not really configurable from the outside right now. Someone did an attempt in that direction with builder_hook, but builder_hook is always set to no-op closure, so it essentially is not used, and it only allows to configure some aspects, but not everything.

You should be able to modify FastLocalTime and add a always_use_utc parameter to it or something like that.

I can do that, just want to make sure we are on the same page as to what the result should look here first.

koute commented 1 year ago

I just need it to be the same format as default tracing-subscriber. If that is the case for UTC, but we add timezone to local time - I can introduce the change. If logs from Substrate would add +00:00 at the end instead of Z, it would be better than right now, but still a bit ugly when interleaved with other logs.

Ugly or not, practicality trumps aesthetics. Being able to correlate logs regardless of the timezone and their source is a pretty useful feature. (And myself I often too found it painful when debugging 3rd-party logs, so I'd be happy to get it fixed.) As long as the timezone is included I'm fine with whatever default tracing-subscriber uses.

I can do that, just want to make sure we are on the same page as to what the result should look here first.

Sure. Let me just ping @paritytech/sdk-node in case anyone has any objections. If not we can just go ahead with always including the timezone in the timestamps and adding your flag to force it to emit timestamps in UTC.

bkchr commented 1 year ago

If not we can just go ahead with always including the timezone in the timestamps and adding your flag to force it to emit timestamps in UTC.

Fine by me, but can you give an example on how "timezone in the timestamps" looks like?

nazar-pc commented 1 year ago

Fine by me, but can you give an example on how "timezone in the timestamps" looks like?

I feel like what you're asking is to allow customization of the formatting for the date/time, which will be equivalent performance-wise to what this PR suggests for UTC, but apparently not as fast as purpose-built formatting would be. I'm fine with it, but I'm not running trace level logs very often, so maybe performance hit is prohibitive there, no idea.

koute commented 1 year ago

Fine by me, but can you give an example on how "timezone in the timestamps" looks like?

@bkchr This depends on the timestamp format (of which are several infinitely many, but few standard-ish), but customarily you just append a +\d to the end of the timestamp to indicate a timezone, for example, before:

2023-08-14 12:47:24

and after:

2023-08-14 12:47:24+0200

(where +0200 means that it's a plus two hours offset from UTC).