META Improve Logging - Githubissues

moul commented 1 year ago

We've had quite a few chats about improving our logging system. Some things we're thinking about are:

Making it easier to pop in logging or debugging lines in the code, maybe by passing a nested logger to constructors.
Making logs more readable for us humans, like using colors, filters, and clearer formats.
Making sure logs work well for machines too. We're considering two separate outputs: a pretty, color-coded one for humans, and a simple, unfiltered, single-line JSON one for machines.

I'm going to leave this issue open for now. It's not a top priority, but we keep coming back to it, so it's handy to have it flagged with an issue number.

moul commented 1 year ago

@gfanton, once you come across this, could you summarize our solutions at Berty and Weshnet? I believe we had a solid approach there.

In essence, we implemented multiple outputs with separate configurations, encompassing a human-readable stderr, a comprehensive machine-friendly log file with log rotation, an in-memory ring facilitating API log consumption, and a dynamic filter selection enabled by configuration.

For further context, here are a few links: https://github.com/moul/zapfilter, https://github.com/berty/weshnet/tree/main/pkg/logutil.

n0izn0iz commented 1 year ago

Quick experience return: the filtering was a pain for me because the init on mobile was complex and it was hard to make sure the config was properly used IIRC, part of the problem was that there was multiple source of truth for the defaults and the config itself passed by many init routines in different parts of the project

ajnavarro commented 1 year ago

Good and fast library for logging I had been using on my recent projects: https://github.com/rs/zerolog

gfanton commented 1 year ago

I've also had a relatively bad experience with logs on GNO. The overflow of logs, coupled with the inability to disable or redirect them to a file, can be problematic, especially when some logs don't utilize the logger.

@moul In Berty, we use zap. However, given the diverse device/platform contexts (iOS, Android, Desktop, etc.) and multiple application layers (React Native -> iOS/Java -> Golang), we had to make a system that allows multiple stream outputs with distinct configurations for each (e.g., rotation logs without filtering to files, error-only logs for the iOS/Java bridge, an ios only driver using os_log with specific formatting, and so on). Additionally, we have a custom namespace filtering system that @moul created, called moul/zapfilter, which uses pattern matching for real-time filtering. Personally, I'm not a huge fan of this approach. I believe such filtering should be user-driven, either by piping it with another application or through a subsequent search using their preferred editor. As long as we can adjust the verbosity level with separate flags, I'm okay with using this type of filtering.

beside that, I think an essential first step would be to move (with a few exceptions) every fmt.[PF]rint() that prints to stdout/stderr to the actual logger. After that, we can maybe make a transition to a different logger.

Good and fast library for logging I had been using on my recent projects: rs/zerolog

I'm not opposed to zerolog. However, even though zap might be more complex to configure, I believe it's better suited for larger projects that demand complex configurations. Moreover, zap is more widely adopted and, as a result, has broader third-party app support.

zivkovicmilos commented 1 year ago

@moul

I'll pick up this issue since I've been meaning to tackle it for some time

thehowl commented 1 year ago

Some scattered ideas for various places for improvement relating to the GnoVM I've been meaning to write for a while.

Logging

Start using slog as our "official" way to do logging within the Gno codebase.
Have sane defaults for logging, but allow to change the usual things: log level, add coloring, granularly change the verbosity for each component... (I would think through env vars)
Centralise our logging with very specific rules, and an agreed-upon document to use as a source of truth for how to do effective logging in Gno.
Set up pprof as a feature of cmd/gno, and properly document ways to benchmark the gnovm using it.

Error handling

The crucial part here is that I think we need to start creating a "class of errors" generated by pkg/gnolang which are "mundane". This should eventually be almost the only error produced by the GnoVM, but the transition there will be long and in many places we will still need to have the full stacktraces to track down the actual error. The reason is that when programming in Gno, almost every error currently leads to a stacktrace.

Ideas:

Make the stacktraces more compact and useful (improve our signal-to-noise ratio)
- Elide all the frames up to the first panic call (removing stacktraces of any deferred functions)
- Put each frame on one line, compacted, ie:
- panic (go/src/runtime/panic.go:890)
- gnolang.ConvertTo (@/gnovm/pkg/gnolang/values_conversions.go:813)
- Put the preprocess stack and the value of the error LAST
- This is contrary to normal in Go, but I do think it's important that people see the proper error on the last line.
Create a MundaneError interface (not with that name, but with that idea), which does not print a stack trace, and where the cases where it shows up are properly tested and documented -- with the hopes of having this as our main erroring data type within the GnoVM
Make the stack trace and the error message two separate logs on slog, so users can disable/enable them at will.

Debugging tools

Make it so that var debug (pkg/gnolang/debug.gno) is a constant, always set to true, except with build tag nodebug.
- This should allow us to see some performance improvements in the GnoVM with the build tag enabled, and while I think we should still ship with debugging enabled, it should help to give us a better idea of when the if debug calls are actively slowing down code.
- At the same time, make debugging.Printf/Println/Errorf check for if debug (global constant); this way we can remove if debug in most other places, as unless the calls to debug.* have side-effects, they can be optimised out at compile-time.
Change the above debug system to use slog.
- Properly tag with slog each operation, and allow the user granular access to only enable certain logs, or disable the logs when loading a certain package (#1226), or enable timing information.
debug package.

moul commented 11 months ago

Based on a discussion with @piux2, we have identified a solution that addresses both the DevX and security requirements.

Here are the proposed changes:

Instead of using commands.IO, gnoland should use a logger. The main purpose of gnoland is to log information, not to interact with stdin or write raw stdout/stderr. Using a logger is more suitable and allows for clear unit tests without the need for monkey patching.
For other binaries, particularly CLI tools, we should assess whether the logger should be used more extensively in place of commands.IO. Alternatively, we can explore the option of commands.IO incorporating the logger or simply passing both more frequently.
Bonus: The gnoland binary should not have access to os.Stdin. We can use build tags to configure os.Stdin as a noop reader.

gnolang / gno

META Improve Logging #999

Logging

Error handling

Debugging tools