Control the error reporting verbosity with `--verbosity` flags

CMCDragonkai commented 2 years ago

Specification

The --format=human or --format=json now controls the STDERR reporting of exceptions. See: MatrixAI/Polykey#323.

Examples of the new error logging style:

# Single error (no cause)
# --format=human (default)
ErrorStatusLocked: Status is locked by another process
  exitCode      75
  timestamp     Mon May 23 2022 15:25:08 GMT+1000 (Australian Eastern Standard Time)
# --format=json
{"type":"ErrorStatusLocked","data":{"message":"","timestamp":"2022-05-23T05:00:32.049Z","data":{},"stack":"ErrorStatusLocked\n    at constructor_.start (/home/emma/Projects/js-polykey/src/status/Status.ts:53:13)\n    at async /home/emma/Projects/js-polykey/node_modules/@matrixai/async-init/src/StartStop.ts:56:24\n    at async withF (/home/emma/Projects/js-polykey/node_modules/@matrixai/resources/src/utils.ts:24:12)\n    at async Object.bootstrapState (/home/emma/Projects/js-polykey/src/bootstrap/utils.ts:73:5)\n    at async /home/emma/Projects/js-polykey/src/bin/bootstrap/CommandBootstrap.ts:24:31\n    at async CommandBootstrap.<anonymous> (/home/emma/Projects/js-polykey/src/bin/CommandPolykey.ts:75:7)\n    at async CommandPolykey.parseAsync (/home/emma/Projects/js-polykey/node_modules/commander/lib/command.js:923:5)\n    at async main (/home/emma/Projects/js-polykey/src/bin/polykey.ts:54:5)\n    at async /home/emma/Projects/js-polykey/src/bin/polykey.ts:103:5","exitCode":75}}

# Remote error with cause
# --format=human (default)
ErrorPolykeyRemote: Remote error from RPC call
  command       nodesPing
  nodeId        vasrqbpb1u51khi7ibg5j63ai3odomhj6js348k1ra5jffjuvrhsg
  host  127.0.0.1
  port  37595
  timestamp     Mon May 23 2022 15:25:21 GMT+1000 (Australian Eastern Standard Time)
  remote error: ErrorNodeGraphEmptyDatabase: NodeGraph database was empty
    exitCode    64
    timestamp   Mon May 23 2022 15:25:21 GMT+1000 (Australian Eastern Standard Time)
# --format=json
{"type":"ErrorPolykeyRemote","data":{"message":"","timestamp":"2022-05-23T04:56:47.661Z","data":{},"cause":{"type":"ErrorNodeGraphEmptyDatabase","data":{"message":"","timestamp":"2022-05-23T04:56:47.648Z","data":{},"stack":"ErrorNodeGraphEmptyDatabase\n    at constructor_.getClosestGlobalNodes (/home/emma/Projects/js-polykey/src/nodes/NodeConnectionManager.ts:456:13)\n    at async constructor_.findNode (/home/emma/Projects/js-polykey/src/nodes/NodeConnectionManager.ts:373:17)\n    at async NodeManager.pingNode (/home/emma/Projects/js-polykey/src/nodes/NodeManager.ts:58:7)\n    at async Object.nodesPing (/home/emma/Projects/js-polykey/src/client/service/nodesPing.ts:48:22)","exitCode":64}},"stack":"ErrorPolykeyRemote\n    at toError (/home/emma/Projects/js-polykey/src/grpc/utils/utils.ts:202:29)\n    at Object.callback (/home/emma/Projects/js-polykey/src/grpc/utils/utils.ts:380:17)\n    at Object.onReceiveStatus (/home/emma/Projects/js-polykey/node_modules/@grpc/grpc-js/src/client.ts:351:26)\n    at /home/emma/Projects/js-polykey/node_modules/@grpc/grpc-js/src/call-stream.ts:206:27\n    at Object.onReceiveStatus (/home/emma/Projects/js-polykey/src/grpc/utils/FlowCountInterceptor.ts:57:13)\n    at InterceptingListenerImpl.onReceiveStatus (/home/emma/Projects/js-polykey/node_modules/@grpc/grpc-js/src/call-stream.ts:202:19)\n    at /home/emma/Projects/js-polykey/node_modules/@grpc/grpc-js/src/call-stream.ts:206:27\n    at Object.onReceiveStatus (/home/emma/Projects/js-polykey/node_modules/@grpc/grpc-js/src/client-interceptors.ts:163:5)\n    at InterceptingListenerImpl.onReceiveStatus (/home/emma/Projects/js-polykey/node_modules/@grpc/grpc-js/src/call-stream.ts:202:19)\n    at Object.onReceiveStatus (/home/emma/Projects/js-polykey/node_modules/@grpc/grpc-js/src/client-interceptors.ts:462:34)","exitCode":64,"metadata":{"nodeId":{"type":"IdInternal","data":[87,55,165,229,97,241,67,72,200,242,92,11,51,13,82,30,27,139,70,102,159,6,68,80,59,81,102,247,207,223,220,121]},"host":"127.0.0.1","port":37595,"command":"nodesPing"}}}

Normal logger messages remain the same (i.e. are not affected by the format option). This can be addressed later in MatrixAI/js-logger#3.

However the verbosity of the report should be controlled with our --verbosity flag.

For --format=json because we expect this to be processed by machines, we can say that --verbose doesn't affect it, and instead all information is produced.

However for --format=human we can progressively show more information.

Right now without a verbosity flag, our logging level is set to show warnings and up. With the -v we get info logs, and -vv we get debug logs. This means we can think of having 3 levels of error reporting.

The default level without a -v should be assumed to be used by an end-user of the CLI program, who will probably never look at the source code of PK. Which means, any error we report should only really have the error description and error message. The error name can be useful, if they want to report the error to the dev who can identify the exception.

With -v, extra information about the error should be shown, this should probably mean all the properties except the stack property and this applies recursively to the cause chain.

Finally with -vv we should show the stack property as well.

This should apply to exceptions reported on the src/bin/polykey.ts and src/bin/polykey-agent.ts. In the case of PolykeyAgent, this depends on pk agent start -vv, verbosity flags passed from pk agent start subcommand.

Additional context

MatrixAI/Polykey#323
Should verbosity flags also control json format? I would argue not atm, but I could be convinced otherwise. Perhaps some additional work on structured logging could help decide https://github.com/MatrixAI/js-logger/issues/3

Tasks

[ ] - Parameterise the current error formatter in outputFormatter. Use the verbosity level. It should be a number.
[ ] - Apply the verbosity flags to CLI subcommands
[ ] - Apply the verbosity flags to pk agent start
[ ] - Add tests to ensure that verbosity flags are being respected
[ ] - Document the output formatter, and what default means, and -v means and -vv means.

CMCDragonkai commented 2 years ago

Default verbosity should show us at least 2 things:

The error class name - this helps devs reference the error
Description
Message (if it exists)
Optionally important metadata - properties like data and others

It is essential that the metadata should be mostly optional. That is, we should endeavour to describe what the error is in the message with all the relevant properties embedded in the message.

The exact output can be something like:

<ErrorName>: <Description> - <Message>
  prop: value

Ideally in most cases the prop: value is not shown at all. Such as in the case of ErrorPolykeyRemote, the message property should have all the relevant information stored in it. That way we could see something like:

ErrorPolykeyRemote: Remote error from RPC call - Pinging remote node `vasrqbpb1u51khi7ibg5j63ai3odomhj6js348k1ra5jffjuvrhsg@127.0.0.1:37595` with `nodesPing` failed

I think having the error class name up front might be bad UX for non-technical users. Most examples I see in the wild has the "code" later.

Remote error from RPC call - Pinging remote node vasrqbpb1u51khi7ibg5j63ai3odomhj6js348k1ra5jffjuvrhsg@127.0.0.1:37595 with `nodesPing` failed - #ErrorPolykeyRemote

// or

Remote error from RPC call: Pinging remote node vasrqbpb1u51khi7ibg5j63ai3odomhj6js348k1ra5jffjuvrhsg@127.0.0.1:37595 with `nodesPing` failed - #ErrorPolykeyRemote

Note sure yet. It's probably a good idea to have a short version of the node id too.

Commands will need to interpret why it is failing, and provide a custom message (think of this like "localisation") for that specific problem, and then a default message in case it's not one of the expected errors. I believe we deal with this just by creating another exception, and then rethrow it with exception override pattern.

CMCDragonkai commented 2 years ago

Structured logging is available now, we can make use of it to achieve this. As well as potentially incorporating https://github.com/davidmarkclements/fast-redact for redacting structured information.

CMCDragonkai commented 2 years ago

JSON structured logging is now being integrated with MatrixAI/Polykey#421.

Atm, still no verbosity control is applied to exception/error reporting.

At the same time, the reporting of errors is different from the logger. The logger isn't used for reporting exception error information to STDERR.

We may look into standardising the reporting of both errors and logs, so that reporting of root exceptions goes through the same logger system. Right now non-root exceptions are being reported with logger.error, however it doesn't do any special handling.

At the same time, I want to investigate tracing https://github.com/MatrixAI/js-logger/issues/15, if tracing wraps around logger, then our child loggers are replaced with "child spans". These spans then may add additional context around exceptions that get reported.

So while exceptions themselves have a low-level stack trace regarding the function call graph, our tracing spans provides a more "global" overview from object contexts to distributed system RPC.

So any unification of error logging and structured logging and human readable logging should be designed together.

CMCDragonkai commented 1 year ago

Moving to Polykey-CLI.

CMCDragonkai commented 9 months ago

The Logger supports dynamic setFilter.

Add a new command line option to the PK CLI so that one can provide a filter expression (regexp) that will be set to the root logger to control it.

Note that this can affect 2 things:

The client program
- Applies only to the CLI instance
The agent program
- At the bootup - pk bootstrap and pk agent start (but you have to separate filters for the agent, vs the main program)

If you want to dynamically control the filter on the agent program - that can be done, but it requires a client service handler that has access to pkAgent.logger and set the filter there. Remember that's stateful. You could also dynamically set the level too.

CMCDragonkai commented 2 weeks ago

Solving the verbosity problem also helps obviate #107. See the comment there: https://github.com/MatrixAI/Polykey-CLI/issues/107#issuecomment-2304453813.

I'm also kind of thinking inverted messages is not a good idea because it won't match with when the verbosity increases.

CMCDragonkai commented 1 week ago

There are primarily 3 kinds of verbosity here:

Verbosity of Exception/Error Reporting that goes to STDERR

Verbosity of Log Messages that goes to STDERR

Verbosity of Status Updates that also goes to STDERR

Previously I understood 2. and 3. to be the same thing... but they are slightly different. the PK CLI outside of the agent start operation doesn't tend to use logger at all. But actually if it did use logger, its messages would be classified under 3.

Verbosity should not affect STDOUT output/result of a command. That should also be based on the output formatter.

165

MatrixAI / Polykey-CLI