Closed jodh-intel closed 6 years ago
@jodh-intel PR looks good, and I agree this should be in a new subpackage of virtcontainers. This way, this could be reused accross repos.
Hi @sboeuf - regarding the separate package, what about Kata? If we agree to adopt "enriched errors", we'd want them used everywhere so wouldn't it be simpler to create a completely separate package (https://github.com/kata-containers/errors or https://github.com/kata-containers/utils or something) that all components could vendor to avoid vendoring (a sub-package of) virtcontainers?
Well given the fact that virtcontainers might very likely move under kata-containers/runtime, there is no need to move this to its own repo.
A summary of current thinking...
@markdryan pointed me at https://github.com/pkg/errors. This looks interesting, but although it improves the quality of the errors by adding a stack trace...
error
requires that particular format verbs are used to show the stack trace.I'm after a solution that doesn't change the error interface and which generates "structured" errors.
I have an almost complete implementation whose API is:
package errors
// SetLogger sets the error logger to be used when generating error
// objects.
func SetLogger(logger *logrus.Entry)
// New is equivalent to errors.New().
func New(msg string) error
// Errorf is equivalent to fmt.Errorf().
func Errorf(format string, args ...interface{}) error
// Error takes a "normal" error and converts it into an enriched error
func Error(err error) error
This works in most scenarios but gets tripped up due to the flexible nature of logrus
. Specifically, the following idiom -- where a more specialised logger is created (with extra fields) -- is problematic:
// "register" logger with the errors package
errors.SetLogger(virtLog)
// ...
logger := virtLog.WithField("foo", "bar")
err := doSomething()
if err != nil {
// log the error
logger.WithError(err).Error("doSomething() failed")
// return an enriched error.
//
// BUG: the Error() function is still using the original logger (virtLog)
// which doesn't have the "foo" field.
return errors.Error(err)
}
The ways to resolve this issue (atleast the ones I came up with) aren't particularly palatable:
Allow errors.Error()
, errors.Errorf()
and errors.New()
to accept an optional logger.
It would work, but be error prone to use as easy to forget to pass it.
Change the API so that SetLogger()
becomes something like:
func SetLogger(f func() *logrus.Entry)
That way, the code could be reworked so that direct access to virtLog
would be replaced by a call to a function that returned the logger. But again, you'd need to remember to call it so it would be error-prone.
I had a further think and came up with the following:
// LogError takes a logger which is assumed to contain an error and returns a
// new structured error which includes details of the error call site.
func LogError(entry *logrus.Entry) error
With this, we could then do magic like:
func f() error {
logger := virtLog.WithField("foo", "bar")
err := doSomething()
if err != nil {
// this logs
return LogError(logger.WithError(err))
}
value := doSomethingElse()
if value == "" {
return LogError(logger.Error("doSomethingElse() failed"))
}
return nil
}
The advantages of this approach:
LogError()
can either just return the error or it could also log it normally then return it as an error if desired..WithError()
with the provided logger, atleast the call site information tells you where your bug is!An example of that API:
origErr := errors.New("hello world")
err := LogError(logger.WithError(origErr))
fmt.Println(err)
which generates an error something like:
time="2018-03-06T17:29:13.06774561Z" level=error error="hello world" error-file=/home/james/src/go/test_logrus_as_error2.go error-function=main.testLogError error-line=156 error-time="2018-03-06T17:29:13.067665955Z" name=test_logrus_as_error2 pid=25206 source=error-logger
... or re-formatted with the log-parser:
- count: 1
timedelta: 0
filename: /home/james/tmp/kata-logs/error-logger.txt
line: 1
time: !!timestamp 2018-03-06T17:29:13.06774561Z
pid: 25206
level: error
msg: ""
source: error-logger
name: test_logrus_as_error2
data:
error: hello world
error-file: /home/james/src/go/test_logrus_as_error2.go
error-function: main.testLogError
error-line: "156"
error-time: "2018-03-06T17:29:13.067665955Z"
Any thoughts on the above @sameo, @sboeuf, @grahamwhaley?
Most of that sounds sane to me, I think.
mostly unrelated - query - is the !!
deliberate in the timestamps:
time: !!timestamp 2018-03-06T17:29:13.06774561Z
and if so, what does it mean, and why do they not also appear in the error time?
Yes, LogError()
is logrus-specific. If we ever wanted to move to another logger, I'm assuming we'd find a way to rework that function in similar ways to extract the log fields.
Ignore those timestamps - you get the weird prefix when running kata-log-parser --output-format yaml ...
. See: http://yaml.org/type/timestamp.html.
@jodh-intel I am all for simplicity ! And your simpler API idea
looks good to me. Especially for logging, it's important not to be confused with too many options.
@jodh-intel Need to be closed and moved to https://github.com/kata-containers/runtime/virtcontainers
Context
All components of both Clear Containers and Kata Containers now produce structured logs. This is extremely useful as the log fields provide a lot of context on key stages in the runtime flows.
Current virtcontainers error example
However, errors currently suffer as they are wholly "free-format".
For example, here's an example of a structured log message logged by
cc-runtime
:Or formatted with the log-parser on https://github.com/kata-containers/tests/pull/98 for easier reading:
Notes:
msg
is the value passed tofmt.Errorf()
.source=runtime
is actually wrong - this error was generated by virtcontainers, but it was the runtime that caught the error and logged it (but how would a reader know this?)Error based on this PR
This PR is just an idea for how we could improve our error handling. It adds a new implementation of
Errorf()
so that rather than the above, with this PR you'd get the following:... and here's the log-parser reformatted version of that line:
Notes:
time
) and the other showing the time the error was generated (error-time
).source
now correctly showsvirtcontainers
.Caveats
This PR is just an example. To be generally useful, we'd need this
Errorf()
function (and an equivalent toerrors.New()
) to be available to all components of Clear Containers and Kata Containers. That implies we'd need a new package for this functionality in a separate repo that we'd "vendor in" to all the components.The other issue is how we handle the logger itself. Ideally, we'd find a way to register the logger with the new package and then have
Errorf()
use that logger object. But worst case, we'd need to change all thefmt.Errorf()
anderrors.New()
calls to pass in the logger as an extra arg./cc @sameo, @sboeuf, @grahamwhaley.