matrix-org / waterfall

A cascading stream forwarding unit for scalable, distributed voice and video conferencing over Matrix
Apache License 2.0

Implement efficient structured logging #50

Open daniel-abramov opened 1 year ago

daniel-abramov commented 1 year ago

We already use a framework that supports structured logging, but we currently log in different formats in different places. We must unify them in order to get structured, Grafana-compatible logging.

daniel-abramov commented 1 year ago

FWIW: I've also noticed that with logrus, log timestamps are not always monotonically increasing. I.e. several goroutines logging via logrus may produce the following sequence of log entries in the file:

18:35:00
18:35:01
18:35:02
18:35:01
18:35:03
18:35:02

Which is not very convenient.

daniel-abramov commented 1 year ago

After investigating several logging frameworks while working on https://github.com/matrix-org/waterfall/issues/141, it seems the best option would be zerolog (it's faster and a bit more ergonomic than zap), although we would need to implement a hook for event handling in order to produce OpenTelemetry span events when e.g. info logs are logged. For more info, see the updated description of https://github.com/matrix-org/waterfall/issues/141

speatzle commented 1 year ago

> After investigating several logging frameworks and while working on #141 it seems like the best option would be to use zerolog (it's faster and a bit more ergonomic than zap)

Have you looked at slog? slog is the new "standard" Go logging package, which has been proposed for inclusion in the standard library. One of its main goals is performance.

I have been using slog in my personal and company projects over the past couple of months and am very pleased with it.

You can find the design doc here.

For now it is available in golang.org/x/exp/slog, but since the proposal was accepted by rsc about a day ago, it should be in the standard library in the next Go release.

daniel-abramov commented 1 year ago

There are two slog packages. The "official" one is the package that you linked; I was not aware of it, as I had evaluated a different slog package.

The ideas and design goals of slog look great! Perhaps we could use it instead of zerolog by implementing our own log handler (hopefully it won't be too cumbersome) in order to trigger OpenTelemetry span events when certain things are logged. There is a very young experimental integration with OpenTelemetry though. My only concern with slog is that it's a very young project that is not battle-tested yet and is still part of the experimental package. Not to mention that its primary development seems to be happening outside of GitHub at the moment (so it would not be particularly easy to submit PRs and link them to our issues, should we need to submit any changes).

Using something that is future-proof and could become part of the standard library is a compelling argument though.