duct-framework / logger.timbre

Integrant methods for the Timbre logging library
3 stars 2 forks source link

Exceptions don't render as data #3

Open RickMoynihan opened 7 years ago

RickMoynihan commented 7 years ago

Not sure what you think about this, but currently (for me anyway) exceptions render like this, which seems against the logs as data idea you're trying to pursue:

(duct.logger/log (:duct.logger/timbre system) :info ::event (ex-info "foo" {}))
17-07-27 00:00:54 UnknownHost INFO [user:63] - :user/event
   clojure.core/eval   core.clj: 3105
                 ...                 
      user/eval20333  REPL Input     
clojure.core/ex-info   core.clj: 4617
clojure.lang.ExceptionInfo: foo

Presumably we want it to render as with Throwable->map, though presumably we might also want to pprint them?

weavejester commented 7 years ago

Yes, they probably want to render more how Clojure (from 1.8 onwards) renders them. I wasn't certain how to approach exceptions, so I punted the question of how to deal with them and just used Timbre's defaults.

Pretty printing is a sensible idea, but if we do then how do we delimit events in the log? (Not that the current logging behaviour accounts for that.)

Ideally I'd like logs to be fairly human readable, but also easily queried. Production logs are extensive enough that they'll almost always be queried in some fashion.

RickMoynihan commented 7 years ago

Pretty printing is a sensible idea, but if we do then how do we delimit events in the log? (Not that the current logging behaviour accounts for that.)

Picking up from this and the discussion on #1, I think the big problem here is that we're trying to have our cake and eat it. i.e. the current duct logs aren't a data based format at all. They make an attempt at consistency, but arguably so is any log formatter.

It seems ducts "data based" log format is really just:

<time> <host> <level> <ns-or-package:line> - <event-id> <data>?

vs timbres:

<time> <host> <level> <ns-or-package:line> - <data>*

Both still require parsing by regexes, and the only difference is in the complexity of handling 0 or more data items in timbres format vs one in yours.

Perhaps a pure EDN log entry like this would work better?

[#inst "17-07-27 00:00:54" :info ::event {:data :here}]

{:data :here} could also be an #error (exception). We could also toggle pprinting these log entries on/off with a config option.

Legacy java log messages could be formatted in to something like this:

[#inst "17-07-27 00:00:54" :info :duct.logger/raw-log-event [vargs-here]]

NOTE: this means (s/def ::data (s/or :map map? :legacy-vector vector?)

Duct could still provide its helper functions to target the :event-id data format, these helpers would probably need to set a ::duct-format value in to vargs so we can dispatch the duct specific renderer, rather than the legacy renderer.

weavejester commented 7 years ago

I agree that there's definitely some cake having/eating going on here.

Duct is primarily about facilitating understanding, both human and machine. I want to be able to ask questions like, "How many times did the web server on port 3000 restart?", and get back a sensible answer because the logs are in a machine-readable format.

But at the same time, I'm also conscious that there's a lot of support for legacy logging, particularly around Cloud-based logging solutions. However, now that I have a quick look around, I see most support JSON logs nowadays.

I don't know whether we want to default to JSON, but having an appender that translates the log into JSON seems like a sound idea. Or perhaps have appenders that work out-of-the-box for the most used solutions. In this respect, perhaps the default logging format isn't so bad; it's human readable, but any machine-based solution will be picky about the exact formatting (e.g. Loggly needs a top-level key called "timestamp" in the JSON log message), so maybe that output needs to be more bespoke.

Alternatively, we could output as just a string of edn, but I don't know of many tools that can parse that straight off.

RickMoynihan commented 7 years ago

I think the quest for greater consistency in logs is admirable, but it's not going to be standardised anytime soon.

I'm currently much more interested in just having the logs available (especially lower level logs) exposed by the internals of for example apache.httpclient, than I am in using logging for metrics.

At some point when I want the latter, I can always just add my own logs that conform to a certain format and report them; either to their own specific appender/file or grep for a certain namespace/log-level/whatever. Yes it's ugly and requires either adding or modifying logs to conform, but in my experience that's usually the best way to do a good job and work around these issues. It's worth remembering you can also log in multiple formats simultaneously.

Regarding making JSON logs the default, I'm not so sure... I don't find JSON human readable at all, and I think everyone will have differing requirements around log formatting. I think the default should be optimised for development and the duct onramp, rather than a 3rd party logging service.

The traditional java loggers are very flexible in this regard and in my mind a far better place to support alternate systems/loggers than through timbre appenders/middlewares. Timbre's quite cute, but it's feature sparse in comparison to logback or any of the other logging backends. For example I discovered yesterday that it doesn't even have built in support for filtering namespaces by log level yet.

Even though it was the work of 15 minutes to implement that, the lack of such a basic feature worries me. I've also been bitten by bugs in timbre a few years back, and replaced it for c.t.l which in my mind is more stable than timbre and worked flawlessly. I worry that timbre will be a moving target where as in logging I think I'd value stable, slow moving, well supported and fully featured.

Obviously the big feature timbre has is configuration by EDN; which is obviously super nice, but at the cost of the other things I value I'm not sure it's worth it.

The clojure ecosystem differs a bit from the java one in that clojure libraries in my experience tend not to incorporate log messages. I'm guessing this is because pure functions don't usually need them; but libraries that cause effects arguably do; and I don't think the community currently has a settled approach on it. This had me thinking that c.t.l is close to being the ideal logging API for libraries, as it doesn't really dictate any backend at all (unless you count SLF4j). Timbre however is a backend library, and currently I think unsuited to library level logging.

So I guess this brings us to the question on what is duct? A library or a framework, or both?

I think it's both, and that it might be better for modules to use c.t.l (perhaps through an intermediary like your log macro that can provide greater consistency in output). Then the framework part of duct can choose timbre as its default logging backend (but let you swap it out if you wish).

I'm not 100% sure this is the best approach, and haven't tried it in practice yet; but I was thinking that in the broader clojure sense it would make sense to advocate using c.t.l as a library dependency (if your library requires logging) because:

  1. choosing timbre is providing a logging backend which is an application concern.
  2. timbre is still moving (causing potential changes/version conflicts etc)
  3. logs sent through c.t.l can be handled by timbre

Incidentally concerning services like Loggly, they should be well supported already through logback etc.

Sorry to ramble but figured these thoughts might be relevant to the discussion.

weavejester commented 7 years ago

Duct doesn't depend directly on Timbre; you could just as easily create a c.t.l based logger and use that instead. The advantage of using a Logger protocol is that we don't need to care.

Regarding using c.t.l. directly, I'm afraid I don't see the point. Yes, it removes a level of indirection, but this is hardly a barrier to development. Further, c.t.l. is based around the somewhat archaic idea of logging strings, rather than data. Yes we can always parse out the data from the formatted string, but that seems like more of a pain than just providing a small protocol.

weavejester commented 7 years ago

I'm considering using logfmt as a compromise between human readability, standardisation and logs-as-data. I'll write more about it if it pans out.