quarkusio / quarkus

Quarkus: Supersonic Subatomic Java.
https://quarkus.io
Apache License 2.0

Allow removal of stacktrace newlines in logging #6262

Open · antoninarq opened this issue 4 years ago

antoninarq commented 4 years ago

Description

When you want to format logs as JSON (to be sent to an EFK cluster), you can use this log format pattern:

quarkus.log.console.format={"@timestamp": "%d{yyyy-MM-dd'T'HH:mm:ss.SSS}", "level": "%p", "loggerName": "%C", "message": "%s", "stackTrace":"%e" }%n

However, exception stacktraces will be printed with newlines, which prevents the logs from being viewed cleanly, as the log collector will collect each line as a separate event.

Implementation ideas

A way to print the stack trace with the newlines replaced by a literal \n in the text. That could be done with a new log pattern symbol like %e, or by supporting an operator like Logback's %replace (see this Stack Overflow question).
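For illustration, here is a minimal sketch of that idea using only JDK java.util.logging types: a formatter that delegates to an existing pattern formatter and then flattens the rendered record (stack trace included) onto a single line. The class name is hypothetical, and how such a formatter would actually be plugged into Quarkus' console handler is exactly the missing piece this issue asks for.

```java
import java.util.logging.Formatter;
import java.util.logging.LogRecord;

/**
 * Sketch only: wraps an existing formatter (e.g. a pattern formatter that
 * renders the exception via %e) and replaces real newlines in the rendered
 * record with the two-character escape "\n", so each event stays on one line.
 */
public class SingleLineFormatter extends Formatter {

    private final Formatter delegate;

    public SingleLineFormatter(Formatter delegate) {
        this.delegate = delegate;
    }

    @Override
    public String format(LogRecord record) {
        // Drop the trailing newline produced by the delegate's %n, escape the
        // rest, then terminate the whole event with a single real newline.
        String flattened = delegate.format(record).strip()
                .replace("\r\n", "\\n")
                .replace("\r", "\\n")
                .replace("\n", "\\n");
        return flattened + System.lineSeparator();
    }
}
```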

loicmathieu commented 4 years ago

@antoninarq the new logging-gelf extension will allow sending logs directly to EFK in the GELF format, without the stacktrace issue you report.

The issue you report is not in the logging system but in the collection system (fluentd here), which is not capable of, or not configured for, multiline parsing. If you want to configure fluentd for multiline, it's possible: https://docs.fluentd.org/parser/multiline

The workaround you suggested is ... a workaround :)

sfrank-mc commented 3 years ago

The same issue will happen with kubectl logs ..., meaning this is not a "collection system" issue but a log formatting limitation. For a "Kubernetes-native Java framework" I would expect to be able to configure logging in a way that allows easier development.

logging-gelf doesn't support logging to the console - which makes it completely unhelpful for this issue - even as a workaround.

Right now the solution seems to be to either move away from Quarkus entirely or develop our own solution to switch logging to Log4j, which supports "%replace" - including the access log and other built-in logs. Neither is easy, but if Quarkus proves to be that opinionated, it may turn out to be the best direction we are left with.

Is there any other way?

loicmathieu commented 3 years ago

@sfrank-mc can you explain what the issue is with kubectl logs? Quarkus is used a lot with k8s (I personally use it that way) and nobody complains about Quarkus logs on a k8s deployment. The logs shown by kubectl are the ones sent to the console; stacktraces will have each frame on a new line, and that is exactly what you get in a console when running locally, so it makes sense to me.

sfrank-mc commented 3 years ago

Pertaining to this issue - if I use kubectl logs ..., the log lines that contain \n will be displayed across multiple lines, which makes them harder to go through. Another example: if I use kubectl logs ... | grep "something", I will not get the complete event that has "something" in it. To overcome that I would like to be able to format my logs to NOT contain \n's, but that is currently impossible with Quarkus. Of course, other examples may include security redaction of sensitive information and various other capabilities which are currently not available.

sfrank-mc commented 3 years ago

Is there any plan to tackle this? We are now considering switching all logging out of Quarkus because of this issue. Before we embark on that effort, we want to understand whether a solution will be provided by the framework in the near future.

rquinio commented 3 years ago

The stacktraces are only one case of multi-line logs, so this would only be a workaround at best.

Trying to list different approaches to solve the multi-line logs problem:

  1. Use a flat pattern in stdout, merge multi-line log events in fluentd via some correlation/regexp, and send the result to a centralized logging service (where flat pattern might need to be parsed again). It's the 12-factor way (https://12factor.net/logs) but it's clearly not optimal because some CPU will be wasted to perform this re-correlation.
  2. Use a structured format in stdout (ex: quarkus-logging-json), and send the stream to a centralized logging service (a minimal sketch of this idea follows after the list). It hinders the readability when looking directly at console logs, so Devs and SREs may not like it.
  3. Use a structured format to send to a centralized logging service via HTTP (ex: quarkus-logging-gelf, quarkus-logging-splunk, etc.). Possibly also write logs to console with a flat pattern, for troubleshooting.
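As a toy illustration of approach 2 (and not of how quarkus-logging-json is actually implemented), the essence is that each record is rendered as one JSON object per line, with newlines escaped inside the JSON strings, so line-oriented collectors see exactly one event per record. The field names below simply mirror the pattern from the issue description:

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.logging.Formatter;
import java.util.logging.LogRecord;

/**
 * Toy illustration of "approach 2": emit each log record as a single JSON line.
 * Newlines in the message or stack trace are escaped as \n inside JSON strings,
 * so a line-oriented collector sees exactly one event per line.
 */
public class JsonLineFormatter extends Formatter {

    @Override
    public String format(LogRecord record) {
        StringBuilder sb = new StringBuilder(256);
        sb.append("{\"@timestamp\":\"").append(record.getInstant()).append('"')
          .append(",\"level\":\"").append(record.getLevel()).append('"')
          .append(",\"loggerName\":\"").append(escape(record.getLoggerName())).append('"')
          .append(",\"message\":\"").append(escape(formatMessage(record))).append('"');
        if (record.getThrown() != null) {
            StringWriter sw = new StringWriter();
            record.getThrown().printStackTrace(new PrintWriter(sw));
            sb.append(",\"stackTrace\":\"").append(escape(sw.toString())).append('"');
        }
        return sb.append("}\n").toString();
    }

    // Minimal JSON string escaping: quotes, backslashes and control characters.
    private static String escape(String s) {
        if (s == null) {
            return "";
        }
        StringBuilder out = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }
}
```
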
sfrank-mc commented 3 years ago

Thanks @rquinio, I ended up using "quarkus-logging-json" - luckily CloudWatch and Kibana can both parse the JSON, so that's helpful; however, like you said, if for some reason the raw log needs to be viewed, it becomes not so good. As a Data Engineer I would think a better system design would process it as close to the source as possible. I would imagine that a framework like Quarkus would want to allow users to choose which logging framework to use rather than dictate it - that just seems to me like a bad fit for an open source project. However, if the logging solution is forced by the framework, it looks quite bad to push the issues created by that decision onto other parts of the system, i.e. Fluentd, centralised logging, or forcing the usage of HTTP. For us - we have no control over Fluentd, nor can we use HTTP, as these are under the control of TechOps. That means that using Quarkus in a large company really ties your hands. Of course, that's just my humble opinion.

mkavalecz commented 2 years ago

Note: the same thing is an issue for AWS CloudWatch. We need to replace \n with \r for it to handle the messages correctly. I mean, of course, you can use quarkus-logging-json, but if you prefer a regular log line, it won't work right now. Or is there any other solution I'm not aware of?

Felk commented 2 years ago

I am encountering a similar issue: trying to send the logs to fluentd using socket logging. The receiving end is not capable of handling multiline log messages, mostly due to fluentd's in_tcp not working with the multiline parser.

I cannot use quarkus-logging-json since it is not on a per-handler basis and makes the regular stdout logs human-unreadable too, which is not acceptable.

syslog suffers from the same multiline problems, and setting up a GELF input in fluentd unfortunately requires a 3rd-party plugin which is also unacceptable for my case.

This has put me at a dead end, and I think I might need to swap jboss-logging for something else, for example Logback with the logstash-logback-encoder, which has first-class support for per-appender JSON-structured logging without newlines:

when a string is written as a JSON string value, any character not allowed in a JSON string will be escaped. For example, the newline character (ASCII 10) will be escaped as \n

emattheis commented 2 years ago

It seems to me that this could be easily solved if there were a way to configure the line separator used in JBoss LogManager. I've raised an issue:

https://issues.redhat.com/browse/LOGMGR-302

Personally, I'd love to see this for use in writing native lambdas with Quarkus. Using any of the alternative approaches for structured logging adds too much overhead.

Felk commented 1 year ago

Just for the record: I found a solution for my specific use case. You can now register custom log filters, which get passed every LogRecord. I was then able to manually send each log record's contents to fluentd using the (somewhat outdated) fluent-logger-java library. It's not quite a custom log handler, but it's the next best thing 😄
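
For reference, here is a sketch of what that approach can look like, based on the description above: a named filter registered via Quarkus' custom log filter mechanism (@io.quarkus.logging.LoggingFilter) that forwards each record to fluentd via fluent-logger-java and then lets it through unchanged. The filter name, fluentd tag, host and port are placeholders, and error handling and message parameter formatting are omitted.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Filter;
import java.util.logging.LogRecord;

import org.fluentd.logger.FluentLogger;

import io.quarkus.logging.LoggingFilter;

/**
 * Sketch: forwards every record seen by the handler to fluentd as a structured
 * event (stack trace captured as a string field), then keeps the record so the
 * regular console output is unaffected.
 */
@LoggingFilter(name = "fluentd-forwarder")
public class FluentdForwardingFilter implements Filter {

    // Placeholder tag prefix, host and port.
    private final FluentLogger fluentd = FluentLogger.getLogger("myapp", "localhost", 24224);

    @Override
    public boolean isLoggable(LogRecord record) {
        Map<String, Object> event = new HashMap<>();
        event.put("level", record.getLevel().getName());
        event.put("loggerName", record.getLoggerName());
        event.put("message", record.getMessage()); // raw message; parameter substitution omitted
        if (record.getThrown() != null) {
            StringWriter sw = new StringWriter();
            record.getThrown().printStackTrace(new PrintWriter(sw));
            event.put("stackTrace", sw.toString());
        }
        fluentd.log("log", event); // fluentd tag becomes "myapp.log"
        return true;               // keep the record for the normal handlers too
    }
}
```

With something like quarkus.log.console.filter=fluentd-forwarder in application.properties, the filter would be attached to the console handler, assuming the named-filter configuration works as described in the Quarkus logging guide.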