wyaeld closed this issue 6 years ago
Our goals are to have Stackdriver Logging, and also the Stackdriver error reporting integrated into all services. I haven't started on the Error reporting yet.
The tracing mechanisms in Spring Cloud are not of much use to us because we are using Linkerd in front of the service, and it's already doing all the tracing we require.
Thank you for the detailed explanation of the setup you ended up with. This will be very useful as we start work on #154.
@meltsufin you're welcome.
I'm still quite new to Spring. Any chance you can point me to where I should look to set up an error handler/interceptor so I can get the Stackdriver error reporting to work?
We don't really have any support for it in the project at the moment. You would basically have to use the Error Reporting client lib.
AFAIK Stackdriver Logging will automatically create error reports when it detects a stack trace in the log stream. However, this only works for non-JSON formatted logs.
We should probably provide some kind of error handler in this project to make it easier to use Stackdriver Error reporting. Would you mind submitting an issue for that? Thanks!
I ran some tests to see if Error Reports would be created. It didn't seem to work at all when we were using default Spring logging. Maybe it only acts if the entries are correctly mapped to SEVERE.
I see this note:
Note: Error logs written to stderr are processed automatically by Stackdriver Error Reporting, without needing to use the Stackdriver Error Reporting package for Java directly.
And the GKE logging page states that both stderr and stdout are being collected.
I'm left wondering if something in my Logback configuration, either default behaviour or custom behaviour I've introduced, is causing the SEVERE level entries to hit STDOUT when I really want them in STDERR to be picked up. It seems a reasonable assumption that Stackdriver Events can parse the JSON, and the docs imply it's just the stream they come out on, rather than any smart detection of a stack trace. Logging in Java is really not my forte :-(
On further investigation, that seems to be the case. Logback's ConsoleAppender writes to System.out unless overridden, and Stackdriver Errors appears to ignore that stream, if I'm reading the docs correctly. I think this means getting the desired behaviour requires 2 appenders with mutually exclusive filtering. Something like the config below.
<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
  <target>System.out</target>
  <!-- accept anything that is not ERROR -->
  <!-- Logback has no SEVERE level (that name belongs to java.util.logging); ERROR is its highest level -->
  <filter class="ch.qos.logback.classic.filter.LevelFilter">
    <level>ERROR</level>
    <onMatch>DENY</onMatch>
    <onMismatch>ACCEPT</onMismatch>
  </filter>
  <!--https://stackoverflow.com/questions/44164730/gke-stackdriver-java-logback-logging-format-->
  <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
    <layout class="cne.cinet.admin.GCPCloudLoggingJSONLayout">
      <pattern>%-4relative [%thread] %-5level %logger{35} - %msg</pattern>
    </layout>
  </encoder>
</appender>
<appender name="STDERR" class="ch.qos.logback.core.ConsoleAppender">
  <target>System.err</target>
  <!--https://stackoverflow.com/questions/25935326/how-can-i-configure-logback-conf-to-send-all-messages-to-stderr-->
  <!-- deny all events with a level below ERROR -->
  <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
    <level>ERROR</level>
  </filter>
  <encoder class="ch.qos.logback.core.encoder.LayoutWrappingEncoder">
    <layout class="cne.cinet.admin.GCPCloudLoggingJSONLayout">
      <pattern>%-4relative [%thread] %-5level %logger{35} - %msg</pattern>
    </layout>
  </encoder>
</appender>
<root level="info">
  <appender-ref ref="STDOUT"/>
  <appender-ref ref="STDERR"/>
</root>
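To sanity-check that the two filters really are mutually exclusive, here is a small sketch that models their decisions in plain Java. It is an illustration only, not Logback code: the class and method names are made up, it assumes both filters are set to Logback's ERROR level (Logback defines TRACE, DEBUG, INFO, WARN and ERROR; SEVERE is a java.util.logging name), and it ignores the root logger level, which drops TRACE and DEBUG before any appender sees them.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: models the two filter decisions, it does not call Logback.
class AppenderRouting {
    // Logback's levels in ascending order of severity.
    enum Level { TRACE, DEBUG, INFO, WARN, ERROR }

    // STDOUT appender: LevelFilter with onMatch=DENY, onMismatch=ACCEPT.
    static boolean stdoutAccepts(Level event, Level filterLevel) {
        return event != filterLevel;
    }

    // STDERR appender: ThresholdFilter denies everything below the threshold.
    static boolean stderrAccepts(Level event, Level threshold) {
        return event.compareTo(threshold) >= 0;
    }

    // Names of the appenders that would emit an event at the given level.
    static List<String> targets(Level event) {
        List<String> out = new ArrayList<>();
        if (stdoutAccepts(event, Level.ERROR)) out.add("STDOUT");
        if (stderrAccepts(event, Level.ERROR)) out.add("STDERR");
        return out;
    }

    public static void main(String[] args) {
        for (Level level : Level.values()) {
            System.out.println(level + " -> " + targets(level));
        }
    }
}
```

The split only stays exclusive because ERROR is the top level. With a lower threshold such as WARN, an exact-match LevelFilter on stdout would still let ERROR through, so ERROR events would land on both streams.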
I will try and test later today.
Well, something still doesn't work.
This is my error, and if you check the labels in the JSON payload you can see Stackdriver shows it coming in on stderr now. I compared it to the same error from an hour ago and confirmed it used to come over stdout.
Unfortunately Stackdriver errors is still blank. Unsure what the next step is.
11:18:42.615
45649 [http-nio-8080-exec-10] ERROR c.c.a.CustomRestExceptionHandler - Service call failed to ServletWebRequest: uri=/capacity;client=10.16.0.1;session=9D69C56AC265BF5335595B77DE2881D0;user=user
{
insertId: "1ach8wufgu8nkm"
jsonPayload: {
message: "45649 [http-nio-8080-exec-10] ERROR c.c.a.CustomRestExceptionHandler - Service call failed to ServletWebRequest: uri=/capacity;client=10.16.0.1;session=9D69C56AC265BF5335595B77DE2881D0;user=user"
}
labels: {
compute.googleapis.com/resource_name: "gke-staging-cluster-2-default-pool-56e59b49-p1w0"
container.googleapis.com/namespace_name: "cinet"
container.googleapis.com/pod_name: "cne-admin-5zrq1"
container.googleapis.com/stream: "stderr"
}
logName: "projects/cinet-staging/logs/cne-admin"
receiveTimestamp: "2017-10-11T22:18:47.406625199Z"
resource: {
labels: {…}
type: "container"
}
severity: "ERROR"
timestamp: "2017-10-11T22:18:42.615Z"
}
I think the fluentd exception detector might not be matching the lines from Logback. I suspect the log line prefix is throwing it off. See the detector here.
@igorpeshansky, any suggestions?
I adjusted the pattern to remove the logline prefix, but still no luck.
Am considering just dropping the autodetection efforts altogether, and dealing with error handling manually.
Feels like the systems are trying to be too clever. A formatted json payload with multiline entries, with severity=ERROR and coming in on stderr isn't good enough?
I've tried using the Google Logback appender, but I'm not seeing any content from it at all. That's a similar reason why I used the custom solution above.
FWIW, most error log entries will be picked up by Error Reporting automatically. We shouldn't need to call the client lib explicitly in most cases.
I'd love to see an actual working example of that. Haven't been diving into this for the sake of it ;-)
Attempting to use the Google Logback appender continually results in nothing showing up. My production config currently has the 2 custom json formatted appenders, going to stdout and stderr respectively, and the OOTB google appender, with no filtering. All running at root level.
That should result in both JSON and normally formatted lines coming through, and error events at the very least being picked up by the fluentd detector from the unformatted Google appender's output.
But no
Let's be clear here. The fluentd exception detector simply finds exception stack traces that span multiple lines (and thus may end up as multiple separate log messages) and joins them into single multiline messages. This is to enable the server-side integration between error reporting and logging. Unless that integration knows how to recognize the errors you're looking for, enabling the exception detection plugin won't help. That plugin also works only for line-level log messages — structured logs will be ignored by the plugin. Hope this explains things.
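To make the "joins them into single multiline messages" part concrete, here is a deliberately simplified sketch of that kind of grouping for Java stack traces. This is not the plugin's actual logic, which is more elaborate; the class name and the continuation pattern are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Simplified illustration of joining stack trace lines into one message.
class StackTraceJoiner {
    // Lines that continue a Java stack trace: frames, causes, and "... N more".
    private static final Pattern CONTINUATION =
        Pattern.compile("^(\\s+at |Caused by: |\\s+\\.\\.\\. \\d+ more)");

    // Append continuation lines to the preceding line; other lines start a new message.
    static List<String> join(List<String> lines) {
        List<String> messages = new ArrayList<>();
        StringBuilder current = null;
        for (String line : lines) {
            if (current != null && CONTINUATION.matcher(line).find()) {
                current.append('\n').append(line);
            } else {
                if (current != null) messages.add(current.toString());
                current = new StringBuilder(line);
            }
        }
        if (current != null) messages.add(current.toString());
        return messages;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "INFO starting",
            "java.lang.RuntimeException: Testing exception.",
            "\tat HelloServlet.doGet(HelloServlet.java:20)",
            "INFO done");
        // Three messages: the exception header and its frame are merged into one.
        System.out.println(join(lines).size());
    }
}
```

Note that a single-line error such as the "Service call failed" entry earlier in this thread has no continuation lines, so grouping of this kind has nothing to join.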
The Google Stackdriver Logback appender will not send anything to Error Reporting. I actually think that Stackdriver Logging might want to just automatically create error reports for anything logged at a very high severity level, but that's a separate discussion.
What should work is when an exception stack trace is printed to stdout as in ex.printStackTrace(), it should be automatically collapsed into a single log message and also create an entry in Error Reporting. I've seen this work on App Engine Flex, and I believe it should work on GKE the same.
Here's an example to try on App Engine Flex (and can be adapted to GKE):
git clone https://github.com/GoogleCloudPlatform/getting-started-java
cd getting-started-java/helloworld-servlet
Modify HelloServlet to look like this:
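import java.io.IOException;
import java.io.PrintWriter;
import java.util.logging.Logger;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

@SuppressWarnings("serial")
@WebServlet(name = "helloworld", value = "/")
public class HelloServlet extends HttpServlet {
  private static final Logger logger = Logger.getLogger(HelloServlet.class.getName());

  @Override
  public void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
    PrintWriter out = resp.getWriter();
    out.println("Hello, world - Flex Servlet");
    // Log at several java.util.logging levels to compare how each shows up in Stackdriver.
    logger.info("Test INFO level log");
    logger.warning("Test WARN level log");
    logger.severe("Test SEVERE level log");
    // Uncaught exception, so that a stack trace ends up in the logs.
    throw new RuntimeException("Testing exception.");
  }
}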
mvn clean appengine:deploy
"I actually think that Stackdriver Logging might want to just automatically create error reports for anything logged at a very high severity level, but that's a separate discussion."
That's exactly what I meant by "server-side integration between error reporting and logging". It's already happening.
"What should work is when an exception stack trace is printed to stdout..."
Even if you enable the exception detector for stdout, the integration only looks at stderr.
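A side note for anyone reproducing this: the no-argument Throwable.printStackTrace() writes to System.err, not stdout, so a stream has to be passed explicitly to get the trace onto stdout. The sketch below shows both forms plus a way to capture the trace as a String; the class and method names are made up for illustration.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Illustration of where a stack trace goes depending on the overload used.
class StackTraceStreams {
    // Capture the text printStackTrace would emit, instead of printing it.
    static String render(Throwable t) {
        StringWriter buffer = new StringWriter();
        t.printStackTrace(new PrintWriter(buffer, true));
        return buffer.toString();
    }

    public static void main(String[] args) {
        RuntimeException ex = new RuntimeException("Testing exception.");
        ex.printStackTrace();            // no-arg form writes to System.err
        ex.printStackTrace(System.out);  // explicit stream needed to reach stdout
        // The rendered trace is a header line followed by tab-indented frames.
        System.out.println(render(ex).startsWith("java.lang.RuntimeException"));
    }
}
```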
I think what happened in this comment was that there was no stack trace and the format of the exception line was, erm, unexceptional, so it wasn't detected.
I'm unsure why the presence of stack traces is meaningful for Error Reporting. If something is logged to stderr, at severity=error, why should Stackdriver care about the message content? I might want notification on something that triggers on business logic failures, rather than actual exceptions. Is that not reasonable?
Most users don't want to be notified about every line they log to stderr (severity==error is the default for stderr). So Error Reporting detects things that are likely to actually be errors (namely, stack traces) automatically, and lets users do more if they wish. Details here.
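Given that stack traces are what gets detected, one thing that could be tried with a JSON layout is putting the whole stack trace into the message field of a single-line JSON entry written to stderr. The sketch below is an assumption-laden illustration, not a confirmed recipe: the class name is made up, the "severity" and "message" field names are assumed to be what the logging agent maps (the "message" field matches the jsonPayload shown earlier in this thread), and nothing in this thread confirms that Error Reporting picks up entries shaped like this.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Sketch: one JSON object per line, with the stack trace inside "message".
class JsonErrorLine {
    // Minimal JSON string escaping for characters a stack trace can contain.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // Field names "severity" and "message" are assumptions, see the note above.
    static String format(Throwable t) {
        StringWriter trace = new StringWriter();
        t.printStackTrace(new PrintWriter(trace, true));
        return "{\"severity\":\"ERROR\",\"message\":\"" + escape(trace.toString()) + "\"}";
    }

    public static void main(String[] args) {
        // Written to stderr, since that is the stream the integration looks at.
        System.err.println(format(new RuntimeException("Testing exception.")));
    }
}
```

Escaping the newlines keeps the whole trace on one physical line, so it cannot be split into separate log entries on the way in.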
This is a starter for how my project has addressed Spring Boot and Stackdriver Logging, at @meltsufin's request on https://github.com/spring-projects/spring-boot/issues/8933#issuecomment-335305580
The challenge is that Stackdriver logging really needs JSON structured outputs, for things like correct timestamping, and multiline records. However for development we just wanted OOTB Spring logback to the console since we are familiar with it.
logback.xml
logback-production.xml
Layout class - it's Kotlin but you get the idea, a port of prior work discussed here https://stackoverflow.com/questions/44164730/gke-stackdriver-java-logback-logging-format
application.properties
application-production.properties
We are running on GKE, so launch pods with
SPRING_PROFILES_ACTIVE=production
in order for the right files and overrides to go in.
build.gradle (the relevant parts)