spring-attic / spring-cloud-gcp

Integration for Google Cloud Platform APIs with Spring
Apache License 2.0
704 stars 694 forks source link

Web Request Filters are not pushing traceId to Cloud Logging, so log lines are not grouped #2573

Closed NicolaSpreafico closed 3 years ago

NicolaSpreafico commented 3 years ago

Hello, I found similar issue but no an actual idea or resolution:

I have a Spring Boot Java11 application deployed on GCP App Engine Standard. My goal is to obtain log lines to be grouped under a common parent (the request log), so I can expand them and isolate them across different requests. In order to do so I understood I need to user the logback plugin, so starting from:

As test code I started from the Hello World project: https://cloud.google.com/appengine/docs/standard/java11/quickstart I add a simple log line in the controller image and it's working image

The problem here, which is why I'm opening this issue, is that logs generated by Request Filters are logged w/o a traceId, so the Logging Console is not able do group them.

I created 2 very simple filters, just for example, Filter1 and Filter2

@Component
public class Filter1 extends OncePerRequestFilter {
    private static final Logger LOG = Logger.getLogger("Filter1");

    @Override
    protected void doFilterInternal(HttpServletRequest arg0, HttpServletResponse arg1, FilterChain arg2)
            throws ServletException, IOException {

        LOG.info("Logger1 preFilter");
        arg2.doFilter(arg0, arg1);
        LOG.info("Logger1 postFilter");

    }
}

and I can see the log lines displayed image but because the filter log lines do not have any traceId, the parent view is not workin image

In my actual application I have a couple of actual filters that are providing log lines, and I would like very much to see them under the actual parent request. Is something that can be done? Am I missing some configuration?

ttomsu commented 3 years ago

My hunch is that there's something up with the filter ordering. I had a sample app handy in the debugger with the latest milestone of Spring Cloud:

Screenshot from 2020-11-03 14-08-04

You can see that there are 3 relevant filters: lazyTracingFilter myFilter and tracingFilter. The lazyTracingFilter delegates calls to the same instance of tracingFilter, so the traceId is being set before myFilter gets invoked.

Here is myFilter:

    @Bean
    public Filter myFilter() {
        return new OncePerRequestFilter() {
            @Override
            protected void doFilterInternal(HttpServletRequest req, HttpServletResponse resp, FilterChain filterChain) throws ServletException, IOException {
                log.info("Trace Before: " + MDC.get(StackdriverTraceConstants.MDC_FIELD_TRACE_ID));
                filterChain.doFilter(req, resp);
            }
        };
    }

And the resulting traces in Stackdriver:

3eBYjKEhatWHyTs

Can you hit a debug point and peek at your filter config? Does it look anything like mine? If not, what does yours look like?

NicolaSpreafico commented 3 years ago

Hello, I added the line you suggested in my filter: image (and I removed the Filter2 because for this example 1 filter only is enough, at the end we are talking about the traceId log attribute being missing, 1 or 10 filters, the behaviour is the same.

image In my case I'm getting a null so something is missing again.

The application is now very simple, I only have the main class and the filter. Do you have a working example where you see from Cloud Logging console that log lines generated by filters are actually grouped under the request parent?

ttomsu commented 3 years ago

Yep - you can find the changes on this commit: https://github.com/ttomsu/spring-cloud-gcp-old/commit/90350c18d540737b1cd6a3075e1352c2d983e599

You can run the sample and/or the test with the following steps:

  1. Authenticate: gcloud auth application-default login
  2. ./mvnw install -DskipTests -T 1C
  3. Run the application:
    ./mvnw spring-boot:run --projects spring-cloud-gcp-samples/spring-cloud-gcp-trace-sample/`
    1. Navigate to localhost:8080
    2. Observe trace ID in console
    3. Find trace in the Cloud Console: https://console.cloud.google.com/traces/list?tid=
  4. Run the integration test for the above:
    ./mvnw verify --projects spring-cloud-gcp-samples/spring-cloud-gcp-trace-sample/ -Dit.trace=true

Here's what I see in the Cloud Console:

SUDiPjZdZTmntY6 4waCEBtpicuyiJw

NicolaSpreafico commented 3 years ago

Hello, I followed your steps: 1) I downloaded https://github.com/spring-cloud/spring-cloud-gcp/archive/master.zip 2) Opened project spring-cloud-gcp-master\spring-cloud-gcp-samples\spring-cloud-gcp-trace-sample 3) Edited the files as your commit (filter implementation, logback xml modifications)

Then I wanted to try directly from online environment, so I deploy the application, with a couple of more changes 4) I disabled the checkstyle plugin which was blocking the deploy 5) I disabled (as suggested) the CONSOLE handler in the logback xml 6) I increased the instance up to F2 to avoid memory errors This is the command I used:

$ mvn clean package appengine:deploy -Dapp.deploy.projectId=xxxxx -Dapp.deploy.version=GCLOUD_CONFIG  -DskipTests=true

Then I simply hit the /meet endpoint and watched logs image It seems to be working, I can see the logs from the filter being grouped as well with the parent request.

So what is changed here compared to my previous test? Do I need to always enable tracing?

# To send 100% of traces to Stackdriver Trace
spring.sleuth.sampler.probability=1.0

If so, why this is needed only for Filter logs? W/o this the request logs are working fine.


EDIT For the first question, the answer seem to be... yes. I added the trace plugin to my project, enable it a 1.0 and then use the same log back configuration, I can now see the log filter traced: image

ttomsu commented 3 years ago

Yes, the tracing module needs to be added to your pom.xml to pick everything up.

Tracing and Logging are separate but related modules - AFAIK you can have one, the other, or both. If you have both, you get some nice features where you can jump from the Trace timing page into the logs that resulted from just that trace (which is the behavior I think you're after).

The spring.sleuth.sampler.probability=1.0 property ensures that all records get send to Cloud Trace. By default, only a subset (a sampled portion) of all requests get sent to Trace. After a certain threshold we start charging for the number of traces saved (see https://cloud.google.com/stackdriver/pricing)

I'm glad to see you've got it working! I'm going to close this issue for now, but feel free to reopen if you have more questions.

NicolaSpreafico commented 3 years ago

Ok thank you. For my actual case I don't mind about the Cloud Trace functionalities, while I really need to have the request log lines all grouped under the same parent or the logs became unreadable. With the default log configuration, the request log lines are working but somehow the log lines generated in filters are not managed as well.

With this additional configuration the filter log lines are also grouped, so it seems to be a very mandatory configuration for the log part, even if I don't need the Trace part.

ttomsu commented 3 years ago

Hi @NicolaSpreafico ,

I discussed this with my team a bit more and I think I have a better handler of what's going on - namely that you shouldn't have to add the trace module to get the functionality you're looking for.

Background context: When a request is being handled through Spring MVC, it goes through several stages: Filter --> Interceptors --> Business logic.

Filters are a concept specific to Java Servlets, on which Spring Boot is built. On the other hand, Interceptors are a pure Spring concept. Said another way, Filters are deeper in the stack.

Note: Spring actually mention in their Javadocs about the difference and when you may want to use one or the other.

When you just have the logging module included in your POM, we look for the x-cloud-trace-context header (that should be set by either AppEngine or the Google Cloud Load Balancer) using a HandlerInterceptor (namely the TraceIdLoggingWebMvcInterceptor). If this value is found, it's stashed away.

When we add the trace module, it activates Spring Sleuth, and deactivates our TraceIdLoggingWebMvcInterceptor. Our Cloud Trace integration delegates the same functionality to Spring Sleuth, who implements this logic as a Filter (hence the lazyTracingFilter and tracingFilter filters I pointed you to in my example.) This is why it simply worked out of the box for me - because I was using the wrong sample application - a sample application that had both logging and trace enabled. I'm sorry about that.

I added the same Filter to the correct Logging-only sample application and now see the same thing you do: com.example.Application - Filter Before: null

So ultimately you can choose to keep the Trace library added or play with a little Spring bean magic and implement your desired logic as an Interceptor. Here's a really simple example that just overrides our existing bean, but that may not work well if you have multiple things you could implement as Interceptors: https://github.com/ttomsu/spring-cloud-gcp/commit/1dfbba3fe826824986a760ec6ab0a2407a800059. Note that this commit goes against the logging sample, not the trace sample.

NicolaSpreafico commented 3 years ago

Hi @ttomsu, for the moment I'm keeping the trace plugin with spring.sleuth.sampler.probability=1.0 as we confirmed being working.

For future improvements I keep record of your suggestion, but right now it's enough as it is.

Thanks

NicolaSpreafico commented 3 years ago

Hello @ttomsu, while i'm carrying on with the Cloud Trace at 1.0, I find a different problem.

I have a second filter which manage all uncaught errors, like if I get a nullpointer somewhere or any kind of unmanaged error that need to be logged. I find that when is the case, the actual logs with severe level doesn't contain the trace id

Here is the log with the stacktrace inside the text content, you can see that there is not traceId (it is an illegalargumentexception) image

Is it possible that because the request crashed during execution the component which actually manage the logs has been deactivated?

ttomsu commented 3 years ago

I'm not sure of the root cause without more knowledge of when the NPE is thrown and what has been processed up to that point. Including the trace module actually makes things more complicated since we're delegating to Sleuth to manage the trace context instead of our simpler thread-local variable.

As a reminder - having Sleuth manage the trace context is separate from the sampler's probability. By setting spring.sleuth.sampler.probability=1.0 you're ensuring all timing data is sent to Cloud Trace - using up CPU/memory resources as well as using up your free tier of Cloud Trace quota.

NicolaSpreafico commented 3 years ago

@ttomsu I think I was mislead by previous case. In my actual project the problem it isn't that error log are not traced, the problem is that more than 1 filter is not traced at all. I have 3 different filters in my project and only one produces log which actually has trace ids.

In your previous examples did you tried more than 1 filter in order to see if all logs of all filters are actually traced? And because 2 of my filters has @Order(Ordered.HIGHEST_PRECEDENCE + 1) and @Order(Ordered.HIGHEST_PRECEDENC) I can tell you that only traced filter is the one w/o that annotation, so the last one to be executed in terms of priority. I don't if this is an indicator of something

I found this about filter priority, do you think is the actual cause? https://github.com/spring-cloud/spring-cloud-sleuth/issues/1425

meltsufin commented 3 years ago

@NicolaSpreafico Backtracking a bit, I just want to reiterate what @ttomsu already said about trace ID working without Sleuth. So, if you don't require tracing, you should remove spring-cloud-gcp-starter-trace. The whole trace ID propagation works by setting a thread-local variable, and in the absence of Sleuth, we set this thread-local variable in a Spring Web MVC HandlerInterceptor. My understanding is that Servlet filters are always processed before interceptors. So, the trace ID is not set yet set in thread-local when filters are processed. Consequently, you have two possible ways forward: 1) Replace your servlet filters with Spring MVC interceptors and make sure that they are registered after our TraceIdLoggingWebMvcInterceptor. 2) Create a servlet filter version of TraceIdLoggingWebMvcInterceptor, and register it as the first filter in the chain.