open-telemetry / opentelemetry-java-instrumentation

OpenTelemetry auto-instrumentation and instrumentation libraries for Java
https://opentelemetry.io
Apache License 2.0

Incorrect trace name and trace group when instrumenting Spring Cloud Gateway using automatic java agent #5479

Open wwwlll2001 opened 2 years ago

wwwlll2001 commented 2 years ago

**Describe the bug** I am using the OpenTelemetry Java agent to instrument a microservice system that includes an API gateway built with Spring Cloud Gateway.

The tracing pipeline is OpenTelemetry agent + otel-collector + data-prepper + OpenSearch trace analytics. So far I can see the traces in OpenSearch, and that part works fine.

The bug (maybe it's not one) is that every trace from Spring Cloud Gateway is named /*, so I cannot tell apart traces for different request URLs and cannot do any further analysis. After a lot of digging, I suspect it is somehow related to Netty: when I instrument a normal Spring Boot service, the trace name is correct (it is the request URL), but when I instrument Spring Cloud Gateway the trace name is /* or /**. This is what I have found so far.

I would like the trace names for Spring Cloud Gateway to look the same as those of a normal Spring Boot service that is not based on Netty.


**Steps to reproduce** Launch a Spring Cloud Gateway service with the OpenTelemetry Java agent attached; the trace name will be /* or /**.

**What did you expect to see?** Trace names for Spring Cloud Gateway that look the same as those of a normal Spring Boot service that is not based on Netty.

**What did you see instead?** Every trace from Spring Cloud Gateway is named /* or /**.

**What version are you using?** opentelemetry-javaagent, version 1.10.1

**Environment**

**Additional context**

trask commented 2 years ago

hi @wwwlll2001! does Spring Cloud Gateway use some form of routes internally? that's how we are able to get nice "route-based" span names, e.g. by instrumenting Spring MVC, JAXRS, etc...

wwwlll2001 commented 2 years ago

@trask thanks for your attention. I have kept digging into this issue over the last two days, and I think I have some findings.

Currently my issue is that the trace group / trace name of every request to Spring Cloud Gateway ends up as "/*". After two days of digging, I found that the code below is where the OpenTelemetry agent tries to get the route for Spring Cloud Gateway: class WebfluxSingletons.java, method httpRouteGetter():

public static HttpRouteGetter<ServerWebExchange> httpRouteGetter() {
  return (context, exchange) -> {
    PathPattern bestPattern =
        exchange.getAttribute(HandlerMapping.BEST_MATCHING_PATTERN_ATTRIBUTE);
    return bestPattern == null ? null : bestPattern.getPatternString();
  };
}

My guess is that the attribute HandlerMapping.BEST_MATCHING_PATTERN_ATTRIBUTE is supposed to be set by Spring Cloud Gateway, so that the instrumentation can read it and use it for the trace name / trace group. Unfortunately, Spring Cloud Gateway does not set this attribute unless I configure some RouterFunctions, which I do not need. As a result, bestPattern is null and I cannot get the real trace name / trace group. I cannot say for sure that I am right, but this is what I suspect.

That is what I have found so far, and I don't think I can do more, but I do need to get the real trace name / trace group rather than /*.

Any advice will be highly appreciated. Thanks in advance.

meiyese commented 1 year ago

Hi, we have the same issue,

Any updates for this?

@wwwlll2001 @trask

trask commented 1 year ago

hi @meiyese!

Do you have any findings related to @wwwlll2001's comment above?

> Spring Cloud Gateway will not set this attribute unless I config some RouterFunctions which I do not need

wwwlll2001 commented 1 year ago

@meiyese sorry for the late response. I have not found a solution or workaround, sorry.

ugard commented 1 year ago

Hi, we also have the same issue. We're using the Azure telemetry agent, which has the OpenTelemetry agent shaded into it.

There are some exchange attributes set by spring-gateway if using PathRoutePredicateFactory:

and there are some Micrometer tags provided (outcome, httpMethod, routeId, routeUri, path). Those are set, but somehow not propagated to Azure insights by default...

But the real question is what we expect to see as the name:

- id: user-service
  uri: http://user-service
  predicates:
    - Path=/api/v1/users/**

and the gateway has no idea about the {userId} segment after the /users/ path,

so on the gateway side I can set the name to:

GET /api/v1/users/** would probably be more useful than / but may not be present when not using Path predicate...
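To illustrate the naming idea, here is a minimal sketch in plain Java (no Spring or OpenTelemetry on the classpath). It assumes the span name is built as "METHOD pattern" from the first configured pattern that matches the request path, and falls back to the bare method when nothing matches. The class `SpanNameSketch`, its methods, and the fallback are hypothetical; the matcher only understands a literal prefix followed by `/**`, which is far less than Spring's real path matching does.

```java
import java.util.List;

// Sketch only: derives a low-cardinality span name from the HTTP method
// and the configured route pattern instead of the concrete request path.
final class SpanNameSketch {

    // Simplified matcher: supports an exact path, or a literal prefix followed by "/**".
    static boolean matches(String pattern, String path) {
        if (pattern.endsWith("/**")) {
            String prefix = pattern.substring(0, pattern.length() - 3);
            return path.equals(prefix) || path.startsWith(prefix + "/");
        }
        return pattern.equals(path);
    }

    // Returns "METHOD pattern" for the first matching pattern.
    // Falling back to the bare method is an assumption of this sketch.
    static String spanName(String method, String path, List<String> patterns) {
        for (String pattern : patterns) {
            if (matches(pattern, path)) {
                return method + " " + pattern;
            }
        }
        return method;
    }

    public static void main(String[] args) {
        List<String> patterns = List.of("/api/v1/users/**");
        // Both requests share one name, because the gateway cannot see {userId}.
        System.out.println(spanName("GET", "/api/v1/users/42", patterns));        // GET /api/v1/users/**
        System.out.println(spanName("GET", "/api/v1/users/42/orders", patterns)); // GET /api/v1/users/**
        System.out.println(spanName("GET", "/health", patterns));                 // GET
    }
}
```

The point of naming by pattern rather than by path is that requests for different users collapse into one trace group, while different routes stay distinguishable.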

trask commented 1 year ago

> GET /api/v1/users/** would probably be more useful

this makes sense to me 👍

> but may not be present when not using Path predicate...

can you explain this a bit further?

ugard commented 1 year ago

there are multiple ways to configure where to pass request - https://cloud.spring.io/spring-cloud-gateway/multi/multi_gateway-request-predicates-factories.html

we're using path predicate:

spring:
  cloud:
    gateway:
      routes:
      - id: host_route
        uri: https://example.org
        predicates:
        - Path=/foo/{segment},/bar/{segment}

In our case it is Path=/api/v1/users/**

PathRoutePredicateFactory sets GATEWAY_PREDICATE_MATCHED_PATH_ATTR with matched pattern: https://github.com/spring-cloud/spring-cloud-gateway/blob/6f952673170713a3a393840159485267177deb6a/spring-cloud-gateway-server/src/main/java/org/springframework/cloud/gateway/handler/predicate/PathRoutePredicateFactory.java#L110

But if you're not using Path= in your config (for example, you use the Host Route Predicate Factory instead), that attribute won't be set, because the routing decision is made by some criterion other than the path.
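A rough sketch of the lookup order this suggests, in plain Java with the exchange attributes modeled as a `Map<String, Object>`. The class `RouteLookupSketch` and the two key strings are placeholders standing in for HandlerMapping.BEST_MATCHING_PATTERN_ATTRIBUTE and ServerWebExchangeUtils.GATEWAY_PREDICATE_MATCHED_PATH_ATTR; they are not the real constant values, and this is not what the agent currently does.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Sketch only: prefer the best-matching pattern, fall back to the gateway's
// matched path, and report "no route" when neither attribute is present.
final class RouteLookupSketch {

    // Placeholder keys (assumptions), not the real Spring constant values.
    static final String BEST_MATCHING_PATTERN = "sketch.bestMatchingPattern";
    static final String GATEWAY_MATCHED_PATH = "sketch.gatewayMatchedPath";

    static Optional<String> route(Map<String, Object> attributes) {
        Object best = attributes.get(BEST_MATCHING_PATTERN);
        if (best != null) {
            return Optional.of(best.toString());
        }
        Object matched = attributes.get(GATEWAY_MATCHED_PATH);
        if (matched != null) {
            return Optional.of(matched.toString());
        }
        // E.g. a route selected by a Host predicate: no path pattern is known.
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, Object> pathRoute = new HashMap<>();
        pathRoute.put(GATEWAY_MATCHED_PATH, "/api/v1/users/**");
        System.out.println(route(pathRoute));       // Optional[/api/v1/users/**]

        Map<String, Object> hostRoute = new HashMap<>();
        System.out.println(route(hostRoute));       // Optional.empty
    }
}
```

The empty case is the open question from the discussion above: when routing is decided by something other than the path, there is no pattern to name the span after.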

In our case as a workaround I've extended PathRoutePredicateFactory so it wraps GatewayPredicate with my code:

private final Predicate<ServerWebExchange> delegate; // from PathRoutePredicateFactory

public boolean test(ServerWebExchange serverWebExchange) {
    var result = this.delegate.test(serverWebExchange);
    if (result) {
        // Copy the pattern matched by the Path predicate into the attribute the agent reads.
        var attrs = serverWebExchange.getAttributes();
        attrs.putIfAbsent(HandlerMapping.BEST_MATCHING_PATTERN_ATTRIBUTE,
                attrs.get(ServerWebExchangeUtils.GATEWAY_PREDICATE_MATCHED_PATH_ATTR));
    }
    return result;
}

Then I disable the original PathRoutePredicateFactory via Spring config and register my override in its place. It seems to work on Spring 3.1, but on Spring 2.7 (which we're migrating off) some Spring metrics filter throws a ClassCastException. I guess that code expects that if BEST_MATCHING_PATTERN_ATTRIBUTE is present, there is also some timing data.
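For anyone who wants to try the wrapping idea outside a Spring context, here is a self-contained sketch of the same decorator using only the JDK. The exchange attributes are modeled as a `Map<String, Object>`, and the class `PatternCopyingPredicate` and its key strings are hypothetical placeholders, not the real Spring constants. It also adds a null check, on the assumption that the attribute map rejects null values the way ConcurrentHashMap does.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

// Sketch only: wraps a route predicate and, on a match, copies the matched
// path pattern to the attribute that the instrumentation looks up.
final class PatternCopyingPredicate implements Predicate<Map<String, Object>> {

    // Placeholder keys (assumptions), not the real Spring constant values.
    static final String BEST_MATCHING_PATTERN = "sketch.bestMatchingPattern";
    static final String GATEWAY_MATCHED_PATH = "sketch.gatewayMatchedPath";

    private final Predicate<Map<String, Object>> delegate;

    PatternCopyingPredicate(Predicate<Map<String, Object>> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean test(Map<String, Object> attributes) {
        boolean result = delegate.test(attributes);
        if (result) {
            Object matched = attributes.get(GATEWAY_MATCHED_PATH);
            // ConcurrentHashMap.putIfAbsent throws NullPointerException for a null value,
            // so skip the copy when the delegate did not record a pattern.
            if (matched != null) {
                attributes.putIfAbsent(BEST_MATCHING_PATTERN, matched);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for a Path predicate that records the pattern it matched.
        Predicate<Map<String, Object>> pathPredicate = attrs -> {
            attrs.put(GATEWAY_MATCHED_PATH, "/api/v1/users/**");
            return true;
        };
        Map<String, Object> exchangeAttributes = new ConcurrentHashMap<>();
        new PatternCopyingPredicate(pathPredicate).test(exchangeAttributes);
        System.out.println(exchangeAttributes.get(BEST_MATCHING_PATTERN)); // /api/v1/users/**
    }
}
```

Using putIfAbsent keeps any value that another component already stored under the same key, which matches the intent of the workaround above.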