grpc-ecosystem / grpc-gateway

gRPC to JSON proxy generator following the gRPC HTTP spec
https://grpc-ecosystem.github.io/grpc-gateway/
BSD 3-Clause "New" or "Revised" License
18.29k stars 2.25k forks source link

Support improved tracing integration by exposing router / path earlier #3700

Open jalaziz opened 1 year ago

jalaziz commented 1 year ago

🚀 Feature

I should note that I am trying to integrate the Datadog APM, but I believe this applies to OpenTracing/OpenTelemetry as well.

In trying to integration the Datadog APM, I realized that the the annotated context containing the path pattern is available only after HTTP handlers are called. At a high level, I understand why this is the case, but it complicates adding context to telemetry.

For example, when integrating the Datadog APM, the most straightforward way is to use the WrapHandler function:

mux := runtime.NewServeMux()
handler = httptrace.WrapHandler(mux, "service_name", "")

The second parameters here is meant to be the resource name. Datadog also offers a WithResourceNamer to dynamically generate the resource name (typically the HTTP method and path pattern). Unfortunately, WithResourceNamer works at the handler level and therefore it cannot be used to access the path pattern, just the full URL.

When looking at how Datadog integrates with other HTTP libraries, including the standard ServerMux, there is usually some capability to lookup the handler and route based on the request.

Unfortunately, grpc-gateway doesn't expose the route matching logic in a way that can be accessed outside of ServeHTTP, making it very difficult to know the parameterized route at the "wrapping" layer.

Ultimately, it would be nice if the route matching logic could be extracted from ServeHTTP in a way that it was both fast and re-usable to be able to lookup the route given a request.

Workarounds

Yes, I could also use the gRPC client/server integration as the docs suggest. In fact, I also do this. However, telemetry at the HTTP layer is very useful.

As a workaround, I am able to use runtime.WithMetadata as recommended here to inject the resource name as a span tag, but it feels a bit hacky and disjointed:

runtime.WithMetadata(func(ctx context.Context, req *http.Request) metadata.MD {
    span, ok := tracer.SpanFromContext(ctx)
    if ok {
        if path, ok := runtime.HTTPPathPattern(ctx); ok {
            span.SetTag(ext.ResourceName, req.Method+" "+path)
        } 
    }
    return metadata.MD{}
}),
johanbrandhorst commented 1 year ago

Thanks for your issue! What are you proposing exactly? I don't have enough context to come up with a solution here. Do you want new API from the runtime package? New methods on the ServeMux? I'm happy to discuss pros and cons and review and PRs that may stem from the discussion :).

jalaziz commented 1 year ago

@johanbrandhorst Apologies for not being more clear. To be honest, it's a bit ambiguous because looking at the source code, it's not clear what the best path forward is.

What is clear is that whatever it is, it should allow looking up the path pattern (and maybe the gRPC method) from a given HTTP request parameter.

Looking through the source code, I think the method would need be on ServerMux. At a very high level, I was thinking it may be possible to refactor the routing logic from ServeHTTP into a separate method that can be called independently. What I haven't looked into, though, is how easily that could be optimized to avoid paying the matching price multiple times for the same request.

johanbrandhorst commented 1 year ago

Thanks for the added detail. Is runtime.HTTPPathPattern not sufficient? I'm not sure about mapping that to a grpc method, you might have to do that in a grpc interceptor instead.

jalaziz commented 1 year ago

@johanbrandhorst runtime.HTTPPathPattern only works after annotateContext is called which happens within the handler itself. So it's only useful at the gRPC layer or by passing a func to WithMetadata.

What I'm looking for is the same information, but at the ServerMux/http.Handler level.

johanbrandhorst commented 1 year ago

I see. I don't even know how we'd do that tbh. The information around paths lives in the generated files, and they're not used until the handler is called. Perhaps this metadata workaround is the best we can do without significant re-architecting? Let me know if you come up with some solution, I'm still happy to consider it.