grpc-ecosystem / grpc-opentracing

OpenTracing is a set of consistent, expressive, vendor-neutral APIs for distributed tracing and context propagation
BSD 3-Clause "New" or "Revised" License
472 stars 98 forks source link

Add OpenTracing support for Go streaming RPCs. #25

Closed rnburn closed 6 years ago

rnburn commented 7 years ago

This PR adds additional interceptors to otgrpc so that streaming RPCs can be traced.

One of the more challenging problems with streaming RPCs is figuring out when to finish the span on the client-side. Unlike other interceptors, gRPC's client-side stream interceptor functions don't enclose the entire RPC, but rather return a ClientStream that the invoker then interacts with. Based on the error codes returned by ClientStream's methods you can figure out in most use-cases when the span finishes, though there's special cases that are problematic (example).

I added code for the problematic cases that at least eventually finish the spans and clean up resources. And I reached out on StackOverflow to see if to see if there's any other solution, but I'm not sure it's possible with the current API to do any better.

I also added testing coverage that follows the structure of this test in gRPC Go.

tedsuo commented 7 years ago

LGTM 👍

danielamiao commented 7 years ago

From a coding/implementation perspective, this all looks good to me, but I’m concerned about the general idea/design of a single span around a streaming client or server, which can last up to hours. This definitely breaks the definition of a span, so I think it's worthwhile to step back and discuss the general paradigm of instrumenting a streaming RPC.

Some quick thoughts (all IMHO):

  1. The top span of a streaming RPC should be finished when the stream is successfully established, this applies to both client and server streams.
  2. Each subsequent message sent on the above stream should have its own span, finishing when the SendMsg call finishes (could be an error).
  3. Similarly, each RecvMsg call should have its own span.
  4. For parent-child relationships, one way is to make all spans of messages on the stream children of the parent top span in 1), but that'll likely create huge long traces, so I don't actually think we should do it.
rnburn commented 7 years ago

I agree that tracing individual requests in the stream would be useful, but gRPC only allows you to send metadata when you initiate the stream, so I don’t see how you could propagate context if you created a span per-request.

I can see creating a span for the entire lifetime of the stream as being useful for shorter lived streams. I’m not sure what you could do for long-lived streams without being able to send context for individual requests.

bcotton commented 6 years ago

Any motion on this PR? It would be nice to have in the official release.

Thanks

trusch commented 6 years ago

Any motion on this would indeed be nice. From my point of view tracing complete streams instead of every message is better than not having tracing support for streaming rpc calls at all.

My personal problem is that the master of moby/buildkit depends on that PR, and that breaks stuff on my side.

Thanks in advance!

tedsuo commented 6 years ago

Merging this in. Sorry for the wait!

FYI: this repo is about to be moved to https://github.com/opentracing-contrib/go-grpc, and future development will happen there, so that we can monitor PR activity better and with more accountability.