weaveworks / common

Libraries used in multiple Weave projects
Other
129 stars 92 forks source link

Fix issue where successful streaming requests are not always captured in metrics #295

Closed charleskorn closed 1 year ago

charleskorn commented 1 year ago

According to https://pkg.go.dev/google.golang.org/grpc#ClientConn.NewStream, it's safe to not call Recv() on a grpc.ClientStream until it's exhausted if you clean up the stream in other ways:

To ensure resources are not leaked due to the stream returned, one of the following actions must be performed:

  1. Call Close on the ClientConn.
  2. Cancel the context provided.
  3. Call RecvMsg until a non-nil error is returned. A protobuf-generated client-streaming RPC, for instance, might use the helper function CloseAndRecv (note that CloseSend does not Recv, therefore is not guaranteed to release all resources).
  4. Receive a non-nil, non-io.EOF error from Header or SendMsg.

However, the current implementation of instrumentedClientStream does not correctly handle choices 1 and 2: if the ClientConn is closed and no further action is taken with the stream, or if the context provided is cancelled, metrics for the request are not recorded.

This PR addresses this issue for choice 2 (context cancellation). The fix is inspired by opentracing-contrib/go-grpc's equivalent implementation: source

Fixing the issue for choice 1 (calling Close on ClientConn) is more involved, so in the interests of keeping this PR small, I've chosen to focus on just the context cancellation case here.

charleskorn commented 1 year ago

Will be moved to grafana/dskit once https://github.com/grafana/dskit/pull/342 is merged

charleskorn commented 1 year ago

Closing in favour of https://github.com/grafana/dskit/pull/344