census-instrumentation / opencensus-java

A stats collection and distributed tracing framework
https://opencensus.io
Apache License 2.0
672 stars 201 forks source link

implement gRPC client retry stats measures and views #2084

Closed mackenziestarr closed 2 years ago

mackenziestarr commented 2 years ago

Based on my conversation with the grpc-java core team, the retry stats implementation of grpc-java is complete but lacks opencensus views. This PR adds them according to the measures and views outlined in the A45-retry-stats proposal.

Some open questions:

mackenziestarr commented 2 years ago

@jsuereth added more sensible bucket boundaries for the retry per call stats in https://github.com/census-instrumentation/opencensus-java/pull/2084/commits/1c7d888695be472458fe2434c3192ff2d8fb8d6d suggested in the spec

given that maxAttempts cannot be greater than 5 i'm not really sure why >= 10, >=100, >=1000 are suggested there - still these are an improvement.

dapengzhang0 commented 2 years ago

@mackenziestarr

added more sensible bucket boundaries for the retry per call stats in 1c7d888 suggested in the spec

given that maxAttempts cannot be greater than 5 i'm not really sure why >= 10, >=100, >=1000 are suggested there - still these are an improvement.

Did you notice the notice in

https://github.com/grpc/proposal/blob/2d7ecf48d9a6aff7973a909be9b6552aaeafc9d6/A6-client-retries.md#retry-and-hedging-statistics

Notice: Retry statistics has been updated in the new design gRFC-A45. The original design below is obsolete.
mackenziestarr commented 2 years ago

Yes, i did notice it. Is there a better alternative you would suggest? I didn't see bucket boundaries specified in gRFC-A45.

dapengzhang0 commented 2 years ago

It does not make sense to have >= 10, >=100, >=1000 buckets for count of retry-attempts. Those buckets were addd in the very first version of gRFC-A6, and later gRFC-A6 imposed maxRetryAttempts<=5 but the buckets were not updated (probably overlooked). See https://github.com/ncteisen/proposal/commit/7354c1ac07fdb8e5aaf58ea086fbc0c42b5c2fca#diff-a1d8d0295ab02319b3f5cb1f351fc16727eff4a614373d11eef437c89efb35fbL178-R177

mackenziestarr commented 2 years ago

ah okay makes sense. I'll remove the extraneous buckets and make >= 5 the upper bound - does that sound good?

dapengzhang0 commented 2 years ago

ah okay makes sense. I'll remove the extraneous buckets and make >= 5 the upper bound - does that sound good?

Yep.

mackenziestarr commented 2 years ago

thanks @dapengzhang0 fixed in e597fe6

mackenziestarr commented 2 years ago

hi @jsuereth would you mind approving again (pending any changes that should be made) so CI can run?

mackenziestarr commented 2 years ago

@songy23 @rghetia can I get a review on this when you get a chance, thanks

punya commented 2 years ago

@mackenziestarr I work on the same team as @jsuereth and I plan to review/approve, merge and release this PR this week. Thanks for your patience!

mackenziestarr commented 2 years ago

thanks for updating those @punya, appreciate it :smile: