Open salliewalecka opened 2 years ago
DoorDash provides some good pseudo code for their interceptor. If you could point me to the analogous spot to add the tracer on your server, that would also be appreciated.
Hello, any progress on this request? Alternatively, could you point to some code where we can add custom client interceptors? That would be great.
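For the client side, the general shape is easy to sketch with Python's grpcio interceptor API. This is a minimal sketch, not TF Serving code; the `record` callback is a placeholder for whatever sink you use (a log line, a `prometheus_client` Histogram, etc.):

```python
import time
import grpc


class ClientLatencyInterceptor(grpc.UnaryUnaryClientInterceptor):
    """Records wall-clock latency of each unary-unary RPC as seen by the client."""

    def __init__(self, record):
        # record(method_name, seconds) is a placeholder callback.
        self._record = record

    def intercept_unary_unary(self, continuation, client_call_details, request):
        start = time.perf_counter()
        call = continuation(client_call_details, request)
        # Record once the RPC completes (success or failure).
        call.add_done_callback(
            lambda _: self._record(client_call_details.method,
                                   time.perf_counter() - start))
        return call
```

You would attach it with `grpc.intercept_channel(grpc.insecure_channel("host:8500"), ClientLatencyInterceptor(record))`. This only measures client-observed latency, though, which is exactly why the feature request below asks for the server-side half.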
Feature Request
If this is a feature request, please fill out the following form in full:
Describe the problem the feature is intended to solve
I want to close the gap between latency as seen in
:tensorflow:core:graph_run_time_usecs_histogram_bucket
and latency as seen by the client by adding transport-level tracing. This would give me additional metrics covering network delay, request queuing, and request serialization/deserialization on the server side. I've experienced high latencies that cannot be explained by TensorBoard or the graph latency, which has turned out to be a blocker to launching some models.
Describe the solution
I want to get metrics similar to DoorDash's implementation of this tracing using gRPC interceptors. However, client interceptors alone are not enough; we need server-side interceptors to track the whole request lifecycle. Thus, we need the ability to add interceptors that can report the durations of request events. I'm not sure what the exact mechanism should be for gathering these metrics once the interceptors produce them, but somehow they should end up on the metrics endpoint that Prometheus scrapes.
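TF Serving's gRPC server is C++, so a real hook would live there, but the server-side half being requested is easiest to sketch with Python's grpcio `ServerInterceptor`. This is a sketch under that assumption, not TF Serving code; `record` is again a placeholder callback that could feed a Prometheus histogram:

```python
import time
import grpc


class ServerLatencyInterceptor(grpc.ServerInterceptor):
    """Wraps unary-unary handlers to time server-side request processing."""

    def __init__(self, record):
        # record(method_name, seconds) is a placeholder callback.
        self._record = record

    def intercept_service(self, continuation, handler_call_details):
        handler = continuation(handler_call_details)
        # Only wrap unary-unary handlers in this sketch; pass others through.
        if handler is None or handler.unary_unary is None:
            return handler
        method = handler_call_details.method
        inner = handler.unary_unary

        def timed(request, context):
            start = time.perf_counter()
            try:
                return inner(request, context)
            finally:
                self._record(method, time.perf_counter() - start)

        return grpc.unary_unary_rpc_method_handler(
            timed,
            request_deserializer=handler.request_deserializer,
            response_serializer=handler.response_serializer,
        )
```

Note this measures time inside the handler after deserialization; capturing queuing and (de)serialization separately would need hooks deeper in the transport, which is the part that only the serving binary itself can expose.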
Describe alternatives you've considered
We can't get the information we need from client-only metrics, and we have looked through all other metrics offered by TF Serving; none of them explain the extra non-graph latency. We've run latency tests from different points in our infrastructure, but having these metrics would be really valuable for pinpointing the source of latency.
System information