tikv / grpc-rs

The gRPC library for Rust built on C Core library and futures
Apache License 2.0

Collect the gRPC Poll CPU utilization #563

Open JmPotato opened 2 years ago

JmPotato commented 2 years ago

Is your feature request related to a problem? Please describe.

As mentioned in https://github.com/tikv/tikv/issues/12139, gRPC Poll is one of the major CPU consumers on the hot path, so collecting its CPU utilization would be very useful.

Describe the solution you'd like

Each CompletionQueue has its own thread, and each gRPC request creates and resolves all of its events inside a single CompletionQueue to reduce context switches. In each thread, poll_queue accounts for essentially all of the CPU consumption.

https://github.com/tikv/grpc-rs/blob/ccd0fde14b90a9769197cadf2bb657c87de20faf/src/env.rs#L13-L35

So basically we can collect the per-thread CPU times and map the usage to each gRPC request easily. However, since this crate is meant to be a generic library, a feature like this should be well-defined and easy for crate users to use and extend, so a proper way to implement it should be discussed first.
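As a rough illustration of the "collect the thread CPU times" step, here is a minimal sketch that is not part of grpc-rs. It assumes Linux with procfs and a kernel that provides `/proc/thread-self`; the function names `parse_cpu_ticks` and `current_thread_cpu_ticks` are hypothetical. The result is in clock ticks (commonly 100 per second, but that should be checked via `sysconf(_SC_CLK_TCK)` rather than assumed).

```rust
use std::fs;

/// Hypothetical helper: parses a Linux `/proc/.../stat` line and returns
/// utime + stime (fields 14 and 15 of the line) in clock ticks.
fn parse_cpu_ticks(stat: &str) -> Option<u64> {
    // Field 2 (the command name) is wrapped in parentheses and may itself
    // contain spaces or ')', so skip past the last ')'.
    let rest = &stat[stat.rfind(')')? + 1..];
    let mut fields = rest.split_whitespace();
    // `rest` starts at field 3 (state), so utime (field 14) is at offset 11.
    let utime: u64 = fields.nth(11)?.parse().ok()?;
    // stime (field 15) immediately follows utime.
    let stime: u64 = fields.next()?.parse().ok()?;
    Some(utime + stime)
}

/// Hypothetical helper: CPU time consumed so far by the calling thread,
/// in clock ticks. Returns None where `/proc/thread-self/stat` is unavailable.
fn current_thread_cpu_ticks() -> Option<u64> {
    let stat = fs::read_to_string("/proc/thread-self/stat").ok()?;
    parse_cpu_ticks(&stat)
}
```

A poll thread could call `current_thread_cpu_ticks()` on itself, or a sampler could read `/proc/self/task/<tid>/stat` for each poll thread and feed the line to the same parser. A production version would more likely use `clock_gettime` with `CLOCK_THREAD_CPUTIME_ID` for finer resolution.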

BusyJay commented 2 years ago

Interesting idea! How to map the usage to requests? And what's the performance impact?

JmPotato commented 2 years ago

> Interesting idea! How to map the usage to requests? And what's the performance impact?

We can start an independent background thread that samples at a fixed frequency. Each time a thread's CPU time is sampled, we attach it to the gRPC handler method name or gRPC context currently active in the corresponding thread. After tag.resolve() has finished, we can retrieve the corresponding information and save it.

This idea is pretty similar to pprof or resource_metering in TiKV.
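To make the sampling idea above more concrete, here is a minimal sketch of the attribution step only. Nothing in it is existing grpc-rs API: `PollThreadSlot`, `Sampler`, and their methods are hypothetical names, and the CPU reading is passed in as a plain number so the logic stays independent of how thread CPU time is actually obtained.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical per-poll-thread slot: the poll thread publishes the method
/// it is currently resolving, and the sampler thread reads it.
#[derive(Default)]
struct PollThreadSlot {
    current_method: Mutex<Option<String>>,
}

impl PollThreadSlot {
    /// Would be called by the poll thread just before resolving a tag.
    fn enter(&self, method: &str) {
        *self.current_method.lock().unwrap() = Some(method.to_owned());
    }

    /// Would be called by the poll thread once the tag has been resolved.
    fn leave(&self) {
        *self.current_method.lock().unwrap() = None;
    }
}

/// Accumulates sampled CPU time per gRPC method for one poll thread.
#[derive(Default)]
struct Sampler {
    last_cpu_ticks: u64,
    usage: HashMap<String, u64>,
}

impl Sampler {
    /// One sampling step: charge the CPU time consumed since the previous
    /// sample to whichever method is active at this instant. This is an
    /// approximation, since several requests may have run in between.
    fn sample(&mut self, slot: &PollThreadSlot, cpu_ticks_now: u64) {
        let delta = cpu_ticks_now.saturating_sub(self.last_cpu_ticks);
        self.last_cpu_ticks = cpu_ticks_now;
        let active = slot.current_method.lock().unwrap().clone();
        if let Some(method) = active {
            *self.usage.entry(method).or_insert(0) += delta;
        }
    }
}
```

In a real implementation the sampler would run on its own thread and call `sample` at a fixed interval for every poll thread. The `Mutex` keeps the sketch simple; a lock-free slot would likely be preferable on the hot path.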

As for performance, making the steps above as asynchronous as possible should minimize the impact, but some overhead still exists in theory, and the exact cost can only be known by implementing and measuring it.

BusyJay commented 2 years ago

Can the collection be implemented as a standalone crate so that it can be used directly by other libraries instead of reinventing the wheel? Does gRPC census help in this case?