Hi!
First, thanks for the contribution so far.
It is well known that balancing max_workers, maximum_concurrent_rpcs, and the number of pods/servers can be tricky. One important way to see how that balance is holding up is to monitor the size of the work queue in the server's thread pool. It would be a great addition to this lib.
I have hacked my way around it for my use case, and I'm open to contributing it here. What do you think? Below is a rough sketch of the code:
from threading import current_thread

from py_grpc_prometheus.prometheus_server_interceptor import PromServerInterceptor


class ThreadPoolQueueMetricServerInterceptor(PromServerInterceptor):
    """This interceptor measures the size of the work queue in the
    gRPC server thread pool at the end of each RPC execution.
    """

    ...  # metric setup omitted; self.queue_size_gauge is assumed to be a prometheus_client Gauge

    def _measure_queue_size(self):
        # The handler runs on a ThreadPoolExecutor worker thread. CPython starts
        # each worker with the target arguments
        # (executor_reference, work_queue, initializer, initargs),
        # so the work queue can be recovered from the thread's private _args.
        thread = current_thread()
        _, work_queue, _, _ = thread._args
        self.queue_size_gauge.set(work_queue.qsize())

    def intercept_service(self, continuation, handler_call_details):
        def wrapper(behavior_fn, *args):
            def new_behavior(request, context):
                # Call the original behaviour, then record the queue size.
                response_or_iterator = behavior_fn(request, context)
                self._measure_queue_size()
                return response_or_iterator

            return new_behavior

        return self._wrap_rpc_behavior(continuation(handler_call_details), wrapper)
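For context, here is roughly how it could be wired into a server. This is a minimal sketch: the serve() function, the port, the service registration, and the worker/concurrency numbers are placeholders, not part of the library.

from concurrent import futures

import grpc


def serve():
    # With maximum_concurrent_rpcs above max_workers, accepted RPCs that cannot
    # be picked up by a worker wait in the thread pool's work queue, which is
    # exactly what the gauge above would expose.
    server = grpc.server(
        futures.ThreadPoolExecutor(max_workers=10),
        interceptors=(ThreadPoolQueueMetricServerInterceptor(),),
        maximum_concurrent_rpcs=40,
    )
    # add_YourServiceServicer_to_server(YourServicer(), server)  # placeholder registration
    server.add_insecure_port("[::]:50051")
    server.start()
    server.wait_for_termination()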