grpc / grpc

The C based gRPC (C++, Python, Ruby, Objective-C, PHP, C#)
https://grpc.io
Apache License 2.0

Memory tracking for GRPC internal allocations #34498

Open jkammerer opened 1 year ago

jkammerer commented 1 year ago

Is your feature request related to a problem? Please describe.

We use gRPC C++ to request large volumes of data from another service. This data is streamed to us via Arrow Flight, with potentially large individual messages. Each request uses its own grpc::Channel. To guarantee the stability of our service, we need to be able to track the memory allocated for a request. In particular, the memory used to internally buffer the incoming message needs to be tracked, since messages can carry many megabytes of data. It is important for us that this tracking happens before the allocation, so that we can decide to either allow or forbid the allocation based on our memory budget.
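For context, the kind of pre-allocation accounting we have in mind is roughly the following (a minimal sketch; `BudgetTracker` and its method names are hypothetical illustrations, not part of gRPC or our code):

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical pre-allocation budget: reserve before allocating,
// release after freeing. If try_reserve() fails, the caller must
// not perform the allocation (e.g. it fails the request instead).
class BudgetTracker {
 public:
  explicit BudgetTracker(std::size_t limit) : limit_(limit), used_(0) {}

  // Returns true and records the bytes if they fit in the budget.
  bool try_reserve(std::size_t bytes) {
    std::size_t cur = used_.load(std::memory_order_relaxed);
    do {
      if (cur + bytes > limit_) return false;
    } while (!used_.compare_exchange_weak(cur, cur + bytes,
                                          std::memory_order_relaxed));
    return true;
  }

  void release(std::size_t bytes) {
    used_.fetch_sub(bytes, std::memory_order_relaxed);
  }

  std::size_t used() const { return used_.load(std::memory_order_relaxed); }

 private:
  const std::size_t limit_;
  std::atomic<std::size_t> used_;
};
```

The point is that the decision happens before the bytes exist, which is exactly what we cannot currently do for gRPC's internal receive buffers.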

Describe the solution you'd like

Here are some alternatives that would solve our problem:

Describe alternatives you've considered

Specifying a ResourceQuota for the grpc::Channel does not work for us, as we want to track the allocations and dynamically decide whether we have enough memory left in our budget to allow them. Furthermore, the ResourceQuota does not appear to cover the buffers for the incoming message: in our experiments, we saw only 16 KB being allocated against the quota while reading a 40 MB message, while the total process memory intermittently grew by ~40 MB.
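For reference, this is roughly how we attached the quota in our experiment (a sketch against the public grpc::ResourceQuota / grpc::ChannelArguments API; the quota size and variable names are illustrative):

```cpp
#include <grpcpp/grpcpp.h>
#include <grpcpp/resource_quota.h>

#include <memory>
#include <string>

std::shared_ptr<grpc::Channel> MakeBudgetedChannel(const std::string& target) {
  grpc::ResourceQuota quota("per-request-quota");
  quota.Resize(64 * 1024 * 1024);  // 64 MB cap, illustrative value

  grpc::ChannelArguments args;
  args.SetResourceQuota(quota);
  // Observed behavior: only ~16 KB was charged against the quota
  // while a 40 MB message was being buffered internally.
  return grpc::CreateCustomChannel(
      target, grpc::InsecureChannelCredentials(), args);
}
```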

Looking at the C API in grpc::core we also considered CallSizeEstimate(), but this does not cover the message size either.

Our current workaround is as follows: we limit the MaxReceiveMessageSize of the channel to the largest message size we expect to receive and always register that amount in our memory tracking. This workaround is unsatisfactory because it forces us to:

Additional context

We opened a similar issue for this problem on the arrow flight level.

markdroth commented 1 year ago

@ctiller, any thoughts on this use-case? I suspect that we don't want to add hooks to run user code to decide whether to allocate any given amount, but maybe there are other things you can suggest to help with this use-case.

ctiller commented 1 year ago

I'd be wary about leaning too hard on ResourceQuota-like mechanisms for this: we can provide an approximate bound on memory usage inside gRPC, and for your use case you should set one, probably around the size of a few in-flight messages.

gRPC doesn't grant flow control for a stream until there's a read outstanding, so my recommendation right now would be to change your protocol to include the message size in the initial metadata. You can then use our streaming/generic APIs to first read the incoming initial metadata, determine the size (you'd need to trust the sender), and decide whether to read that message now or later.
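One way this suggestion could look on the client side (a hedged sketch: the `message-size` metadata key is a hypothetical protocol convention, the budget callback is an assumption, and the announced size is only as trustworthy as the sender):

```cpp
#include <grpcpp/grpcpp.h>

#include <cstddef>
#include <cstdlib>
#include <string>

// Sketch: read the server's initial metadata first, extract the
// announced message size, and only issue the Read() (which grants
// flow control) once the memory budget admits the message.
template <typename Response>
bool ReadWithinBudget(grpc::ClientContext& ctx,
                      grpc::ClientReader<Response>& reader,
                      Response* msg,
                      bool (*try_reserve)(std::size_t)) {
  // Blocks until initial metadata has arrived, without yet
  // requesting the first message from the stream.
  reader.WaitForInitialMetadata();

  const auto& md = ctx.GetServerInitialMetadata();
  auto it = md.find("message-size");  // hypothetical protocol-defined key
  if (it == md.end()) return false;

  std::string size_str(it->second.data(), it->second.size());
  std::size_t announced = std::strtoull(size_str.c_str(), nullptr, 10);

  if (!try_reserve(announced)) {
    return false;  // over budget: defer or cancel instead of reading
  }
  return reader.Read(msg);
}
```

The key property is that no flow-control credit is handed out until `Read()` is called, so deferring the read keeps the large payload on the sender's side.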

I'd like to get that next message size into the API at some point too, but the amount of plumbing that it would take right now is prohibitive.