grpc-ecosystem / grpc-gateway

gRPC to JSON proxy generator following the gRPC HTTP spec
https://grpc-ecosystem.github.io/grpc-gateway/
BSD 3-Clause "New" or "Revised" License
17.97k stars 2.21k forks

Dynamic proxying, reflection service support #677

Open tmc opened 6 years ago

tmc commented 6 years ago

Given the capabilities of @jhump's https://github.com/jhump/protoreflect, there is an opportunity to proxy without code generation and recompilation.

This likely implies some refactoring/extraction.

jhump commented 6 years ago

I've thought about this quite a bit, actually. At a former job, my team even wrote something that did just this (though it was bespoke protobuf-based RPC, not gRPC).

The complication we faced was how often to reload the proto schema. The proxy we built supported two sources of descriptors: protoset files (the proxy binary was compiled only once, but the service descriptors still had to be compiled and provided to the proxy whenever proto sources changed) and service reflection (this was pre-gRPC, so we had our own custom service that allowed clients to download the server's descriptors). The reload question was only an issue with reflection.

We implemented it so that it re-downloaded the schema (and reconstructed the reflective proxy) every time it detected a new socket connection. We did this so that the reflective proxy was guaranteed to converge to the new schema after servers were rolling-restarted with a new version. But this occasionally caused issues because it reloaded too often, particularly when servers had very large descriptors. (A better approach may be to just poll periodically, so that a storm of reconnects doesn't result in lots of wasted processing.)

Another issue we observed was startup lag: if a given proxy was configured to talk to a lot of different services, there was noticeable latency during startup, since it would not become healthy/available until it had downloaded and processed all of the descriptors. Just some things to consider.
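To make the periodic-polling idea concrete, here is a minimal Go sketch using only the standard library. It is not taken from any existing proxy. `SchemaPoller`, `FetchFunc`, and the `rebuild` hook are hypothetical names: `fetch` stands in for whatever downloads the serialized descriptors (for example, a reflection client), and `rebuild` stands in for reconstructing the reflective proxy. The sketch also assumes the server returns byte-identical descriptors when nothing has changed, so a hash comparison is enough to skip redundant rebuilds.

```go
package main

import (
	"context"
	"crypto/sha256"
	"log"
	"sync"
	"time"
)

// FetchFunc is a hypothetical hook that downloads a server's serialized
// descriptors, for example through a reflection service.
type FetchFunc func() ([]byte, error)

// SchemaPoller (hypothetical) remembers a hash of the last descriptors it
// saw and rebuilds the proxy only when the content changes.
type SchemaPoller struct {
	fetch   FetchFunc
	rebuild func(descriptors []byte) // hypothetical: reconstructs the reflective proxy

	mu   sync.Mutex
	seen bool
	sum  [sha256.Size]byte
}

// PollOnce fetches the descriptors and reports whether a rebuild happened.
func (p *SchemaPoller) PollOnce() (bool, error) {
	raw, err := p.fetch()
	if err != nil {
		return false, err
	}
	sum := sha256.Sum256(raw)

	p.mu.Lock()
	defer p.mu.Unlock()
	if p.seen && sum == p.sum {
		return false, nil // unchanged: skip the expensive rebuild
	}
	p.sum, p.seen = sum, true
	p.rebuild(raw)
	return true, nil
}

// Run polls once immediately and then on a fixed interval until ctx is
// cancelled. The polling rate is independent of how often clients reconnect.
func (p *SchemaPoller) Run(ctx context.Context, interval time.Duration) {
	poll := func() {
		if _, err := p.PollOnce(); err != nil {
			log.Printf("schema poll failed: %v", err)
		}
	}
	poll()

	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			poll()
		}
	}
}

func main() {
	p := &SchemaPoller{
		// Stand-in fetcher that always returns the same bytes.
		fetch: func() ([]byte, error) { return []byte("descriptors-v1"), nil },
		rebuild: func(d []byte) {
			log.Printf("rebuilt proxy from %d bytes of descriptors", len(d))
		},
	}
	ctx, cancel := context.WithTimeout(context.Background(), 350*time.Millisecond)
	defer cancel()
	p.Run(ctx, 100*time.Millisecond)
}
```

With a fixed interval, the worst-case delay before the proxy converges on a new schema after a rolling restart is roughly one interval, and a storm of reconnects costs nothing extra.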

tmc commented 6 years ago

Those are great considerations. Thanks! I think for most use cases polling would be sufficient. A control protocol could be exposed to allow triggering reloads or submitting new descriptors. In terms of startup time, depending on design, descriptor loading could probably be parallelized, and perhaps the proxy could serve requests for one service even while others are still loading. Thanks for your thoughts and experience here!
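One possible shape for parallel loading with per-service availability is sketched below in Go, using only the standard library. `Registry`, `LoadFunc`, and `ErrLoadFailed` are hypothetical names invented for this illustration; `load` stands in for downloading and processing one service's descriptors. Each service is marked ready as soon as its own load finishes, regardless of how the others are doing.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrLoadFailed is a hypothetical sentinel error used by the demo loader.
var ErrLoadFailed = errors.New("descriptor load failed")

// LoadFunc is a hypothetical hook that downloads and processes the
// descriptors for a single backend service.
type LoadFunc func(service string) ([]byte, error)

// Registry (hypothetical) tracks which services have finished loading.
type Registry struct {
	mu      sync.RWMutex
	schemas map[string][]byte
}

func NewRegistry() *Registry {
	return &Registry{schemas: make(map[string][]byte)}
}

// LoadAll loads every service concurrently and blocks until all loads have
// finished. A service becomes ready the moment its own load succeeds, so
// Ready can be called from other goroutines while LoadAll is still running.
// The returned map holds the error for each service that failed.
func (r *Registry) LoadAll(services []string, load LoadFunc) map[string]error {
	var (
		wg    sync.WaitGroup
		errMu sync.Mutex
		errs  = make(map[string]error)
	)
	for _, svc := range services {
		wg.Add(1)
		go func(svc string) {
			defer wg.Done()
			schema, err := load(svc)
			if err != nil {
				errMu.Lock()
				errs[svc] = err
				errMu.Unlock()
				return
			}
			r.mu.Lock()
			r.schemas[svc] = schema
			r.mu.Unlock()
		}(svc)
	}
	wg.Wait()
	return errs
}

// Ready reports whether requests for svc can be served yet.
func (r *Registry) Ready(svc string) bool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	_, ok := r.schemas[svc]
	return ok
}

func main() {
	r := NewRegistry()
	services := []string{"users", "orders", "broken"}
	errs := r.LoadAll(services, func(svc string) ([]byte, error) {
		if svc == "broken" {
			return nil, ErrLoadFailed
		}
		return []byte("descriptors for " + svc), nil
	})
	for _, svc := range services {
		fmt.Printf("%s ready=%v err=%v\n", svc, r.Ready(svc), errs[svc])
	}
}
```

In a real proxy, `LoadAll` would probably run in a background goroutine at startup, with the request path checking `Ready` and rejecting calls for services that are still loading (for example with an HTTP 503). That way one slow or failing backend does not hold up the rest.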