Closed: jiamo closed this issue 5 years ago
With the HTTP/2 and gRPC protocols, such a model will result in uneven load balancing. For these protocols you want per-request load balancing, not per-connection load balancing, and the Gunicorn worker model can give you only per-connection load balancing. I think it is better to place several grpclib servers behind a proxy server; this could be Envoy, Nginx (since 1.13.10), HAProxy (since 1.9.2), or any other proxy server with native HTTP/2 support.
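To see why per-connection balancing is uneven for gRPC, here is a toy stdlib-only simulation (not grpclib or proxy code; backend/client counts are made up for illustration). Each gRPC client keeps one long-lived HTTP/2 connection, so pinning connections to backends can leave some backends idle, while a proxy that balances per request spreads load evenly:

```python
from collections import Counter

BACKENDS = 4
CLIENTS = 3          # few long-lived HTTP/2 connections
REQUESTS_PER_CLIENT = 1000

def per_connection_load() -> Counter:
    # Gunicorn-style model: each connection is pinned to one backend,
    # and every request on that connection lands on the same backend.
    load = Counter()
    for client in range(CLIENTS):
        backend = client % BACKENDS   # assigned once, at connect time
        load[backend] += REQUESTS_PER_CLIENT
    return load

def per_request_load() -> Counter:
    # HTTP/2-aware proxy model: each request is balanced independently
    # (round-robin here), regardless of which connection carried it.
    load = Counter()
    for i in range(CLIENTS * REQUESTS_PER_CLIENT):
        load[i % BACKENDS] += 1
    return load

print("per-connection:", dict(per_connection_load()))  # one backend gets nothing
print("per-request:  ", dict(per_request_load()))      # even spread
```

With 3 clients and 4 backends, the per-connection model leaves one backend completely idle no matter how many requests flow, while per-request balancing gives each backend an equal share.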
Thanks.
Would this method work inside Kubernetes? I have an application that sends data to a service that processes it, and would like to horizontally scale the processors.
With vanilla Kubernetes there is the same problem: connections to Services (using ClusterIP) are load-balanced on a per-connection basis. To make load distribution more even, you can use a service mesh (with sidecar proxies), e.g. Linkerd or Istio.
Thanks for the response; I have just one follow-up question.
I'm looking at using Envoy, as you mentioned in a comment above (I would like to avoid a service mesh if possible), to accommodate horizontal scaling, but my grpclib server uses stream-stream RPCs. I think that to make this work, I would need to use unary-unary instead.
Is there a big difference between unary and bidirectional streaming in terms of performance?
I think performance is not the reason to choose stream-stream over unary-unary. The difference may be noticeable, though: grpclib is a pure-Python library, and parsing headers/trailers for every message sent is not free.
The server runs as a single process. Is it possible to make the server process work like a Sanic worker, which can run as a worker under Gunicorn?
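For context on what a Gunicorn-style worker model would mean here, below is a minimal stdlib sketch (hypothetical, not grpclib API): several worker processes can share one listening port via `SO_REUSEPORT`, which is roughly how multi-worker servers accept in parallel. Note that the kernel then distributes *connections*, not requests, so for HTTP/2/gRPC this approach has the same per-connection balancing limitation discussed in this thread:

```python
import asyncio
import socket

def make_shared_socket(host: str, port: int) -> socket.socket:
    # SO_REUSEPORT (Linux/BSD) lets multiple processes bind the same
    # host:port; the kernel spreads incoming *connections* among them.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    return sock

async def worker(host: str = "127.0.0.1", port: int = 50051) -> None:
    # A real worker would serve gRPC on this pre-bound socket;
    # here a plain echo handler stands in for the gRPC server.
    sock = make_shared_socket(host, port)

    async def handle(reader, writer):
        writer.write(await reader.read(100))
        await writer.drain()
        writer.close()

    server = await asyncio.start_server(handle, sock=sock)
    async with server:
        await server.serve_forever()
```

Even with such workers, each gRPC client would stay pinned to whichever worker accepted its connection, which is why the proxy-based setup (Envoy/Nginx/HAProxy) suggested above remains the recommended way to get even load distribution.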