cirruslabs / orchard

Orchestrator for running Tart Virtual Machines on a cluster of Apple Silicon devices

Suggestion: (optionally) separate HTTP and gRPC service ports #102

Open ruimarinho opened 1 year ago

ruimarinho commented 1 year ago

Hi @edigaryev,

This is a follow-up to https://github.com/cirruslabs/orchard/pull/90 and https://github.com/cirruslabs/orchard/pull/93, which made it possible to use Orchard behind an nginx ingress in Kubernetes.

Just to recap, the Orchard client started failing to forward ports once it was put behind an nginx ingress. Initially, I thought it could be related to the lack of a certificate when using certificate-less bootstrap tokens, but @edigaryev was spot on that the missing piece of the puzzle was enabling the GRPCS backend for the /Controller endpoint on the nginx ingress, which proxies traffic to the Orchard controller¹.

This solution only worked intermittently, resulting in error messages from nginx such as:

2023/06/27 12:30:38 [error] 2282#2282: *83097928 upstream timed out (110: Operation timed out) while reading response header from upstream, client: 10.10.10.100, server: orchard.example.internal, request: "POST /Controller/Watch HTTP/2.0", upstream: "grpcs://10.10.10.100:443", host: "orchard.example.internal:443"

or

2023/06/28 11:51:03 [error] 1253#1253: *1194409 no connection data found for keepalive http2 connection while sending request to upstream, client: 10.10.10.100, server: orchard.example.internal, request: "POST /Controller/Watch HTTP/2.0", upstream: "grpcs://10.10.10.100:443", host: "orchard.example.internal:443"

After spending a bit more time investigating the root cause, it became apparent that many other users were facing a similar issue: nginx struggling to multiplex HTTP/1.1 (and perhaps HTTP/2.0 too) and gRPC streams.

One of the suggestions I've seen so far is to limit the number of keepalive connections to upstreams to 1, effectively requiring a new connection to the backend for every client that connects to nginx. Unfortunately, the nginx ingress on Kubernetes uses a single upstream {} block managed by a Lua script, so this setting has the downside of affecting all upstream connections. For high-volume systems, this isn't a long-term solution.

The configuration needs to be applied via the nginx ingress controller ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-ingress-controller
data:
  upstream-keepalive-requests: "1"

This will add keepalive_requests 1; to the global upstream {} block.
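
With that in place, the relevant part of the generated nginx.conf ends up looking roughly like this (a simplified sketch; the real ingress-nginx template selects endpoints through a Lua balancer, so the server line is a placeholder):

upstream upstream_balancer {
  server 0.0.0.1;   # placeholder; real endpoints are picked by the Lua balancer
  balancer_by_lua_block {
    balancer.balance()
  }
  keepalive 320;
  keepalive_requests 1;   # injected by upstream-keepalive-requests: "1"
  keepalive_timeout 60s;
}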

Once nginx had applied this change, I suddenly started receiving a different error when using port forwarding; this time, the gRPC stream would be terminated by RST_STREAM, suggesting the connection was being aborted.

{"level": "warn", "ts": 1687962870.3219981, "msg": "failed to watch RPC: rpc error: code - Internal desc - stream terminated by RST_STREAM with error code: INTERNAL_ERROR"}

It turns out that for bidirectional gRPC streams, nginx's default timeout values are too low (60 seconds), which meant the upstream gRPC stream would be closed after that period. To fix this, I had to increase a few settings:

nginx.ingress.kubernetes.io/server-snippet: |
  grpc_read_timeout 3600s;
  grpc_send_timeout 3600s;
  client_body_timeout 3600s;
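
For context, here's a minimal sketch of an Ingress carrying those annotations (the resource name, host, service name and port are hypothetical; per the PRs above, the /Controller path is proxied with a GRPCS backend, while the REST paths can live in a separate Ingress with an HTTPS backend):

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: orchard-controller-grpc
  annotations:
    # proxy the gRPC endpoint to the controller over TLS
    nginx.ingress.kubernetes.io/backend-protocol: "GRPCS"
    # raise the default 60s timeouts for long-lived gRPC streams
    nginx.ingress.kubernetes.io/server-snippet: |
      grpc_read_timeout 3600s;
      grpc_send_timeout 3600s;
      client_body_timeout 3600s;
spec:
  ingressClassName: nginx
  rules:
    - host: orchard.example.internal
      http:
        paths:
          - path: /Controller
            pathType: Prefix
            backend:
              service:
                name: orchard-controller
                port:
                  number: 6120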

I'm still getting occasional 502s for which I don't have a good explanation yet. One option would be to retry on a 502, but I'm unsure whether grpc_next_upstream works with a single backend (I'm only running a single orchard-controller instance):

nginx.ingress.kubernetes.io/configuration-snippet: |
  grpc_next_upstream error timeout http_502 non_idempotent;
  grpc_next_upstream_tries 3;

Overall, I believe orchard-controller would benefit from an optional segregation of its REST and gRPC APIs onto separate ports, to isolate the two kinds of traffic and avoid potential downsides from unexpected proxy behaviour.
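
To illustrate what that segregation could look like on the controller side, here's a hypothetical sketch (not Orchard's actual code; the port numbers and wiring are made up) where the REST API and the gRPC server each get a dedicated listener:

package main

import (
	"log"
	"net"
	"net/http"

	"github.com/gin-gonic/gin"
	"google.golang.org/grpc"
)

func main() {
	restAddr := ":6120" // REST (gin) listener
	grpcAddr := ":6121" // hypothetical dedicated gRPC listener

	// Serve the REST API on its own port.
	router := gin.Default()
	router.GET("/healthz", func(c *gin.Context) { c.Status(http.StatusOK) })
	go func() {
		log.Fatal(http.ListenAndServe(restAddr, router))
	}()

	// Serve gRPC on a separate port, so the ingress can proxy it
	// directly without any content-type sniffing.
	lis, err := net.Listen("tcp", grpcAddr)
	if err != nil {
		log.Fatal(err)
	}
	grpcServer := grpc.NewServer()
	// controller service registration elided
	log.Fatal(grpcServer.Serve(lis))
}

Each port could then be exposed through its own Ingress, with GRPCS for the gRPC port and plain HTTPS for the REST port.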

¹ The Orchard controller exposes a gin-gonic REST API and a gRPC server on the same port, and then uses the Content-Type header to direct traffic to one or the other.
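
That kind of single-port dispatch is a common pattern in Go servers; a minimal sketch of the general technique (not Orchard's exact code) looks like this:

package server

import (
	"net/http"
	"strings"

	"github.com/gin-gonic/gin"
	"google.golang.org/grpc"
)

// mixedHandler routes HTTP/2 requests with a gRPC content-type to the
// gRPC server and everything else to the REST router. Note that this
// only works when the handler itself is served over HTTP/2 (TLS or h2c).
func mixedHandler(grpcServer *grpc.Server, rest *gin.Engine) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.ProtoMajor == 2 && strings.HasPrefix(r.Header.Get("Content-Type"), "application/grpc") {
			grpcServer.ServeHTTP(w, r)
			return
		}
		rest.ServeHTTP(w, r)
	})
}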

edigaryev commented 1 year ago

Hi @ruimarinho 👋

Would it be possible for you to provide a minimal reproducible Kubernetes configuration (YAMLs) that triggers the 502s?

I really think that this would be a much better way of solving your issue, because it'll reveal the root cause in the nginx configuration (or even in the code).

With regards to adding an option for gRPC to listen on a separate port, I don't entirely understand how it would help in your case: if upstream {} is global for all upstream connections, it wouldn't matter whether the Orchard Controller exposes two ports as upstream connections, because the global upstream {} settings would still apply to both of them. Or am I missing something here?