l7mp / stunner-gateway-operator

STUNner Kubernetes Gateway Operator
Apache License 2.0
15 stars 6 forks source link

Automatically expose health-check ports on LBs #22

Closed rg0now closed 1 year ago

rg0now commented 1 year ago

Related issue: #21

Many cloud providers support the assignment of health-check ports to publicly exposed Gateways/services via LB Service annotations. The below example will let the provider's LB to health-check our stunnerd pods over HTTP on the port TCP:8086.

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: Gateway
metadata:
  name: health-check-gateway
  namespace: stunner
  annotations:
    stunner.l7mp.io/service-type: NodePort
    stunner.l7mp.io/expose-health-check-port: true
    service.beta.kubernetes.io/do-loadbalancer-healthcheck-port: "8086"
    service.beta.kubernetes.io/do-loadbalancer-healthcheck-protocol: "http"
    service.beta.kubernetes.io/do-loadbalancer-healthcheck-path: "/live
spec:
  ...

The way this works is that the gateway operator will copy the annotations from the Gateway verbatim into the LB service that exposes it, and the provider's LB will then pick up the annotations from this LB service and bootstrap the health-checker.

Problem: Many providers require the health-check port to be explicitly exposed in LB services, which currently the operator does not support. In particular, the automatically created LB service should contain a service-port that covers the health-check port:

apiVersion: v1
kind: Service
metadata:
  name: health-check-gateway
spec:
  ports:
...
    - name: health-check-port
      port: 8086
      protocol: TCP
...
  type: LoadBalancer

Since the operator does not create the service-port automatically, users have to create them manually which is error-prone.

Solution: Automatically create the health-checker service-ports based on the service.beta.kubernetes.io/do-loadbalancer-healthcheck-* annotations. Since exposing health-check ports is fundamentally insecure, this feature should be explicitly enabled by the user by setting stunner.l7mp.io/expose-health-check-port: true. The port should come from service.beta.kubernetes.io/do-loadbalancer-healthcheck-port, the protocol should be TCP if service.beta.kubernetes.io/do-loadbalancer-healthcheck-protocol is TCP or HTTP (other protocols may be supported later if the need arises), and the name should be something like health-check-<protocol>-<port> (hopefully, this will be unique enough for now).

Implementation: see here. PR must come with unit tests and an integration test plus docs here.

Edited: Added feature gate.

rg0now commented 1 year ago

Should we make this operator specific? Should we strive to support more than one operator? Some background: