GoogleCloudPlatform / gke-networking-recipes

Apache License 2.0

Single Cluster Regional LB (gke-l7-rilb) https example #62

Open boredabdel opened 2 years ago

boredabdel commented 2 years ago

Single Cluster Regional LB with gke-l7-rilb gatewayClass and HTTPS from client to LB

soudaburger commented 1 year ago

It's now been a year. This is the exact example I need to implement, but documentation is severely lacking on how to do this with TLS. All examples seem to show the global load balancer, but since GCP doesn't support managed certificates for regional LBs, I'm not entirely sure how this is expected to be implemented.

neoakris commented 10 months ago

Hi all, this was harder than it should have been. I figured it out while looking into something only mildly related. I also found a lack of examples, so I'll post my method here. I worked it out via trial and error, but I can confirm it works.

Tricky Parts that weren't well documented:


Setting some Bash Shell Env Vars

export CLUSTER_NAME=cluster-1
export REGION=us-central1
export ZONE=us-central1-c
export PROJECT=chrism-playground-369416
export DOMAIN=neoakris.dev
gcloud config set project $PROJECT

HTTPS Cert Prep Work:

[shell@dockerized-ACME-client:/]

(Note: /lego is intentionally using full path to binary, lego alone will say lego not found in path)

/lego --email "your-email@your-domain.com" --domains="*.neoakris.dev" --dns "manual" run

Press Y to accept TOS

Follow the CLI's directions to manually create a _acme-challenge.neoakris.dev TXT record in DNS

2023/08/17 03:43:43 [INFO] [*.neoakris.dev] acme: Preparing to solve DNS-01

lego: Please create the following TXT record in your neoakris.dev. zone:

_acme-challenge.neoakris.dev. 120 IN TXT "y1HOVUQthxMIBcQLYTn18j5pbwWGxOK770g4_wvV7Tw"

lego: Press 'Enter' when you are done

^-- Manually logged into DNS admin portal to create a TXT record, then pressed enter

#

2023/08/17 03:45:07 [INFO] [*.neoakris.dev] acme: Validations succeeded; requesting certificates

2023/08/17 03:45:08 [INFO] [*.neoakris.dev] Server responded with a certificate.

[shell@dockerized-ACME-client:/]

exit

[admin@workstation:~/cert]

ls

.neoakris.dev.crt .neoakris.dev.issuer.crt .neoakris.dev.json .neoakris.dev.key

^-- These files were created by the lego CLI

(Let's Encrypt, a public internet CA, issued the cert; lego generated the private key locally. The cert chains to a public CA root

that's baked into operating systems, so clients trust it without additional configuration.)

gcloud compute ssl-certificates create my-imported-cert --certificate=.neoakris.dev.crt --private-key=.neoakris.dev.key --region=$REGION

^-- this lets it be attached to GCP managed internal regional L7 LBs to terminate HTTPS at the LB

v-- some verification commands

gcloud compute ssl-certificates list

gcloud compute ssl-certificates describe my-imported-cert --region=$REGION


----------------------------------------------------------------------------------

**Subnet and Reserved Static Private IP Prep Work for gke-l7-rilb class of GKE Gateway API Controller:**  
* Note: default VPC auto created subnets reserve 10.128.0.0/9 (10.128.0.1 - 10.255.255.255)  
  that means 10.(0-127).x.y is fair game, so in the below commands I arbitrarily chose to reserve  
  10.127.127.0/24 to be the subnet dedicated to us-central1's Internal Regional LBs.
* The proxy-only subnet just has to exist in the same region and VPC network as your GKE cluster
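
The "fair game" reasoning in the bullets above can be double-checked with a bit of shell arithmetic. This is only a sketch and not part of the original steps: the helper names (`ip_to_int`, `cidr_first`, `cidr_last`, `cidrs_overlap`) are made up for illustration, and it only handles plain IPv4 CIDRs.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1            # split the address into four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# First address of a CIDR block, as an integer.
cidr_first() {
  prefix=${1#*/}
  mask=$(( (0xFFFFFFFF << (32 - $prefix)) & 0xFFFFFFFF ))
  echo $(( $(ip_to_int "${1%/*}") & $mask ))
}

# Last address of a CIDR block, as an integer.
cidr_last() {
  prefix=${1#*/}
  echo $(( $(cidr_first "$1") | (0xFFFFFFFF >> $prefix) ))
}

# Succeeds (exit 0) when the two CIDR blocks share at least one address.
cidrs_overlap() {
  [ "$(cidr_first "$1")" -le "$(cidr_last "$2")" ] &&
  [ "$(cidr_first "$2")" -le "$(cidr_last "$1")" ]
}

# Default VPC auto-subnet range vs. the range picked for the proxy-only subnet
if cidrs_overlap 10.128.0.0/9 10.127.127.0/24; then
  echo "overlap"
else
  echo "no overlap"
fi
```

For the two ranges used here this prints `no overlap`, since 10.127.127.255 sits just below the start of 10.128.0.0/9.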
```shell
export LB_SUBNET_NAME=regional-managed-proxy-only
export GKE_HTTPS_GATEWAY_LB_IP_NAME=https-gateway-lb-private-ip

gcloud compute networks subnets create $LB_SUBNET_NAME --purpose=REGIONAL_MANAGED_PROXY --role=ACTIVE --region=$REGION --network=default --range=10.127.127.0/24

gcloud compute addresses create $GKE_HTTPS_GATEWAY_LB_IP_NAME --purpose=SHARED_LOADBALANCER_VIP --region=$REGION --subnet=default
# ^-- pre-creates a reserved internal static IP for gke-l7-rilb.
#      Not strictly needed, but it keeps the IP stable across IaC based tear downs / rebuilds.
#      The super unintuitive part: the docs make it sound like this should
#      reference --subnet=$LB_SUBNET_NAME, but that leads to a cryptic, unhelpful
#      error when provisioning gateway.yaml
#      Also easy to miss: all of those --purpose flags matter for this to work right

export GKE_HTTPS_GATEWAY_LB_IP_VALUE=$(gcloud compute addresses describe $GKE_HTTPS_GATEWAY_LB_IP_NAME --region=$REGION --format='value(address)')
# ^-- looks up the reserved IP and stores it in a shell env var
#     (without --region, gcloud may stop and prompt for one; --format prints just the address)
echo $GKE_HTTPS_GATEWAY_LB_IP_VALUE
# 10.128.0.37
# ^-- This is a reserved static private IP that will be used by the LB, so you can configure DNS in advance if you like
```

Step 1: provision a GKE standard zonal sandbox cluster (1 node was enough for testing)

Step 2: Enable GKE Gateway API Controller
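
Steps 1 and 2 don't list commands, so here's a rough sketch of what they could look like. It assumes the env vars exported at the top ($CLUSTER_NAME, $ZONE) and uses the `--gateway-api=standard` flag from the GKE docs. The function names are hypothetical, and everything else (machine type, release channel, network) is left at gcloud's defaults, which may not match the cluster actually used above.

```shell
# Step 1 (sketch): create a 1-node zonal Standard cluster with the
# Gateway API controller enabled at creation time.
create_sandbox_cluster() {
  gcloud container clusters create "$CLUSTER_NAME" \
    --zone="$ZONE" \
    --num-nodes=1 \
    --gateway-api=standard
}

# Step 2 (sketch): enable the Gateway API controller on a cluster
# that already exists.
enable_gateway_api() {
  gcloud container clusters update "$CLUSTER_NAME" \
    --zone="$ZONE" \
    --gateway-api=standard
}

# Sanity check: gke-l7-rilb should show up among the installed GatewayClasses.
list_gateway_classes() {
  kubectl get gatewayclass
}
```

Run `create_sandbox_cluster` for a fresh cluster or `enable_gateway_api` for an existing one, then `list_gateway_classes` once the controller has had a few minutes to install its classes.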

Step 3: Install an example app and tester pod, so we'll be able to test / verify working as expected

helm upgrade --install podinfo oci://ghcr.io/stefanprodan/charts/podinfo --namespace default
kubectl get svc 
# ^-- service is named podinfo   listens on 9898

kubectl run -it curl --image=docker.io/curlimages/curl -- sh
# [shell@pod-with-curl-that-can-talk-to-private-ip-lb: ~]
exit 
# [admin@workstation:~]

Step 4: Deploy Gateway API Resources to provision Internal L7 Regional LB

tee gateway.yaml  << EOF
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: internal-https-gateway
  namespace: default
spec:
  gatewayClassName: gke-l7-rilb # Regional Internal LB
  listeners:
  - name: http
    protocol: HTTP
    port: 80
    allowedRoutes:
      kinds:
      - kind: HTTPRoute
      namespaces:
        from: All
  - name: https
    protocol: HTTPS
    port: 443
    tls:
      mode: Terminate
      options:
        networking.gke.io/pre-shared-certs: my-imported-cert # <-- made with 'gcloud compute ssl-certificates ...'
    allowedRoutes:
      kinds:
      - kind: HTTPRoute
      namespaces:
        from: All
  addresses:
    - type: NamedAddress   # Allows use of a pre-provisioned, predictable IP vs. a dynamically provisioned one.
      value: "https-gateway-lb-private-ip"  # <-- created earlier `gcloud compute addresses list`
EOF

tee httproute.yaml  << EOF
kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
  name: example
  namespace: default
spec:
  parentRefs:
  - name: internal-https-gateway #<-- reference to gateway's name, must match
  hostnames:
  - "gateway-api-example.neoakris.dev"
  rules:
  - backendRefs:
    - name: podinfo
      port: 9898
EOF

kubectl apply -f gateway.yaml
kubectl apply -f httproute.yaml

# kubectl describe (both objects), to verify both have finished (takes 2-5 min)

Step 5: Test / Verify
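
No commands are listed for this step, so the following is only a sketch of one way to check things from inside the cluster. It assumes the reserved IP from $GKE_HTTPS_GATEWAY_LB_IP_VALUE and the hostname from httproute.yaml. The `verify_gateway` helper and the `curl-verify` pod name are hypothetical. `--resolve` pins the hostname to the LB's private IP, so the test works before any DNS record exists while still validating the wildcard cert against the real hostname.

```shell
# Hypothetical helper: curl the Gateway's HTTPS listener from a throwaway pod.
#   $1 = private IP of the LB, $2 = hostname configured in httproute.yaml
verify_gateway() {
  lb_ip=$1
  host=$2
  kubectl run curl-verify --rm -i --restart=Never \
    --image=docker.io/curlimages/curl -- \
    curl -sS --resolve "${host}:443:${lb_ip}" "https://${host}/"
}
```

Example call: `verify_gateway "$GKE_HTTPS_GATEWAY_LB_IP_VALUE" gateway-api-example.neoakris.dev`. If everything is wired up, podinfo's response should come back without any certificate errors.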

neoakris commented 10 months ago

Oh right, btw, this isn't how I'd do it if I were doing it for real. (I did it this way because I was looking into something for a customer of DoiT International, https://doit.com, GCP Partners with great support at no cost to customers.)

Unfortunately for whatever reason GCP doesn't support managed certs (as in auto provision auto rotate) for private IP / internal LBs. (source: https://cloud.google.com/kubernetes-engine/docs/concepts/gateway-security#tls-support) (I say unfortunately, because there's no valid reason why they can't support it from a technological standpoint, just seems to be an unwillingness to prioritize development of the functionality.)

I'd combine the GCP LB controller's (Ingress / Gateway API) ability to upload an HTTPS cert embedded in a kube TLS secret to the GCP managed LB with cert-manager.io (a cloud agnostic way of getting a bot-managed cert that auto provisions and auto rotates; the only difference is that the cert lives in a kube secret).

https://cert-manager.io is a "kubernetes operator" (software bot / app running as a pod in the kube cluster).
It can auto provision and auto rotate a wildcard cert for *.neoakris.dev using a DNS ACME challenge
against "Let's Encrypt", a free public internet Certificate Authority. (As in, the HTTPS cert it issues is signed by a CA whose trust is baked into modern operating systems, so there's no need to add a CA as trusted.)

The cert-manager.io kube operator would create a kube secret of type tls containing the wildcard cert and keep it provisioned / auto rotated. Basically, the software bot would manage the HTTPS cert for you.
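
As a rough, untested sketch of what that could look like: the manifest below asks cert-manager for the wildcard cert and has it written into a TLS secret. It assumes cert-manager is already installed and that a ClusterIssuer with a working DNS-01 solver exists. The issuer name `letsencrypt-dns01` and the secret name `wildcard-neoakris-dev-tls` are hypothetical placeholders.

```shell
tee certificate.yaml << 'EOF'
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-neoakris-dev
  namespace: default
spec:
  secretName: wildcard-neoakris-dev-tls # <-- hypothetical name of the kube tls secret to create / keep rotated
  dnsNames:
  - "*.neoakris.dev"
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-dns01 # <-- hypothetical issuer, must already exist with a DNS-01 solver
EOF

# kubectl apply -f certificate.yaml
```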

Then I'd configure the Gateway API kube custom resources to reference the kube tls secret instead of a GCP pre-shared-cert.
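
A minimal sketch of that variant, assuming a kube TLS secret already exists in the same namespace (the secret name `wildcard-neoakris-dev-tls` is a hypothetical placeholder). The only real change from the earlier gateway.yaml is that the listener uses the Gateway API's standard `certificateRefs` field instead of the `networking.gke.io/pre-shared-certs` option. I haven't run this exact manifest, so treat it as a starting point.

```shell
tee gateway-tls-secret.yaml << 'EOF'
apiVersion: gateway.networking.k8s.io/v1beta1
kind: Gateway
metadata:
  name: internal-https-gateway
  namespace: default
spec:
  gatewayClassName: gke-l7-rilb # Regional Internal LB
  listeners:
  - name: https
    protocol: HTTPS
    port: 443
    tls:
      mode: Terminate
      certificateRefs:
      - kind: Secret
        name: wildcard-neoakris-dev-tls # <-- hypothetical kube tls secret, e.g. one kept rotated by cert-manager
    allowedRoutes:
      kinds:
      - kind: HTTPRoute
      namespaces:
        from: All
  addresses:
    - type: NamedAddress
      value: "https-gateway-lb-private-ip"
EOF

# kubectl apply -f gateway-tls-secret.yaml
```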

That way it'd be maintenance free / no need to manually rotate certs, and wouldn't have to mess with self-managed CA / PKI.