aenix-io / cozystack

Free and Open Source PaaS platform for seamless management of virtual machines, managed Kubernetes, and Databases-as-a-Service
https://cozystack.io
Apache License 2.0

Ingress addon node pods are not labeled with their role #209

Closed: kingdonb closed 4 days ago

kingdonb commented 3 weeks ago

In Cozystack 0.8:

This might be a report for KubeVirt, as I haven't checked their docs to see whether this is expected behavior or not.

When I bring up a cluster and add the ingress-nginx addon, I see that some infrastructure is created to route traffic from the outside in. It looks like a few issues conspired (#208, #200) to make this not work. I also see that:

is in the pipeline, which includes upgrades to many system components, including KubeVirt; maybe that will help with this issue. I haven't checked yet.

The remaining issue here is that when my cluster named harvey in tenant-root gets created with the ingress addon, the ingress kubernetes-harvey-ingress-nginx, with the similarly named backend kubernetes-harvey-ingress-nginx, selects the pods representing cluster nodes that have the ingress-nginx role:

selector:
  cluster.x-k8s.io/cluster-name: kubernetes-harvey
  node-role.kubernetes.io/ingress-nginx: ""

The nodes did indeed all get assigned the ingress-nginx role at create time by the Helm chart values, but the node-role.kubernetes.io/ingress-nginx label never made it onto the pods. So the Endpoints object with the matching selector has zero backends, and traffic is not routed.
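A quick way to see the mismatch (a sketch only; the tenant-root namespace, cluster name, and label keys are taken from this report, and the exact output may differ):

# The VM pods exist and carry the cluster-name label:
kubectl -n tenant-root get pods -l cluster.x-k8s.io/cluster-name=kubernetes-harvey

# But adding the role label to the selector matches nothing, because the
# label was only applied to the nodes, not to the pods backing them:
kubectl -n tenant-root get pods \
  -l 'cluster.x-k8s.io/cluster-name=kubernetes-harvey,node-role.kubernetes.io/ingress-nginx='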

Applying these pod labels to the kubevirt-launcher pods, along with fixing #200 and #208, seems to enable the passthrough TLS that I was hoping to see!
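For the record, the manual workaround is just a kubectl label on the VM pods (the pod name is a placeholder; the empty value matches the "" in the selector above):

# Label the virt-launcher pod by hand so the backend's selector matches it:
kubectl -n tenant-root label pod <virt-launcher-pod-name> node-role.kubernetes.io/ingress-nginx=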

kingdonb commented 3 weeks ago

Testing #207, this issue does appear to persist.

Adding the label to one or both of the pods representing a virtual machine in the cluster completes the circuit and makes the traffic flow. Without the label added to either pod, you only get:

503 Service Temporarily Unavailable

because:

% k get endpoints
NAME                                ENDPOINTS                                                        AGE
kubernetes-harvey-ingress-nginx     <none>                                                           154m
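After labeling a VM pod as described above, the Endpoints object picks it up and the 503 goes away; the address below is illustrative, not captured output:

% k get endpoints
NAME                                ENDPOINTS                                                        AGE
kubernetes-harvey-ingress-nginx     10.244.x.x:443                                                   155m
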
kingdonb commented 3 weeks ago

Once we resolve this issue and the labels are applied to the node pods, there is still another issue. If the frontend receives a request on port 80, ideally we would pass it through to the backend on port 80 (which should redirect the request to port 443; unless the backend is another front-end router, in which case it should behave just like passthrough, so backends can complete HTTP-01 cert challenges).

What we are doing instead is taking requests on the front-end (whether they come in on port 443 or port 80) and forwarding them all to port 443 on the back-end.

The only way I know to accomplish this is with Traefik, which we aren't using for now. So instead, requests on the front-end port 80 get routed to the backend port 443, which responds with an error 400 ("HTTPS port isn't for HTTP traffic, duh"); meanwhile, requests on port 443 are no problem at all.
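In Service terms, what I'd want looks roughly like this; a sketch only, since I haven't checked how the chart actually renders the frontend Service (the name and selector are copied from the earlier comment):

apiVersion: v1
kind: Service
metadata:
  name: kubernetes-harvey-ingress-nginx
spec:
  selector:
    cluster.x-k8s.io/cluster-name: kubernetes-harvey
    node-role.kubernetes.io/ingress-nginx: ""
  ports:
    - name: http
      port: 80
      # Pass 80 through as 80, so the backend can issue its own redirect
      # or answer HTTP-01 challenges; today both ports forward to 443.
      targetPort: 80
    - name: https
      port: 443
      targetPort: 443  # TLS passthrough, unchanged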

I just captured the issue in https://youtu.be/D6qR_jCoPMw?t=3342 (it is an hour long, but this is very near the end, where you can see what happens as the gears turn while I try to understand why I got this error that I had already fixed 🤯, before eventually getting the right idea).