Open mwennrich opened 1 year ago
Very rough idea:
With #308 we can make the firewall survive the shoot migration.
However, as the firewall-controller is now maintaining a seed client for reconciliation, the seed client becomes invalid after a shoot migration. This is because we use a static service account token, which is signed with the seed cluster's service account signing key, and that key has, of course, changed after the migration. The CA and the server endpoint in the kubeconfig have changed after the migration as well.
Thus, there must be a possibility for the firewall-controller to migrate its client to the new seed. For this, I think we have two options:
If we go with the second variant, we should also consider migrating away from static service account tokens and instead start rotating certificates. We can also use bootstrap tokens to establish a trusted connection between the firewall-controller and the api-server.
Here is a brief description of what the process could look like:
The firewall gets created with a bootstrap kubeconfig, passed through userdata and placed at /etc/firewall-controller/.bootstrap.kubeconfig, along with the following roles in the shoot's seed namespace:
---
kind: ClusterRole
metadata:
  name: firewall.metal-stack.io:system:firewall-bootstrapper
rules:
- apiGroups:
  - certificates.k8s.io
  resources:
  - certificatesigningrequests
  verbs:
  - create
  - get
- apiGroups:
  - certificates.k8s.io
  resources:
  - certificatesigningrequests/firewallcontroller
  verbs:
  - create
---
kind: ClusterRoleBinding
metadata:
  name: firewall.metal-stack.io:system:firewall-bootstrapper
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: firewall.metal-stack.io:system:firewall-bootstrapper
subjects:
- kind: Group
  name: system:bootstrappers
  apiGroup: rbac.authorization.k8s.io
---
apiVersion: v1
kind: Secret
metadata:
  name: bootstrap-token-07401b
  namespace: kube-system
type: bootstrap.kubernetes.io/token
stringData:
  description: "Token for bootstrapping the metal-stack firewall-controller."
  token-id: 07401b
  token-secret: f395accd246ae52d
  expiration: <now+60m>
  usage-bootstrap-authentication: "true"
  usage-bootstrap-signing: "true"
  auth-extra-groups: system:bootstrappers
The firewall-controller starts up and uses the bootstrap kubeconfig to issue a certificate signing request (CSR)
The firewall-controller-manager can approve the CSR, enabling the firewall-controller to construct a seed client with the minimal permissions as they are currently implemented.
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  name: firewall-controller-csr
spec:
  groups:
  - system:authenticated
  request: <csr>
  signerName: kubernetes.io/kube-apiserver-client
  usages:
  - digital signature
  - key encipherment
  - client auth
  username: shoot--pcfgbt--cilium-firewall-653f3 # FCM creates a rolebinding and role for every firewall
  expirationSeconds: <1 year?>
status:
  certificate: <cert>
  conditions:
  - lastTransitionTime: "2023-06-21T10:39:54Z"
    lastUpdateTime: "2023-06-21T10:39:54Z"
    message: Auto approving firewall-controller client certificate after SubjectAccessReview.
    reason: AutoApproved
    status: "True"
    type: Approved
The firewall-controller writes the seed kubeconfig to /etc/firewall-controller/.seed.kubeconfig
The firewall-controller starts up and uses the shoot access fields from the firewall object to create the shoot client
The shoot kubeconfig is written to /etc/firewall-controller/.shoot.kubeconfig
The firewall-controller starts normal operation
The signed certificate for the firewall-controller is continuously checked by the firewall-controller-manager
If the firewall-controller receives an invalid certificate error with the client, it repeats the initial bootstrap process and creates a new seed client
fw2, fwset, fwdeployment objects have a firewall.metal-stack.io/firewall-controller-manager finalizer, but fcm has already been deleted. After removing the finalizer, migration continues, but after the restore, a new firewall is created, without deleting the old one. This results in a cluster with two firewalls.