Allow for configuring the webhook deployment from global DWOC; increase webhook replica count to 2

What does this PR do?

Adds a new DWOC field for configuring the DWO webhook server: config.webhook.
- It provides three options: the webhook server replica count (replicas), pod tolerations (tolerations), and node selector (nodeSelector).
Sets the default webhook server replica count to 2.

Since the webhook server is used for all devworkspaces, its configuration options only take effect when they are specified in the global DWOC.

Additionally, since the devworkspace-controller-manager is responsible for creating the webhook deployment, the devworkspace-controller-manager pod must be terminated (and automatically re-created by the deployment) for changes to the webhook configuration to take effect.
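
The restart described above can be sketched as a small shell helper. The function name restart_controller_manager is hypothetical and not part of this PR; it only assumes that the controller pod names start with devworkspace-controller-manager, as in the testing steps.

```shell
# Hypothetical helper (not part of DWO): delete every pod whose name starts
# with devworkspace-controller-manager in the given namespace. The owning
# deployment re-creates the pod, which then reconciles the webhook deployment.
restart_controller_manager() {
  ns="$1"
  kubectl get pods -n "$ns" -o name \
    | grep '^pod/devworkspace-controller-manager' \
    | while read -r pod; do
        # "$pod" has the form pod/<name>, which kubectl delete accepts directly.
        kubectl delete "$pod" -n "$ns" < /dev/null
      done
}
```

Usage: restart_controller_manager "$NAMESPACE". Afterwards, kubectl rollout status deployment/devworkspace-webhook-server -n $NAMESPACE can be used to wait for the webhook pods, though it may return early if the new controller pod has not yet updated the deployment.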

What issues does this PR fix or reference?

Fixes https://github.com/devfile/devworkspace-operator/issues/1272

Is it tested? How?

I recommend following the testing steps below in order, since later steps build on the state left by earlier ones.

To set up for testing, you'll need a multi-node cluster. IIRC, requesting a gcp cluster from cluster bot should provide a multi-node cluster (e.g. launch 4.16 gcp). Minikube can be configured to have multiple nodes with minikube start --nodes <node-count>, e.g. minikube start --nodes 4 && minikube addons enable ingress.

I've pushed a build of DWO with the changes from this PR to quay.io/aobuchow/devworkspace-controller:configurable-webhook for ease of testing.

Once you have your multi-node cluster running with DWO installed, retrieve the list of nodes on the cluster with kubectl get nodes:

NAME           STATUS   ROLES           AGE     VERSION  
minikube       Ready    control-plane   8m10s   v1.30.0  
minikube-m02   Ready    <none>          7m49s   v1.30.0  
minikube-m03   Ready    <none>          7m36s   v1.30.0  
minikube-m04   Ready    <none>          7m23s   v1.30.0

Verifying nodeSelector

Check which nodes the devworkspace-webhook-server pods are currently running on: run kubectl get pod -n $NAMESPACE to find the webhook pod names, then run kubectl get pod devworkspace-webhook-server... -n $NAMESPACE -o jsonpath='{.spec.nodeName}' for each webhook pod. In my case, the pods were scheduled onto nodes minikube-m03 and minikube-m04.
Add a label to the node on which you want the webhook deployed: kubectl patch node <node-name> --type='merge' --patch '{"metadata": {"labels": {"my-label": "my-value"}}}'

Modify the webhook configuration in the global DWOC to add a nodeSelector corresponding to the node label we just added: kubectl edit dwoc -n $NAMESPACE

apiVersion: controller.devfile.io/v1alpha1  
config:  
 routing:  
   clusterHostSuffix: 192.168.49.2.nip.io  
   defaultRoutingClass: basic  
+ webhook:  
+   nodeSelector:  
+     my-label: my-value  
+   replicas: 2  
 workspace:  
   imagePullPolicy: Always  
kind: DevWorkspaceOperatorConfig

Terminate the devworkspace-controller-manager pod so that it modifies the webhook deployment based on the new webhook configuration in the DWOC: kubectl delete pod devworkspace-controller-manager-... -n $NAMESPACE
Wait for the old webhook pods to terminate and for the new pods to start up successfully.
Verify that the new webhook pods were scheduled on the node that has your label applied: run kubectl get pod devworkspace-webhook-server... -n $NAMESPACE -o jsonpath='{.spec.nodeName}' for each webhook pod.
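
To avoid running the jsonpath query by hand for each pod, the per-pod check can be wrapped in a loop. This is a minimal sketch; the function name webhook_pod_nodes is hypothetical, and it assumes only that webhook pod names start with devworkspace-webhook-server.

```shell
# Hypothetical helper: print "<pod><TAB><node>" for every webhook server pod
# in the given namespace.
webhook_pod_nodes() {
  ns="$1"
  kubectl get pods -n "$ns" -o name \
    | grep '^pod/devworkspace-webhook-server' \
    | while read -r pod; do
        # Same jsonpath query as in the manual step, applied to each pod.
        node=$(kubectl get "$pod" -n "$ns" -o jsonpath='{.spec.nodeName}' < /dev/null)
        printf '%s\t%s\n' "$pod" "$node"
      done
}
```

Usage: webhook_pod_nodes "$NAMESPACE". After the nodeSelector change, every line should show the labeled node.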

Verifying tolerations

Taint the node that you labeled in the previous section: kubectl taint nodes <name-of-node-with-label> key1=value1:NoExecute. Because the taint has the NoExecute effect, all pods on that node that do not tolerate it will be evicted.

The webhook deployment will create pods scheduled onto other available/non-tainted nodes to fulfill the desired number of webhook replicas. However, since we have a nodeSelector targeting the tainted node, an additional webhook-server pod will remain in a pending state as it cannot be scheduled onto the tainted node.
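
One way to confirm the pending pod is to filter pods by phase. The sketch below uses a hypothetical function name, pending_webhook_pods, and assumes only the devworkspace-webhook-server name prefix used throughout these steps.

```shell
# Hypothetical helper: list webhook server pods that are in the Pending phase.
# Exits non-zero (from grep) when no such pod exists.
pending_webhook_pods() {
  ns="$1"
  kubectl get pods -n "$ns" --field-selector=status.phase=Pending -o name \
    | grep '^pod/devworkspace-webhook-server'
}
```

Usage: pending_webhook_pods "$NAMESPACE". Running kubectl describe on a listed pod shows the scheduling events that explain why it is pending.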

Modify the DWOC to add a toleration that allows the webhook server to be scheduled on the tainted node, then kill the devworkspace-controller-manager pod so that the webhook deployment is updated:

apiVersion: controller.devfile.io/v1alpha1  
config:  
 routing:  
   clusterHostSuffix: 192.168.49.2.nip.io  
   defaultRoutingClass: basic  
 webhook:  
   nodeSelector:  
     my-label: my-value  
   replicas: 2  
+   tolerations:  
+   - effect: NoExecute  
+     key: key1  
+     operator: Equal  
+     value: value1  
 workspace:  
   imagePullPolicy: Always  
kind: DevWorkspaceOperatorConfig

You should see the webhook server pod that was previously in a pending state enter the running state. The 2 other webhook server replica pods will terminate and be recreated as needed so that all replicas are scheduled on the node matching the nodeSelector. Afterwards, there will only be 2 webhook server pods remaining on the cluster, and they should be running on the desired node.
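
To check that the DWOC settings actually reached the deployment, the pod template can be inspected directly. This is a sketch with a hypothetical function name, webhook_scheduling; the deployment name devworkspace-webhook-server is the one used in the replicas check in this description.

```shell
# Hypothetical helper: print the nodeSelector and tolerations currently set on
# the webhook deployment's pod template.
webhook_scheduling() {
  ns="$1"
  for field in nodeSelector tolerations; do
    value=$(kubectl get deployment devworkspace-webhook-server -n "$ns" \
      -o jsonpath="{.spec.template.spec.$field}")
    printf '%s: %s\n' "$field" "$value"
  done
}
```

Usage: webhook_scheduling "$NAMESPACE". Both values should mirror what was set under config.webhook in the global DWOC.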

Verifying replicas

Modify the DWOC to increase the number of webhook server replicas:

apiVersion: controller.devfile.io/v1alpha1  
config:  
 routing:  
   clusterHostSuffix: 192.168.49.2.nip.io  
   defaultRoutingClass: basic  
 webhook:  
+   replicas: 4  
 workspace:  
   imagePullPolicy: Always  
kind: DevWorkspaceOperatorConfig

Kill the devworkspace-controller-manager pod to have the devworkspace webhook server deployment updated.
Ensure the devworkspace webhook server deployment has the correct number of replicas: kubectl get deployment devworkspace-webhook-server -n $NAMESPACE -o jsonpath='{.spec.replicas}'
Optional: try setting the number of webhook server replicas to 0 or a negative number. The CR validation should fail and prevent you from making the edit.
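
The replica check can also be scripted so that it fails loudly on a mismatch. A minimal sketch, with the hypothetical function name check_webhook_replicas:

```shell
# Hypothetical helper: compare the webhook deployment's spec.replicas with the
# expected value. Prints a message and returns non-zero on mismatch.
check_webhook_replicas() {
  ns="$1"
  want="$2"
  got=$(kubectl get deployment devworkspace-webhook-server -n "$ns" \
    -o jsonpath='{.spec.replicas}')
  if [ "$got" = "$want" ]; then
    echo "ok: spec.replicas=$got"
  else
    echo "mismatch: want $want, got $got" >&2
    return 1
  fi
}
```

Usage: check_webhook_replicas "$NAMESPACE" 4, run after the devworkspace-controller-manager pod has restarted.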

Config logging

When the webhook configuration in the DWOC contains nodeSelectors and tolerations, the logged config resembles the following:

Updated config to [routing.clusterHostSuffix=192.168.49.2.nip.io,webhook.nodeSelectors=[my-label=my-value, my-label2=my-value2],webhook.tolerations=[&Toleration{Key:key1,Operator:Equal,Value:value1,Effect:NoExecute,TolerationSeconds:nil,}, &Toleration{Key:key2,Operator:Equal,Value:value2,Effect:NoExecute,TolerationSeconds:nil,}],enableExperimentalFeatures=true]

The formatting for Tolerations is a bit awkward but using the Kubernetes implementation of String() seems sufficient, rather than re-implementing it.

PR Checklist

[ ] E2E tests pass (when PR is ready, comment /test v8-devworkspace-operator-e2e, v8-che-happy-path to trigger)
- [ ] v8-devworkspace-operator-e2e: DevWorkspace e2e test
- [ ] v8-che-happy-path: Happy path for verification integration with Che
