Closed: wallrj closed this pull request 1 year ago
| Name | Link |
|---|---|
| Latest commit | 7b22cba1d9d3f94aaf0f3af0befdf1c24e563e74 |
| Latest deploy log | https://app.netlify.com/sites/cert-manager-website/deploys/6538dd7f64c9fc00080f21e5 |
| Deploy Preview | https://deploy-preview-1331--cert-manager-website.netlify.app |
We simply use https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector to schedule cert-manager workloads to dedicated platform nodes. I think that should be included as the simplest way of achieving the desired goal. The examples you have put up are simpler to express with `nodeSelector`, I think?
Done. I agree that nodeSelector is much simpler and works the same as nodeAffinity so I've changed it. I had to explain that there is a default OS nodeSelector which you must explicitly add to your values.
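For anyone following along, a minimal sketch of what that looks like in Helm values, assuming the chart's top-level `nodeSelector` key and using `dedicated: platform` purely as an illustrative node label (the default OS selector is restated explicitly, as described above):

```yaml
# values.yaml (sketch only; "dedicated: platform" is a hypothetical node label,
# not something defined by cert-manager or this PR).
nodeSelector:
  # The chart ships with an OS selector by default; it is written out here
  # explicitly, as discussed in the comment above.
  kubernetes.io/os: linux
  # Only schedule onto nodes carrying this example label.
  dedicated: platform
```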
I am not an expert in this area, but why do we need tolerations in addition to the nodeSelector? Looks like a duplication to me, and according to https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector, just the nodeSelector should be sufficient. And this setup works well for us.
I'm not an expert either, but I guess one reason is: the `nodeSelector` only attracts the cert-manager Pods to the dedicated nodes; it doesn't stop other tenants from landing their workloads there too, for example by adding their own `nodeSelector` or `affinity.nodeAffinity` settings.

How do you prevent this happening in your cluster? Do you have an admission webhook that overrides the `nodeSelector` of the Pods of your tenants?
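To sketch that reasoning with example names (nothing below is prescribed by cert-manager; the taint `dedicated=platform:NoSchedule` and matching label are illustrative): the dedicated nodes are tainted, e.g. with `kubectl taint nodes <node> dedicated=platform:NoSchedule`, so that only Pods carrying a matching toleration can be scheduled there, while the `nodeSelector` keeps the cert-manager Pods off every other node:

```yaml
# values.yaml (sketch): the toleration admits cert-manager Pods onto nodes
# tainted with the example taint "dedicated=platform:NoSchedule"; the
# nodeSelector keeps them off all other nodes.
nodeSelector:
  kubernetes.io/os: linux
  dedicated: platform
tolerations:
- key: dedicated
  operator: Equal
  value: platform
  effect: NoSchedule
```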
I will make it clearer that there are various solutions to this problem and that this is only one suggestion.
Great contribution, @wallrj! The documentation sounds correct. A minor remark: I do not recognise `cainjection` and `startupapicheck` used in the example, but it's been a long time since I used this package.
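For context, a hedged sketch of how the same values would be repeated per component, assuming the chart exposes `webhook`, `cainjector`, and `startupapicheck` sub-keys (the `dedicated: platform` label is the same illustrative example as above):

```yaml
# values.yaml (sketch): each cert-manager component has its own nodeSelector,
# assuming these sub-chart keys exist in the chart being discussed.
nodeSelector:
  kubernetes.io/os: linux
  dedicated: platform
webhook:
  nodeSelector:
    kubernetes.io/os: linux
    dedicated: platform
cainjector:
  nodeSelector:
    kubernetes.io/os: linux
    dedicated: platform
startupapicheck:
  nodeSelector:
    kubernetes.io/os: linux
    dedicated: platform
```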
@erikgb: adding LGTM is restricted to approvers and reviewers in OWNERS files.
> How do you prevent this happening in your cluster? Do you have an admission webhook that overrides the `nodeSelector` of the Pods of your tenants?

No idea, but I can check. 😉 We run on OpenShift, and it's usually pretty "secure by default".
I think this is handled by some OpenShift "magic" described here. When a normal user, without write access to namespace resources, schedules a workload, the Pod always gets a "worker" label added to its `nodeSelector`. Since none of our nodes match worker plus something else, all end-user workloads are scheduled on worker nodes, or not scheduled at all if a user tries to set their own `nodeSelector`. Our platform team can override this default behavior by annotating system namespaces, like cert-manager.
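If I understand that mechanism correctly, the override is roughly the following; the assumption here is that an empty `openshift.io/node-selector` annotation on the namespace exempts it from the cluster-wide default node selector:

```yaml
# Sketch of the namespace annotation referred to above (assumption: an empty
# openshift.io/node-selector overrides the cluster default node selector).
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager
  annotations:
    openshift.io/node-selector: ""
```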
> I think this is handled by some OpenShift "magic" described here.
@erikgb Thanks so much for digging into that! I've added a link to that doc.
/hold cancel
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: erikgb, hawksight, SgtCoDFish
The full list of commands accepted by this bot can be found here.
The pull request process is described here.
Preview: https://deploy-preview-1331--cert-manager-website.netlify.app/docs/installation/best-practice/#isolate-cert-manager-on-dedicated-node-pools
Follow-up to #1330. Fixes: https://github.com/cert-manager/cert-manager/issues/5211

In this PR I want to give an example of how to use the `affinity` and `tolerations` Helm values. I propose running the cert-manager Pods on dedicated "platform" nodes for security reasons, but there may be other good use cases.
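As a non-authoritative sketch of what such values might look like, using `affinity.nodeAffinity` rather than `nodeSelector` and the same illustrative `dedicated=platform` label and taint as in the comments above (example names, not taken from this PR):

```yaml
# values.yaml (sketch): nodeAffinity variant of the nodeSelector examples above,
# plus a toleration for the example taint "dedicated=platform:NoSchedule".
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: dedicated
          operator: In
          values:
          - platform
tolerations:
- key: dedicated
  operator: Equal
  value: platform
  effect: NoSchedule
```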
@erikgb Please take a look