hfwen0502 opened this issue 2 years ago
cert-manager 1.6 stopped serving alpha and beta APIs: https://github.com/jetstack/cert-manager/releases/tag/v1.6.0
```
helm template ... | cmctl convert -f - | kubectl apply -f -
```

instead of

```
helm install ...
```
should work. Please feel free to submit a PR to implement the conversions in the chart (so that `helm install` works again). We haven't upgraded to cert-manager 1.6 yet on our side, so we haven't had an urgent need for the conversion.
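Concretely, the workaround could look something like this (a sketch only; the chart repo URL, release name, and namespace are assumptions, not confirmed in this thread):

```sh
# Render the chart locally, convert any deprecated cert-manager API versions
# with cmctl, then apply the result.
helm repo add admiralty https://charts.admiralty.io
kubectl create namespace admiralty   # helm template does not create it
helm template admiralty admiralty/multicluster-scheduler --namespace admiralty \
  | cmctl convert -f - \
  | kubectl apply -n admiralty -f -
```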
@adrienjt Thanks. I also just found out how to get around the `helm install` issue using the `helm template` route. Things seem to be working fine now.
Everything works fine out of the box on plain Kubernetes clusters. However, there are quite a few things that users need to change in order to get it working on OpenShift (e.g., clusterroles). Now I am facing an issue with the virtual node that represents the workload cluster:
```
oc describe node admiralty-default-ocp-eu2-1-6198a17ca3

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests  Limits
  --------           --------  ------
  cpu                0 (0%)    0 (0%)
  memory             0 (0%)    0 (0%)
  ephemeral-storage  0 (0%)    0 (0%)
```
Any idea why the resources (cpu/memory) on the virtual node are all 0? I am using a service account to authenticate between the target and workload clusters. It works fine on Kubernetes but not on OpenShift.
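One way to narrow this kind of symptom down (hypothetical commands; the controller deployment name and namespace are assumptions) is to check whether the source cluster's controller manager can actually reach the target cluster, since the virtual node's resources can stay at zero if that connection fails:

```sh
# Look for authentication/authorization errors against the target cluster.
oc -n admiralty logs deploy/admiralty-multicluster-scheduler-controller-manager \
  | grep -iE 'unauthorized|forbidden|target'
```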
I was able to figure out how to set up a kubeconfig secret for OpenShift clusters. Everything works beautifully. Love the tool!
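For anyone landing here before the docs PR, a rough sketch of what such a kubeconfig secret setup can look like (not the exact steps later contributed; the service account, secret name, and namespaces are all assumptions):

```sh
# On the target OpenShift cluster: read a token for the service account that
# Admiralty should authenticate as ("admiralty-sa" is a hypothetical name).
TOKEN=$(oc sa get-token admiralty-sa -n admiralty)
SERVER=$(oc whoami --show-server)

# Build a minimal kubeconfig around that token.
KC=target-kubeconfig
kubectl config --kubeconfig="$KC" set-cluster target --server="$SERVER" --insecure-skip-tls-verify=true
kubectl config --kubeconfig="$KC" set-credentials target --token="$TOKEN"
kubectl config --kubeconfig="$KC" set-context target --cluster=target --user=target
kubectl config --kubeconfig="$KC" use-context target

# On the source cluster: store the kubeconfig in a secret for the Target
# object to reference (the "config" key is an assumption; match whatever
# your Target's kubeconfigSecret expects).
kubectl create secret generic ocp-target -n hfwen --from-file=config="$KC"
```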
Hi @hfwen0502, I'm glad you were able to figure this out. Would you care to contribute how to set up a kubeconfig secret for OpenShift clusters to the Admiralty documentation? (PR under docs/)
Of course, I'd be happy to contribute the documentation. Can the platform be based on the IKS and ROKS services on IBM Cloud? I work in the hybrid cloud organization at IBM Research. By the way, RBAC needs to be adjusted as well on OpenShift; the changes are shown below.
```
oc edit clusterrole admiralty-multicluster-scheduler-source
```

```yaml
- apiGroups:
  - ""
  resources:
  - pods
  # add the line below
  - pods/finalizers
  verbs:
  - list
  # add the line below
  - '*'
- apiGroups:
  - multicluster.admiralty.io
  resources:
  - podchaperons
  # add the three lines below
  - podchaperons/finalizers
  - sources
  - sources/finalizers
```
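A quick way to sanity-check an edit like this (the service account name and namespace here are assumptions; match your release):

```sh
# Impersonate the scheduler's service account and confirm the new rules
# took effect for the pods/finalizers subresource.
oc auth can-i update pods --subresource=finalizers -n hfwen \
  --as=system:serviceaccount:admiralty:admiralty-multicluster-scheduler
```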
> Can the platform be based on the IKS and ROKS services on IBM Cloud?

Yes, no problem.

> By the way, RBAC needs to be adjusted as well on OpenShift.

Could you contribute the RBAC changes to the Helm chart?
A PR has been submitted that includes both the RBAC and doc changes: https://github.com/admiraltyio/admiralty/pull/134
Things only work in the default namespace on OpenShift. There are SCC (security context constraints) issues when we set up Admiralty in the non-default namespace. Errors are shown below:
```
E0128 20:33:01.214968 1 controller.go:117] error syncing 'hfwen/test-job-hscvc-ms6r7': pods "test-job-hscvc-ms6r7"
is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user
or serviceaccount, provider restricted: .spec.securityContext.fsGroup: Invalid value: []int64{1000720000}: 1000720000 is
not an allowed group, provider restricted: .spec.securityContext.seLinuxOptions.level: Invalid value: "s0:c27,c9": must be s0:c26,c25,
spec.containers[0].securityContext.runAsUser: Invalid value: 1000720000: must be in the ranges: [1000700000, 1000709999],
spec.containers[0].securityContext.seLinuxOptions.level: Invalid value: "s0:c27,c9": must be s0:c26,c25, provider
"ibm-restricted-scc": Forbidden: not usable by user or serviceaccount, provider "nonroot": Forbidden: not usable by
user or serviceaccount, provider "ibm-anyuid-scc": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid":
Forbidden: not usable by user or serviceaccount, provider "ibm-anyuid-hostpath-scc": Forbidden: not usable by user or serviceaccount,
provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden:
not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider
"ibm-anyuid-hostaccess-scc": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by
user or serviceaccount, provider "ibm-privileged-scc": Forbidden: not usable by user or serviceaccount, provider "privileged":
Forbidden: not usable by user or serviceaccount], requeuing
```
> when we set up Admiralty in the non-default namespace

When Admiralty is installed in a non-default namespace, and/or when Sources/Targets are set up (and pods created) in a non-default namespace?

Which SCC are you expecting to apply? `restricted` (the only one allowed, but not passing) or something else? If `restricted`, have you tried configuring your test job's security context to make it pass the policy? If something else, have you tried allowing the SCC for the pod's service account in that namespace?
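For the `restricted` route, that would mean pinning the test job's security context into the ranges from the error message above. A rough sketch (the job name and image are hypothetical):

```sh
# Values come straight from the SCC error: runAsUser/fsGroup must fall in
# [1000700000, 1000709999] and the SELinux level must be s0:c26,c25.
cat <<'EOF' | oc apply -n hfwen -f -
apiVersion: batch/v1
kind: Job
metadata:
  name: test-job-scc          # hypothetical name
spec:
  template:
    spec:
      securityContext:
        fsGroup: 1000700000
        seLinuxOptions:
          level: s0:c26,c25
      containers:
      - name: main
        image: busybox        # hypothetical image
        command: ["sh", "-c", "echo ok"]
        securityContext:
          runAsUser: 1000700000
      restartPolicy: Never
EOF
```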
@adrienjt Sorry, I should have made myself clear. Admiralty is always installed in the Admiralty namespace. The SCC issue occurs when sources/targets are set up in a non-default namespace. Let's assume sources/targets are in the hfwen namespace. In the annotations of the proxy pod at the source, we have the following:
* Source proxy pod:

```
Annotations: multicluster.admiralty.io/elect:
             multicluster.admiralty.io/sourcepod-manifest:
               apiVersion: v1
               kind: Pod
               spec:
                 containers:
                 - securityContext:
                     capabilities:
                       drop:
                       - KILL
                       - MKNOD
                       - SETGID
                       - SETUID
                     runAsUser: 1000690000
                 securityContext:
                   fsGroup: 1000680000
                   seLinuxOptions:
                     level: s0:c26,c15
```
On the target cluster, the PodChaperon object has this:
```
oc get podchaperons hf1-job-tvlrx-p7sp2 -o yaml
```

```yaml
apiVersion: multicluster.admiralty.io/v1alpha1
kind: PodChaperon
spec:
  containers:
  - securityContext:
      capabilities:
        drop:
        - KILL
        - MKNOD
        - SETGID
        - SETUID
      runAsUser: 1000680000
  securityContext:
    fsGroup: 1000680000
    seLinuxOptions:
      level: s0:c26,c15
```
This is a problem because the target cluster actually expects the SCC in the hfwen namespace to yield the following:

```yaml
securityContext:
  fsGroup: 1000780000        # <= something within the range [1000700000, 1000709999]
  seLinuxOptions:
    level: s0:c26,c15        # <= should be s0:c26,c25
```
Any idea how to resolve this? When sources/targets are in the default namespace, the securityContext stays empty, which is why we did not hit this problem there. I have also tried adjusting the SCC on the service account, which did not work.
On OpenShift, every namespace comes with three service accounts by default:

```
NAME       SECRETS   AGE
builder    2         44m
default    2         44m
deployer   2         44m
```
Adding the privileged SCC to the default service account in my hfwen namespace (on both source and target) seems to fix the SCC issue:

```
oc adm policy add-scc-to-user privileged -z default -n hfwen
```
@adrienjt Is this what you had in mind? Is it good practice, or the only way to resolve this?
OK, found a better solution. The OpenShift clusters on IBM Cloud come with other preconfigured SCCs. We can use a less-privileged one instead of privileged.
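For example (which SCC is appropriate depends on the workload; `ibm-restricted-scc` below is just one of the preconfigured options listed in the error message above):

```sh
# Swap the broad privileged SCC for a narrower preconfigured one.
oc adm policy remove-scc-from-user privileged -z default -n hfwen
oc adm policy add-scc-to-user ibm-restricted-scc -z default -n hfwen
```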
I am trying to explore the capabilities that Admiralty can offer in an OCP cluster provisioned on IBM Cloud. Below is the info about the OCP cluster and the cert-manager version installed:

However, when trying to install Admiralty, I encountered the issues shown below:

Any idea how to fix this?