GCS deployment on OpenShift

Prerequisites: deployed on OpenShift 3.11, 4-node setup (1 master, 3 compute, 1 infra).

Inventory file used for GCS:

## List all the kube nodes that will form the GCS cluster
## Ensure that their hostnames are correct
node1 ansible_host=10.10.10.0
node2 gcs_disks='["/dev/sdc"]'
node3 gcs_disks='["/dev/sdc"]'
node4 gcs_disks='["/dev/sdc"]'
[kube-master]
node1
[gcs-node]
node2
node3
node4
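Before running the playbook, a quick sanity check that Ansible can reach every host in this inventory (the inventory filename below is illustrative; use whatever name the file above is saved under):

# Ping every host defined in the inventory over SSH.
ansible -i inventory-gcs.ini all -m ping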
Issues faced during the deployment and the fixes applied:

1. The path to the kubectl command in the deploy-gcs.yml playbook is incorrect.
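I did not note down the corrected path. Since the playbook refers to the binary through a kubectl variable (visible in the task snippet under issue 5 below), one possible workaround is to override that variable at run time; the inventory filename and kubectl path here are only examples:

# Assumption: extra-vars can override the playbook's kubectl variable.
ansible-playbook -i inventory-gcs.ini deploy-gcs.yml -e kubectl=/usr/bin/kubectl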
2. Use gcs-node instead of kube-node in the deploy-gcs.yml playbook when deploying the gd2 pods, because the examples/inventory-gcs-only.example file has no kube-node group:
- until: peers_resp.status is defined and (peers_resp.status == 200 and peers_resp.json|length == groups['kube-node']|length)
+ until: peers_resp.status is defined and (peers_resp.status == 200 and peers_resp.json|length == groups['gcs-node']|length)
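The same kube-node reference appears again further down (issue 5). One way to apply the substitution across the whole playbook in one pass, as a sketch to be reviewed before committing:

# Replace every groups['kube-node'] lookup with groups['gcs-node'] in the playbook.
sed -i "s/groups\['kube-node'\]/groups['gcs-node']/g" deploy-gcs.yml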
3. TASK [GCS | GD2 Cluster | Wait for glusterd2-cluster to become ready] gets stuck.
Error:
I checked oc get events -n gcs, and it shows that the task fails because the gd2 StatefulSet pods cannot be created:
gluster-node2-0 in StatefulSet gluster-node2 failed error: pods "gluster-node2-0" is forbidden: unable to validate against any security context constraint: [spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[3]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[4]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[5]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed]
10s 5m 26 gluster-node3.15741a9c47da5b39 StatefulSet Warning FailedCreate statefulset-controller create Pod gluster-node3-0 in StatefulSet gluster-node3 failed error: pods "gluster-node3-0" is forbidden: unable to validate against any security context constraint: [spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[3]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[4]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[5]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed]
10s 5m 26 gluster-node4.15741a9c6a1e22bd StatefulSet Warning FailedCreate statefulset-controller create Pod gluster-node4-0 in StatefulSet gluster-node4 failed error: pods "gluster-node4-0" is forbidden: unable to validate against any security context constraint: [spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[3]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[4]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[5]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed]
Reason:
This OpenShift cluster has security context constraint (SCC) policies enabled that forbid scheduling any pod whose service account has not been explicitly granted a matching policy. That is why this works on plain Kubernetes but not on OCP.
Solution:
Give the gd2 pods a service account that is allowed to use hostPath volumes and privileged containers (a sketch of that step follows below), and then add the following to the pod declaration:
vi templates/gcs-manifests/gcs-gd2.yml.j2
serviceAccountName: gd2
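The commands for the first half of that step are not spelled out above; the following is a minimal sketch of the usual OpenShift pattern, assuming the service account is named gd2 and the pods run in the gcs namespace (mirroring the csi-nodeplugin fix under issue 6):

# Assumption: a dedicated gd2 service account in the gcs namespace; skip the
# create step if the GCS manifests already define it.
oc create serviceaccount gd2 -n gcs
# Allow pods using that service account to pass the SCC check (hostPath + privileged).
oc adm policy add-scc-to-user privileged -n gcs -z gd2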
4. TASK [GCS | GD2 Cluster | Wait for glusterd2-cluster to become ready] is failing.
Error:
0s 47s 24 gluster-node2-0.15741cf057ee0f6e Pod Warning FailedScheduling default-scheduler 0/4 nodes are available: 4 node(s) didn't match node selector.
[root@node1 ~]# oc get nodes
NAME STATUS ROLES AGE VERSION
node1 Ready master 14d v1.11.0+d4cacc0
node2 Ready infra 14d v1.11.0+d4cacc0
node3 Ready compute 14d v1.11.0+d4cacc0
node4 Ready compute 14d v1.11.0+d4cacc0
Reason:
The gd2 pods expect the node to carry the compute or master role; they are not scheduled on an infra-only node. Initially I had 1 master, 1 infra and 2 compute nodes, so the gcs nodes have to be compute or master.
Solution:
Label node2 so that it also carries the compute role (a sketch of the command follows below).
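The exact relabel command is not in these notes; on OpenShift 3.11 it would look roughly like this, with the label key and value assumed from the ROLES column shown above:

# Assumed label: mark node2 with the compute role so gd2 pods can be scheduled on it.
oc label node node2 node-role.kubernetes.io/compute=true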
After the change is done:
[root@node1 deploy]# oc get nodes
NAME STATUS ROLES AGE VERSION
node1 Ready master 14d v1.11.0+d4cacc0
node2 Ready compute,infra 14d v1.11.0+d4cacc0
node3 Ready compute 14d v1.11.0+d4cacc0
node4 Ready compute 14d v1.11.0+d4cacc0
Result:
0s 0s 1 gluster-node2-0.15741d62c298d738 Pod spec.containers{glusterd2} Normal Pulled kubelet, node2 Successfully pulled image "docker.io/gluster/glusterd2-nightly"
5. TASK [GCS | CSI Driver | Wait for csi-provisioner to become available]
Error:
fatal: [node1]: FAILED! => {
"msg": "The conditional check 'result.stdout|int == groups['kube-node']|length' failed. The error was: error while evaluating conditional (result.stdout|int == groups['kube-node']|length): 'dict object' has no attribute 'kube-node'"
Solution:
vi deploy-gcs.yml
@@ -185,7 +185,7 @@
- name: GCS | CSI Driver | Wait for csi-nodeplugin to become available
command: "{{ kubectl }} -n{{ gcs_namespace }} -ojsonpath={.status.numberAvailable} get daemonset csi-nodeplugin-glusterfsplugin"
register: result
- until: result.stdout|int == groups['kube-node']|length
+ until: result.stdout|int == groups['gcs-node']|length
delay: 10
retries: 50
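To check by hand the value this task polls, the same query can be run directly, substituting oc for the playbook's {{ kubectl }} variable; it should eventually equal the number of hosts in the gcs-node group:

# Prints .status.numberAvailable of the CSI nodeplugin DaemonSet.
oc -n gcs get daemonset csi-nodeplugin-glusterfsplugin -o jsonpath='{.status.numberAvailable}'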
6. TASK [GCS | CSI Driver | Wait for csi-nodeplugin to become available]
Error:
oc get events -n gcs
0s 21s 13 csi-nodeplugin-glusterfsplugin.15741eb52fa45844 DaemonSet Warning FailedCreate daemonset-controller Error creating: pods "csi-nodeplugin-glusterfsplugin-" is forbidden: unable to validate against any security context constraint: [spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[1].securityContext.privileged: Invalid value: true: Privileged containers are not allowed capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added]
0s 41s 14 csi-nodeplugin-glusterfsplugin.15741eb52fa45844 DaemonSet Warning FailedCreate daemonset-controller Error creating: pods "csi-nodeplugin-glusterfsplugin-" is forbidden: unable to validate against any security context constraint: [spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[1].securityContext.privileged: Invalid value: true: Privileged containers are not allowed capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added]
0s 1m 15 csi-nodeplugin-glusterfsplugin.15741eb52fa45844 DaemonSet Warning FailedCreate daemonset-controller Error creating: pods "csi-nodeplugin-glusterfsplugin-" is forbidden: unable to validate against any security context constraint: [spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[1].securityContext.privileged: Invalid value: true: Privileged containers are not allowed capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added]
0s 2m 16 csi-nodeplugin-glusterfsplugin.15741eb52fa45844 DaemonSet Warning FailedCreate daemonset-controller Error creating: pods "csi-nodeplugin-glusterfsplugin-" is forbidden: unable to validate against any security context constraint: [spec.volumes[0]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[1]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.volumes[2]: Invalid value: "hostPath": hostPath volumes are not allowed to be used spec.containers[1].securityContext.privileged: Invalid value: true: Privileged containers are not allowed capabilities.add: Invalid value: "SYS_ADMIN": capability may not be added]
Reason:
We create the csi-nodeplugin service account but never grant it the privileged SCC.
Solution:
oc adm policy add-scc-to-user privileged -ngcs -z csi-nodeplugin
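After the grant, the DaemonSet controller retries creating the pods on its own; the nodeplugin pods should start appearing shortly, which can be confirmed with:

# The FailedCreate events should stop and nodeplugin pods should show up in gcs.
oc -n gcs get pods
oc get events -n gcs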
7. The prometheus-operator in the monitoring namespace hits the same kind of SCC restriction.
Error:
12s 1m 15 prometheus-operator-c4b75f7cd.15741ff38e3a2530 ReplicaSet Warning FailedCreate replicaset-controller Error creating: pods "prometheus-operator-c4b75f7cd-" is forbidden: unable to validate against any security context constraint: [spec.containers[0].securityContext.securityContext.runAsUser: Invalid value: 65534: must be in the ranges: [1000340000, 1000349999]]
Solution:
oc adm policy add-scc-to-user privileged -nmonitoring -z prometheus-operator
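As with issue 6, the controller retries on its own once the service account is allowed through; a quick check, with the namespace taken from the command above:

# The prometheus-operator pods should now be created in the monitoring namespace.
oc -n monitoring get pods

Since the only rejected field here is runAsUser, granting the narrower anyuid SCC instead of privileged would likely be enough as well; privileged is simply what was used in this run.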