opensearch-project / helm-charts

:wheel_of_dharma: A community repository for Helm Charts of OpenSearch Project.
https://opensearch.org/docs/latest/opensearch/install/helm/
Apache License 2.0

[BUG][OpenSearch Helm 2.0.1] FailedScheduling : N pod has unbound immediate PersistentVolumeClaims #558

Open YeonghyeonKO opened 2 months ago

YeonghyeonKO commented 2 months ago

Describe the bug


NAME                                                   READY   STATUS            RESTARTS   AGE
pod/test-opensearch-helm-dashboards-7f498c4684-lld2g   1/1     Running           0          12m
pod/test-opensearch-helm-master-0                      0/1     PodInitializing   0          12m
pod/test-opensearch-helm-master-1                      0/1     PodInitializing   0          12m
pod/test-opensearch-helm-master-2                      0/1     PodInitializing   0          12m

NAME                                           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/test-opensearch-helm-dashboards        ClusterIP   172.31.193.248   <none>        5601/TCP            12m
service/test-opensearch-helm-master            ClusterIP   172.31.113.251   <none>        9200/TCP,9300/TCP   12m
service/test-opensearch-helm-master-headless   ClusterIP   None             <none>        9200/TCP,9300/TCP   12m

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/test-opensearch-helm-dashboards   1/1     1            1           12m

NAME                                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/test-opensearch-helm-dashboards-7f498c4684   1         1         1       12m

NAME                                           READY   AGE
statefulset.apps/test-opensearch-helm-master   0/3     12m

As you can see above, the master-node pods for the OpenSearch cluster never start properly. I am suspicious of the number 27, because the Kubernetes cluster has exactly 27 nodes, and every k8s worker node still has plenty of free CPU and memory. The events for each pod (e.g. pod/test-opensearch-helm-master-0) look like this:

Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  12m   default-scheduler  0/27 nodes are available: 27 pod has unbound immediate PersistentVolumeClaims.
  Warning  FailedScheduling  12m   default-scheduler  0/27 nodes are available: 27 pod has unbound immediate PersistentVolumeClaims.
  Normal   Scheduled         12m   default-scheduler  Successfully assigned test-opensearch-helm/test-opensearch-helm-master-2 to ick8ssrep01w003
  Normal   Pulling           12m   kubelet            Pulling image "docker-repo.xxx.com/hcp-docker/busybox:latest"
  Normal   Pulled            12m   kubelet            Successfully pulled image "docker-repo.xxx.com/hcp-docker/busybox:latest" in 136.527586ms
  Normal   Created           12m   kubelet            Created container fsgroup-volume
  Normal   Started           12m   kubelet            Started container fsgroup-volume
  Normal   Pulled            12m   kubelet            Container image "docker-repo.xxx.com/hcp-docker/opensearchproject/opensearch:2.0.1" already present on machine
  Normal   Created           12m   kubelet            Created container opensearch
  Normal   Started           12m   kubelet            Started container opensearch
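
The `pod has unbound immediate PersistentVolumeClaims` warning means the PVCs created from the StatefulSet's `volumeClaimTemplates` never bound to a PersistentVolume, so the message is really about storage rather than about the nodes themselves; this typically happens when the cluster has no default StorageClass (or no provisioner/PV that matches the claim). For illustration only, a default StorageClass would look roughly like this (the class and provisioner names here are placeholders, not values from my cluster):

```yaml
# Illustrative sketch only: a StorageClass marked as the cluster default, so PVCs
# created without an explicit storageClassName can be dynamically provisioned.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: sc-nfs-app-retain            # placeholder name
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: example.com/nfs          # placeholder; must match a provisioner that is actually installed
reclaimPolicy: Retain
volumeBindingMode: Immediate
```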

To Reproduce

Steps to reproduce the behavior: I tried to run OpenSearch and OpenSearch Dashboards using the Helm chart (v2.1.0) with the manifests below.

/test-opensearch-helm/namespaces.yaml

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: test-opensearch-helm
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: xxx-anyuid-hostpath-clusterrole-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: xxx-anyuid-hostpath-psp-clusterrole
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: system:serviceaccounts:test-opensearch-helm
```
/test-opensearch-helm/kustomization.yaml

```yaml
namespace: test-opensearch-helm

bases:
  # - ../../../base/common
  - ./opensearch/common
  - ./opensearch/master
  - ./opensearch-dashboards

resources:
  - namespaces.yaml
```
/test-opensearch-helm/opensearch/master/kustomization.yaml

```yaml
helmGlobals:
  chartHome: ../../../../../base/opensearch/charts

helmCharts:
  - name: opensearch-2.1.0
    version: 2.1.0
    releaseName: test-opensearch-helm
    namespace: test-opensearch-helm
    valuesFile: values.yaml
    # includeCRDs: true
```
/test-opensearch-helm/opensearch/master/values.yaml

```yaml
---
clusterName: "test-opensearch-helm"

nodeGroup: "master"

# The service that non master groups will try to connect to when joining the cluster
# This should be set to clusterName + "-" + nodeGroup for your master group
masterService: "test-opensearch-helm-master"

# OpenSearch roles that will be applied to this nodeGroup
# These will be set as environment variable "node.roles". E.g. node.roles=master,ingest,data,remote_cluster_client
roles:
  - master
  - ingest
  - data
  - remote_cluster_client
  # - ml

replicas: 3

majorVersion: "2"

global:
  # Set if you want to change the default docker registry, e.g. a private one.
  dockerRegistry: ""

# Allows you to add any config files in {{ .Values.opensearchHome }}/config
opensearchHome: /usr/share/opensearch
# such as opensearch.yml and log4j2.properties
config:
  # Values must be YAML literal style scalar / YAML multiline string.
  # <filename>: |
  #   <formatted-value(s)>
  opensearch.yml: |
    cluster.name: test-opensearch-helm

    # Bind to all interfaces because we don't know what IP address Docker will assign to us.
    network.host: 0.0.0.0

    plugins:
      security:
        ssl:
          transport:
            pemcert_filepath: /usr/share/opensearch/config/certs/opens.pem
            pemkey_filepath: /usr/share/opensearch/config/certs/opens-key.pem
            pemtrustedcas_filepath: /usr/share/opensearch/config/certs/root-ca.pem
            enforce_hostname_verification: false
          http:
            enabled: false
            pemcert_filepath: /usr/share/opensearch/config/certs/opens.pem
            pemkey_filepath: /usr/share/opensearch/config/certs/opens-key.pem
            pemtrustedcas_filepath: /usr/share/opensearch/config/certs/root-ca.pem
        allow_unsafe_democertificates: true
        allow_default_init_securityindex: true
        authcz:
          admin_dn:
            - CN=kirk,OU=client,O=client,L=test,C=de
        audit.type: internal_opensearch
        enable_snapshot_restore_privilege: true
        check_snapshot_restore_write_privileges: true
        restapi:
          roles_enabled: ["all_access", "security_rest_api_access"]
        system_indices:
          enabled: true
          indices:
            [
              ".opendistro-alerting-config",
              ".opendistro-alerting-alert*",
              ".opendistro-anomaly-results*",
              ".opendistro-anomaly-detector*",
              ".opendistro-anomaly-checkpoints",
              ".opendistro-anomaly-detection-state",
              ".opendistro-reports-*",
              ".opendistro-notifications-*",
              ".opendistro-notebooks",
              ".opendistro-asynchronous-search-response*",
            ]

# Extra environment variables to append to this nodeGroup
# This will be appended to the current 'env:' key. You can use any of the kubernetes env
# syntax here
extraEnvs:
  - name: OPENSEARCH_PASSWORD
    valueFrom:
      secretKeyRef:
        name: opens-credentials
        key: password
  - name: OPENSEARCH_USERNAME
    valueFrom:
      secretKeyRef:
        name: opens-credentials
        key: username
  - name: DISABLE_INSTALL_DEMO_CONFIG
    value: "true"

# Allows you to load environment variables from kubernetes secret or config map
envFrom: []
# - secretRef:
#     name: env-secret
# - configMapRef:
#     name: config-map

# A list of secrets and their paths to mount inside the pod
# This is useful for mounting certificates for security and for mounting
# the X-Pack license
secretMounts:
  - name: opensearch-cert
    secretName: opensearch-cert
    path: /usr/share/opensearch/config/certs
    defaultMode: 0755

hostAliases: []
# - ip: "127.0.0.1"
#   hostnames:
#   - "foo.local"
#   - "bar.local"

image:
  repository: "docker-repo.xxx.com/hcp-docker/opensearchproject/opensearch"
  # override image tag, which is .Chart.AppVersion by default
  tag: "2.0.1"
  pullPolicy: "IfNotPresent"

podAnnotations: {}
  # iam.amazonaws.com/role: es-cluster

# additionals labels
labels: {}

opensearchJavaOpts: "-Djava.net.preferIPv4Stack=true -Xms8g -Xmx8g -XX:+UnlockDiagnosticVMOptions -Xlog:gc+heap+coops=info"

resources:
  requests:
    cpu: "0.1"
    memory: "16Gi"
  limits:
    cpu: "4"
    memory: "16Gi"

initResources:
  limits:
    cpu: "200m"
    memory: "50Mi"
  requests:
    cpu: "200m"
    memory: "50Mi"

sidecarResources: {}

networkHost: "0.0.0.0"

rbac:
  create: true
  serviceAccountAnnotations: {}
  serviceAccountName: ""

podSecurityPolicy:
  create: true
  name: ""
  spec:
    privileged: true
    fsGroup:
      rule: RunAsAny
    runAsUser:
      rule: RunAsAny
    seLinux:
      rule: RunAsAny
    supplementalGroups:
      rule: RunAsAny
    volumes:
      - secret
      - configMap
      - persistentVolumeClaim
      - emptyDir

persistence:
  enabled: true
  # Set to false to disable the `fsgroup-volume` initContainer that will update permissions on the persistent disk.
  enableInitChown: true
  # override image, which is busybox by default
  image: "docker-repo.xxx.com/hcp-docker/busybox"
  # override image tag, which is latest by default
  # imageTag:
  labels:
    # Add default labels for the volumeClaimTemplate of the StatefulSet
    enabled: false
  # OpenSearch Persistent Volume Storage Class
  # If defined, storageClassName: <storageClass>
  # If set to "-", storageClassName: "", which disables dynamic provisioning
  # If undefined (the default) or set to null, no storageClassName spec is
  #   set, choosing the default provisioner.  (gp2 on AWS, standard on
  #   GKE, AWS & OpenStack)
  # storageClass: "sc-nfs-app-retain"
  accessModes:
    - ReadWriteOnce
  size: 50Gi
  annotations: {}

extraVolumes: []
  # - name: extras
  #   emptyDir: {}

extraVolumeMounts: []
  # - name: extras
  #   mountPath: /usr/share/extras
  #   readOnly: true

extraContainers: []
  # - name: do-something
  #   image: busybox
  #   command: ['do', 'something']

extraInitContainers: []
  # - name: do-somethings
  #   image: busybox
  #   command: ['do', 'something']

# This is the PriorityClass settings as defined in
# https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass
priorityClassName: ""

# By default this will make sure two pods don't end up on the same node
# Changing this to a region would allow you to spread pods across regions
antiAffinityTopologyKey: "kubernetes.io/hostname"

# Hard means that by default pods will only be scheduled if there are enough nodes for them
# and that they will never end up on the same node. Setting this to soft will do this "best effort"
antiAffinity: "soft"

# This is the node affinity settings as defined in
# https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#node-affinity-beta-feature
nodeAffinity: {}

# This is the pod topology spread constraints
# https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/
topologySpreadConstraints: []

# The default is to deploy all pods serially. By setting this to parallel all pods are started at
# the same time when bootstrapping the cluster
podManagementPolicy: "Parallel"

# The environment variables injected by service links are not used, but can lead to slow OpenSearch boot times when
# there are many services in the current namespace.
# If you experience slow pod startups you probably want to set this to `false`.
enableServiceLinks: true

protocol: http
httpPort: 9200
transportPort: 9300

service:
  labels: {}
  labelsHeadless: {}
  headless:
    annotations: {}
  type: ClusterIP
  nodePort: ""
  annotations: {}
  httpPortName: http
  transportPortName: transport
  loadBalancerIP: ""
  loadBalancerSourceRanges: []
  externalTrafficPolicy: ""

updateStrategy: RollingUpdate

# This is the max unavailable setting for the pod disruption budget
# The default value of 1 will make sure that kubernetes won't allow more than 1
# of your pods to be unavailable during maintenance
maxUnavailable: 1

podSecurityContext:
  fsGroup: 1000
  runAsUser: 1000

securityContext:
  capabilities:
    drop:
      - ALL
  # readOnlyRootFilesystem: true
  runAsNonRoot: true
  runAsUser: 1000

securityConfig:
  enabled: true
  path: "/usr/share/opensearch/plugins/opensearch-security/securityconfig"
  actionGroupsSecret:
  configSecret:
  internalUsersSecret:
  rolesSecret:
  rolesMappingSecret:
  tenantsSecret:
  # The following option simplifies securityConfig by using a single secret and
  # specifying the config files as keys in the secret instead of creating
  # different secrets for for each config file.
  # Note that this is an alternative to the individual secret configuration
  # above and shouldn't be used if the above secrets are used.
  config:
    # There are multiple ways to define the configuration here:
    # * If you define anything under data, the chart will automatically create
    #   a secret and mount it.
    # * If you define securityConfigSecret, the chart will assume this secret is
    #   created externally and mount it.
    # * It is an error to define both data and securityConfigSecret.
    securityConfigSecret: ""
    dataComplete: true
    data: {}
      # config.yml: |-
      # internal_users.yml: |-
      # roles.yml: |-
      # roles_mapping.yml: |-
      # action_groups.yml: |-
      # tenants.yml: |-

# How long to wait for opensearch to stop gracefully
terminationGracePeriod: 120

sysctlVmMaxMapCount: 262144

readinessProbe:
  failureThreshold: 3
  initialDelaySeconds: 10
  periodSeconds: 60
  successThreshold: 3
  timeoutSeconds: 60

## Use an alternate scheduler.
## ref: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/
##
schedulerName: ""

imagePullSecrets: []

nodeSelector:
  worker: "true"

tolerations: []

# Enabling this will publically expose your OpenSearch instance.
# Only enable this if you have security enabled on your cluster
ingress:
  enabled: true
  # For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
  # See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
  ingressClassName: nginx
  annotations: {}
    # kubernetes.io/ingress.class: nginx
    # kubernetes.io/tls-acme: "true"
  path: /
  hosts:
    - test-opensearch-helm.srep01.xxx.com
  tls: []
  #  - secretName: chart-example-tls
  #    hosts:
  #      - chart-example.local

nameOverride: ""
fullnameOverride: ""

masterTerminationFix: false

lifecycle:
  # preStop:
  #   exec:
  #     command: ["/bin/sh", "-c", "echo Hello from the postStart handler > /usr/share/message"]
  # postStart:
  #   exec:
  #     command:
  #       - bash
  #       - -c
  #       - |
  #         #!/bin/bash
  #         # Add a template to adjust number of shards/replicas1
  #         TEMPLATE_NAME=my_template
  #         INDEX_PATTERN="logstash-*"
  #         SHARD_COUNT=8
  #         REPLICA_COUNT=1
  #         ES_URL=http://localhost:9200
  #         while [[ "$(curl -s -o /dev/null -w '%{http_code}\n' $ES_URL)" != "200" ]]; do sleep 1; done
  #         curl -XPUT "$ES_URL/_template/$TEMPLATE_NAME" -H 'Content-Type: application/json' -d'{"index_patterns":['\""$INDEX_PATTERN"\"'],"settings":{"number_of_shards":'$SHARD_COUNT',"number_of_replicas":'$REPLICA_COUNT'}}'
  postStart:
    exec:
      command:
        - bash
        - -c
        - |
          #!/bin/bash
          # Add a template to adjust number of shards/replicas1
          ES_URL=http://admin:admin12~!@localhost:9200
          while [[ "$(curl -s -o /dev/null -w '%{http_code}\n' $ES_URL)" != "200" ]]; do sleep 1; done

          # _index_template logs-template-app
          curl -XPUT "$ES_URL/_index_template/logs-template-app" -H 'Content-Type: application/json' \
            -d '{ "index_patterns": [ "app_*", "sys_*" ], "data_stream": { "timestamp_field": { "name": "logTime" } }, "priority": 200, "template": { "settings": { "number_of_shards": 1, "number_of_replicas": 1 } } }'

          # _index_policy logs-policy-app
          curl -XDELETE "$ES_URL/_plugins/_ism/policies/logs-policy-app"
          curl -XPUT "$ES_URL/_plugins/_ism/policies/logs-policy-app" -H 'Content-Type: application/json' \
            -d '{ "policy" : { "description" : "A app log of the policy", "default_state" : "hot", "states" : [ { "name" : "hot", "actions" : [ { "retry" : { "count" : 3, "backoff" : "exponential", "delay" : "1m" }, "rollover" : { "min_index_age" : "3m" } } ], "transitions" : [ { "state_name" : "warm", "conditions" : { "min_index_age" : "3m" } } ] }, { "name" : "warm", "actions" : [ { "retry" : { "count" : 3, "backoff" : "exponential", "delay" : "1m" }, "read_only" : { } } ], "transitions" : [ { "state_name" : "delete", "conditions" : { "min_rollover_age" : "3m" } } ] }, { "name" : "delete", "actions" : [ { "retry" : { "count" : 3, "backoff" : "exponential", "delay" : "1m" }, "delete" : { } } ], "transitions" : [ ] } ], "ism_template" : [ { "index_patterns" : [ "app_*", "sys_*" ], "priority" : 0 } ] } }'

keystore: []
# To add secrets to the keystore:
#  - secretName: opensearch-encryption-key

networkPolicy:
  create: false
  ## Enable creation of NetworkPolicy resources. Only Ingress traffic is filtered for now.
  ## In order for a Pod to access OpenSearch, it needs to have the following label:
  ## {{ template "uname" . }}-client: "true"
  ## Example for default configuration to access HTTP port:
  ## opensearch-master-http-client: "true"
  ## Example for default configuration to access transport port:
  ## opensearch-master-transport-client: "true"
  http:
    enabled: false

# Deprecated
# please use the above podSecurityContext.fsGroup instead
fsGroup: ""

## Set optimal sysctl's. This requires privilege. Can be disabled if
## the system has already been preconfigured. (Ex: https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html)
## Also see: https://kubernetes.io/docs/tasks/administer-cluster/sysctl-cluster/
sysctl:
  enabled: false

## Enable to add 3rd Party / Custom plugins not offered in the default OpenSearch image.
plugins:
  enabled: false
  installList: []
  # - example-fake-plugin

# -- Array of extra K8s manifests to deploy
extraObjects: []
```
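
Note that in the values above `persistence.storageClass` is commented out, so the PVCs generated from the volumeClaimTemplate rely on the cluster's default StorageClass; if that default is missing or its provisioner is not working, every claim stays Pending and the scheduler reports exactly this error. A minimal sketch of the override that should make the claims bindable, assuming a class named `sc-nfs-app-retain` (from the comment) really exists in the cluster:

```yaml
# Sketch only: pin the StatefulSet's volumeClaimTemplates to an explicit StorageClass.
# "sc-nfs-app-retain" is assumed to be an existing, provisioner-backed class.
persistence:
  enabled: true
  storageClass: "sc-nfs-app-retain"
  accessModes:
    - ReadWriteOnce
  size: 50Gi
```

As far as I understand the chart, setting `persistence.enabled: false` would instead skip the volume claims altogether, at the cost of data not surviving pod rescheduling.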

Host/Environment (please complete the following information):

Additional context

Divyaasm commented 2 months ago

With the above error, are you able to start the cluster using OpenSearch 2.0.1?

YeonghyeonKO commented 2 months ago

@Divyaasm Hi, I deployed using the Helm chart for OpenSearch (version 2.1.0) with the OpenSearch image itself at version 2.0.1, as shown below:

image:
  repository: "docker-repo.xxx.com/hcp-docker/opensearchproject/opensearch"
  # override image tag, which is .Chart.AppVersion by default
  tag: "2.0.1"
  pullPolicy: "IfNotPresent"
YeonghyeonKO commented 2 months ago

When I first wrote this issue, the number of nodes in the Kubernetes cluster was 27.

 Warning  FailedScheduling  12m   default-scheduler  0/27 nodes are available: 27 pod has unbound immediate PersistentVolumeClaims.


Two days ago, a new worker node was added to the K8s cluster, and the events changed accordingly:

 Warning  FailedScheduling  9m   default-scheduler  0/28 nodes are available: 28 pod has unbound immediate PersistentVolumeClaims.
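
So the count in the message just tracks the total number of nodes: the claims are unbindable cluster-wide, not only on a few specific nodes. If dynamic provisioning is not available, each replica would need a pre-created PersistentVolume; a rough sketch of one such volume (the NFS server, path, and class name are placeholders, not values from my environment):

```yaml
# Hypothetical static PersistentVolume that a pending 50Gi ReadWriteOnce claim could bind to.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: test-opensearch-helm-master-pv-0
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: sc-nfs-app-retain   # must match the storageClassName requested by the PVC
  nfs:                                  # placeholder backend; any supported volume source works
    server: nfs.example.com
    path: /exports/opensearch/master-0
```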