adityajoshi12 / opentelemetry-samples


Elasticsearch pods are Pending on EKS cluster #1

Open dshamanthreddy opened 1 year ago

dshamanthreddy commented 1 year ago

Hello @adityajoshi12

My environment details:

- EKS cluster version: 1.24 (AWS EKS)
- Node group: 3 nodes
- Instance family: t2.medium

I was trying to install via Helm:

```shell
helm repo add elastic https://helm.elastic.co
helm install elasticsearch elastic/elasticsearch --set replicas=1
```

```
kubectl describe pod elasticsearch-master-0
Name:             elasticsearch-master-0
Namespace:        default
Priority:         0
Service Account:  default
Node:
Labels:           app=elasticsearch-master
                  chart=elasticsearch
                  controller-revision-hash=elasticsearch-master-6bfccdfb68
                  release=elasticsearch
                  statefulset.kubernetes.io/pod-name=elasticsearch-master-0
Annotations:      kubernetes.io/psp: eks.privileged
Status:           Pending
IP:
IPs:
Controlled By:    StatefulSet/elasticsearch-master
Init Containers:
  configure-sysctl:
    Image:      docker.elastic.co/elasticsearch/elasticsearch:8.5.1
    Port:
    Host Port:
    Command:
      sysctl
      -w
      vm.max_map_count=262144
    Environment:
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lz5sz (ro)
Containers:
  elasticsearch:
    Image:       docker.elastic.co/elasticsearch/elasticsearch:8.5.1
    Ports:       9200/TCP, 9300/TCP
    Host Ports:  0/TCP, 0/TCP
    Limits:
      cpu:     1
      memory:  2Gi
    Requests:
      cpu:     1
      memory:  2Gi
    Readiness:  exec [bash -c set -e

# Exit if ELASTIC_PASSWORD in unset
if [ -z "${ELASTIC_PASSWORD}" ]; then
  echo "ELASTIC_PASSWORD variable is missing, exiting"
  exit 1
fi

# If the node is starting up wait for the cluster to be ready (request params: "wait_for_status=green&timeout=1s" )
# Once it has started only check that the node itself is responding
START_FILE=/tmp/.es_start_file

# Disable nss cache to avoid filling dentry cache when calling curl
# This is required with Elasticsearch Docker using nss < 3.52
export NSS_SDB_USE_CACHE=no

http () {
  local path="${1}"
  local args="${2}"
  set -- -XGET -s

  if [ "$args" != "" ]; then
    set -- "$@" $args
  fi

  set -- "$@" -u "elastic:${ELASTIC_PASSWORD}"

  curl --output /dev/null -k "$@" "https://127.0.0.1:9200${path}"
}

if [ -f "${START_FILE}" ]; then
  echo 'Elasticsearch is already running, lets check the node is healthy'
  HTTP_CODE=$(http "/" "-w %{http_code}")
  RC=$?
  if [[ ${RC} -ne 0 ]]; then
    echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} https://127.0.0.1:9200/ failed with RC ${RC}"
    exit ${RC}
  fi

  # ready if HTTP code 200, 503 is tolerable if ES version is 6.x
  if [[ ${HTTP_CODE} == "200" ]]; then
    exit 0
  elif [[ ${HTTP_CODE} == "503" && "8" == "6" ]]; then
    exit 0
  else
    echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} https://127.0.0.1:9200/ failed with HTTP code ${HTTP_CODE}"
    exit 1
  fi
else
  echo 'Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=1s" )'
  if http "/_cluster/health?wait_for_status=green&timeout=1s" "--fail" ; then
    touch ${START_FILE}
    exit 0
  else
    echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )'
    exit 1
  fi
fi
] delay=10s timeout=5s period=10s #success=3 #failure=3
    Environment:
      node.name:                                             elasticsearch-master-0 (v1:metadata.name)
      cluster.initial_master_nodes:                          elasticsearch-master-0,
      node.roles:                                            master,data,data_content,data_hot,data_warm,data_cold,ingest,ml,remote_cluster_client,transform,
      discovery.seed_hosts:                                  elasticsearch-master-headless
      cluster.name:                                          elasticsearch
      network.host:                                          0.0.0.0
      ELASTIC_PASSWORD:                                      <set to the key 'password' in secret 'elasticsearch-master-credentials'>  Optional: false
      xpack.security.enabled:                                true
      xpack.security.transport.ssl.enabled:                  true
      xpack.security.http.ssl.enabled:                       true
      xpack.security.transport.ssl.verification_mode:        certificate
      xpack.security.transport.ssl.key:                      /usr/share/elasticsearch/config/certs/tls.key
      xpack.security.transport.ssl.certificate:              /usr/share/elasticsearch/config/certs/tls.crt
      xpack.security.transport.ssl.certificate_authorities:  /usr/share/elasticsearch/config/certs/ca.crt
      xpack.security.http.ssl.key:                           /usr/share/elasticsearch/config/certs/tls.key
      xpack.security.http.ssl.certificate:                   /usr/share/elasticsearch/config/certs/tls.crt
      xpack.security.http.ssl.certificate_authorities:       /usr/share/elasticsearch/config/certs/ca.crt
    Mounts:
      /usr/share/elasticsearch/config/certs from elasticsearch-certs (ro)
      /usr/share/elasticsearch/data from elasticsearch-master (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lz5sz (ro)
Volumes:
  elasticsearch-master:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  elasticsearch-master-elasticsearch-master-0
    ReadOnly:   false
  elasticsearch-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  elasticsearch-master-certs
    Optional:    false
  kube-api-access-lz5sz:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  6m7s  default-scheduler  running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition
```

The pods are stuck in the Pending state. When I describe the pod, the error is: `running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition`

When I run Jaeger, both the collector and query pods are crashing:

```
observability                   jaeger-operator-fddbfdbcf-rztrq                             2/2   Running            0             2m44s
observability                   jaeger-prod-collector-568d654d-5h926                        0/1   CrashLoopBackOff   4 (29s ago)   2m20s
observability                   jaeger-prod-query-69fbfbb9c8-l2jzf                          1/2   CrashLoopBackOff   4 (22s ago)   2m11s
opentelemetry-operator-system   opentelemetry-operator-controller-manager-8d57b497f-4k9gx   2/2   Running            0             4m46s
```
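For what it's worth, the Jaeger collector and query pods typically crash-loop in this setup because their storage backend (the Elasticsearch that is stuck in Pending) is unreachable. One way to confirm, assuming the pod names above (the container name `jaeger-query` is an assumption; `kubectl logs --previous` shows the output of the last crashed container):

```shell
# Logs from the last crashed collector container
kubectl logs -n observability jaeger-prod-collector-568d654d-5h926 --previous

# The query pod has two containers, so name one explicitly
kubectl logs -n observability jaeger-prod-query-69fbfbb9c8-l2jzf -c jaeger-query --previous
```

If the logs show connection errors against the Elasticsearch service, fixing the Pending Elasticsearch pod should clear the crash loops.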

dshamanthreddy commented 1 year ago

If you are using Helm to install ES, what is the repo URL?

adityajoshi12 commented 1 year ago

Can you share the logs of the CrashLoopBackOff pods?

dshamanthreddy commented 1 year ago

Hey @adityajoshi12, I can't get logs:

```
% kubectl logs elasticsearch-master-0
Defaulted container "elasticsearch" out of: elasticsearch, configure-sysctl (init)
```

When I describe the pod:

```
Events:
  Type     Reason            Age  From               Message
  ----     ------            ---- ----               -------
  Warning  FailedScheduling  31s  default-scheduler  running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition
```

When I check the PVC:

```
kubectl get pvc
NAME                                          STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
elasticsearch-master-elasticsearch-master-0   Pending                                      gp2            21m
```

```
kubectl describe pvc elasticsearch-master-elasticsearch-master-0
Events:
  Type     Reason                Age  From                                                                                     Message
  ----     ------                ---- ----                                                                                     -------
  Normal   WaitForFirstConsumer  40m  persistentvolume-controller                                                              waiting for first consumer to be created before binding
  Warning  ProvisioningFailed    11m  ebs.csi.aws.com_ebs-csi-controller-884c7f59f-xfctr_eec96684-2a44-4f7a-9932-2b24ddcb09e8  failed to provision volume with StorageClass "gp2": rpc error: code = Internal desc = Could not create volume "pvc-49cb6ae6-0670-4c8f-a1ef-e71204cc3dac": could not create volume in EC2: UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: 4uZW7Y2k_eV6kmx7THDKiv7ukriKsutDROLZiP8cY-Xy4pg-N_kqXwT2BYlyoKu3OHBGYOHqYA7o7rwgoT3M0yI2AQROiS9U4NZKGk8Iv3L_CgoN_ZUVaiZN1uURaQqTD0POabuSCNwjoxc3SSFdKZZpyOLJcHMCCdspSkvwKnQLSKVpY63W5p53pWtGJkB-2R-HJ9GZIL1kcug-q6WC1uc454mnpmz-oOs1nfK4G09MLD0Lm1ukr9YddWQ4CslBYn8TVPy-lEAsHFl0N-L_KlYNHHkZoHI_ERHsMB_PQVQLbjRVXcuP7mHp_EyyX7OXRhX6qAPyZr_1gLvuGajXJVU4inqaw2o9RPxPIBab3Au1n5b2UPfNDklo7JeIPxCCsX6jl6wjEh3o44qsA0HvxAeiEY0WCy2047MjMWIHaIR0XTHgv5rYFFS_WxPyDg379rrsfbAmCmdbTcAXfoHx1sM3qsQCD1c9HGyf9IzFOM826YlqiSdyTh_9HaJAkpNn9_ca4MhVDTdb2yLfm7VxxMY3tgSZIURMBRhc9QM_XigGsam6F1j2RoED0G1AeeFNyQhHN6gJW601azE5deWmpIhWu22XuOC67JKbr8zFSj7jFlckM8uYbactXBBywr_RotQfWa5ovFCNcmOgvkaT7I7huZ5g status code: 403, request id: cc48f53f-b6d7-41fb-90c2-0d5b6b669842
  Warning  ProvisioningFailed    11m  ebs.csi.aws.com_ebs-csi-controller-884c7f59f-xfctr_eec96684-2a44-4f7a-9932-2b24ddcb09e8  failed to provision volume with StorageClass "gp2": rpc error: code = Internal desc = Could not create volume "pvc-49cb6ae6-0670-4c8f-a1ef-e71204cc3dac": could not create volume in EC2: UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: LbWlyTtjrE5JsUNZrxURE4FujrJdKGZ_nhOaiaYLeZnP2yiWlbVZ-Zhw3bSCO1-dVeSAhLZgFEH_E4uKBPFiOhpDQ6XA9K-ynr0J7_OeipuXZ_g0vmTmbDTFa8kwfBBJYr9UIGEkjn-oBuwVUXRCXtIKkmtyJd8OC9141Oqrza2KtB-FjAscK4h2lEPZhpZvhPRXUlnf1LgIfMSoApyyAByGfPR8hsGvuhtZgL-Qi1FQelVwZx8rNbmCSpfvWPb0ZxkLXZHhrS5jcmzBFpbTVkWIMt7De-t911EhJ7eYBlnCGEi6MsOL00sGHJkBlwm5wgWa-iKKdgQ0FKp4tNo6CkCtizkCGmEgfwaIefxlorWswIPRaGzQup6zzhEl4H2iQlqYmTk9c38bL7IXW8ysxB4HJ42Mi3wOD2bnIUurVI0iuBuUHRigpTXh9QxZ0b7PY6t_An876dpWRb4JFzWyqL8C4EMHWiojtKVwLRofSN9ryophMeFePiWdKjmwppPv8gUFpWPry9ZD9p2IZ0-Te_El2R3sYrH68Lz6XjKa_lc8FZxQ_GRjQ2g7Pbhl3016myyOxKIKzGhrUpddPulth_ntXE9CKcsEw-H9duC_HOQaTQFavZQOWjXc-rpK-RHskLuXKvxAscfds532NOh_iCGxhyccnA status code: 403, request id: 1b830a25-9f50-4a57-9b0f-1e9963912c15
  Warning  ProvisioningFailed    11m  ebs.csi.aws.com_ebs-csi-controller-884c7f59f-xfctr_eec96684-2a44-4f7a-9932-2b24ddcb09e8  failed to provision volume with StorageClass "gp2": rpc error: code = Internal desc = Could not create volume "pvc-49cb6ae6-0670-4c8f-a1ef-e71204cc3dac": could not create volume in EC2: UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: Vs3F3kocaEtejoASxVHvBdfXkEUJZH2O7KAZ6vqHGuFx4bLhhVoqAVMTqZNULNOdFq45787OwyMZ58QVHBfM5Z1sAOPpQ8x3zRgcEtre1zOvmJm5Oubjh5piGha2y6OXnI_51nZkCUXbVeI8W56S-yVJP3KFVjhdzJrEMucaEz7e61u-Frgo_SzoTg-s0o3x6n6bgCegbXuk-G12YFk1hSqaPqMl3ZxqmSAb-3SYSfm5ZJhOwg6Z85TZZ_0E30zRCGLOfKMKiY13Klu5kd-GafUg3qniCVcYRpECKb1GuxN4L8RVQhgPQkVTzLKkNw7qOgD25LeMhWzpzkSS3h5hC0xGHzqD9egpYvI3jqE5InK15BQ3ih-dLMNDpvLL_LOGBrm-ztp7Q59qplI5pcdioTK0RFfKs6jtVqPcTHcWvUvFK14Am0Phai56qbyAIM8lIhg2WNJiVVVpH6HalUaZgoHxZLiUOBAX49eq8QZz0Fa0mhRFZcjQkuhhs40MI_AS3r_nXMOkGXFxVt-S6kAw2MhyWVkysgqrArxaPMKewTXgbvgGPFNRMnN2UNbircrmuT5hWrRP-RRnjcSUQuJMzfEU0h5aiimfQwzRq-NPQwal9F6gqUeJ6ZMFofxsGacws2tqmmFklG6-F5ulHGxcM91Vyr3NmQ status code: 403, request id: 44c62642-4ee6-42ac-9dd4-35cf0db264cd
```
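The `UnauthorizedOperation` above suggests the IAM role used by the EBS CSI controller is not allowed to create EC2 volumes, so the PVC can never bind and the pod stays Pending. A sketch of the usual remedy, assuming `eksctl` is available; the cluster name `my-cluster` and the placeholder encoded message are assumptions, not values from this thread:

```shell
# Decode the failure message to see exactly which action and principal were
# denied (the caller needs the sts:DecodeAuthorizationMessage permission;
# paste one of the encoded blobs from the events above)
aws sts decode-authorization-message --encoded-message "<encoded-message-from-the-event>"

# Grant the EBS CSI controller the AWS-managed driver policy via IRSA
eksctl utils associate-iam-oidc-provider --cluster my-cluster --approve
eksctl create iamserviceaccount \
  --cluster my-cluster \
  --namespace kube-system \
  --name ebs-csi-controller-sa \
  --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
  --override-existing-serviceaccounts \
  --approve
```

After the role is in place, restarting the `ebs-csi-controller` deployment and deleting the Pending PVC usually lets provisioning retry successfully.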

User2707 commented 8 months ago

Hi @dshamanthreddy

I have the same issue with the volume mount pending. Were you able to solve it?