adityajoshi12 / opentelemetry-samples


Elasticsearch pods are Pending on EKS cluster #1

Open dshamanthreddy opened 1 year ago

dshamanthreddy commented 1 year ago

Hello @adityajoshi12

My environment details:

- EKS cluster version: 1.24 (AWS EKS)
- Node group: 3 nodes
- Instance family: t2.medium

I was trying to install via Helm:

```shell
helm repo add elastic https://helm.elastic.co
helm install elasticsearch elastic/elasticsearch --set replicas=1
```

```
kubectl describe pod elasticsearch-master-0
Name:             elasticsearch-master-0
Namespace:        default
Priority:         0
Service Account:  default
Node:
Labels:           app=elasticsearch-master
                  chart=elasticsearch
                  controller-revision-hash=elasticsearch-master-6bfccdfb68
                  release=elasticsearch
                  statefulset.kubernetes.io/pod-name=elasticsearch-master-0
Annotations:      kubernetes.io/psp: eks.privileged
Status:           Pending
IP:
IPs:
Controlled By:    StatefulSet/elasticsearch-master
Init Containers:
  configure-sysctl:
    Image:      docker.elastic.co/elasticsearch/elasticsearch:8.5.1
    Port:
    Host Port:
    Command:
      sysctl
      -w
      vm.max_map_count=262144
    Environment:
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lz5sz (ro)
Containers:
  elasticsearch:
    Image:       docker.elastic.co/elasticsearch/elasticsearch:8.5.1
    Ports:       9200/TCP, 9300/TCP
    Host Ports:  0/TCP, 0/TCP
    Limits:
      cpu:     1
      memory:  2Gi
    Requests:
      cpu:     1
      memory:  2Gi
    Readiness:  exec [bash -c set -e

# Exit if ELASTIC_PASSWORD in unset
if [ -z "${ELASTIC_PASSWORD}" ]; then
  echo "ELASTIC_PASSWORD variable is missing, exiting"
  exit 1
fi

# If the node is starting up wait for the cluster to be ready (request params: "wait_for_status=green&timeout=1s" )
# Once it has started only check that the node itself is responding
START_FILE=/tmp/.es_start_file

# Disable nss cache to avoid filling dentry cache when calling curl
# This is required with Elasticsearch Docker using nss < 3.52
export NSS_SDB_USE_CACHE=no

http () {
  local path="${1}"
  local args="${2}"
  set -- -XGET -s

  if [ "$args" != "" ]; then
    set -- "$@" $args
  fi

  set -- "$@" -u "elastic:${ELASTIC_PASSWORD}"

  curl --output /dev/null -k "$@" "https://127.0.0.1:9200${path}"
}

if [ -f "${START_FILE}" ]; then
  echo 'Elasticsearch is already running, lets check the node is healthy'
  HTTP_CODE=$(http "/" "-w %{http_code}")
  RC=$?
  if [[ ${RC} -ne 0 ]]; then
    echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} https://127.0.0.1:9200/ failed with RC ${RC}"
    exit ${RC}
  fi

  # ready if HTTP code 200, 503 is tolerable if ES version is 6.x
  if [[ ${HTTP_CODE} == "200" ]]; then
    exit 0
  elif [[ ${HTTP_CODE} == "503" && "8" == "6" ]]; then
    exit 0
  else
    echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} https://127.0.0.1:9200/ failed with HTTP code ${HTTP_CODE}"
    exit 1
  fi
else
  echo 'Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=1s" )'
  if http "/_cluster/health?wait_for_status=green&timeout=1s" "--fail" ; then
    touch ${START_FILE}
    exit 0
  else
    echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )'
    exit 1
  fi
fi
] delay=10s timeout=5s period=10s #success=3 #failure=3
    Environment:
      node.name:                                             elasticsearch-master-0 (v1:metadata.name)
      cluster.initial_master_nodes:                          elasticsearch-master-0,
      node.roles:                                            master,data,data_content,data_hot,data_warm,data_cold,ingest,ml,remote_cluster_client,transform,
      discovery.seed_hosts:                                  elasticsearch-master-headless
      cluster.name:                                          elasticsearch
      network.host:                                          0.0.0.0
      ELASTIC_PASSWORD:                                      <set to the key 'password' in secret 'elasticsearch-master-credentials'>  Optional: false
      xpack.security.enabled:                                true
      xpack.security.transport.ssl.enabled:                  true
      xpack.security.http.ssl.enabled:                       true
      xpack.security.transport.ssl.verification_mode:        certificate
      xpack.security.transport.ssl.key:                      /usr/share/elasticsearch/config/certs/tls.key
      xpack.security.transport.ssl.certificate:              /usr/share/elasticsearch/config/certs/tls.crt
      xpack.security.transport.ssl.certificate_authorities:  /usr/share/elasticsearch/config/certs/ca.crt
      xpack.security.http.ssl.key:                           /usr/share/elasticsearch/config/certs/tls.key
      xpack.security.http.ssl.certificate:                   /usr/share/elasticsearch/config/certs/tls.crt
      xpack.security.http.ssl.certificate_authorities:       /usr/share/elasticsearch/config/certs/ca.crt
    Mounts:
      /usr/share/elasticsearch/config/certs from elasticsearch-certs (ro)
      /usr/share/elasticsearch/data from elasticsearch-master (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-lz5sz (ro)
Volumes:
  elasticsearch-master:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  elasticsearch-master-elasticsearch-master-0
    ReadOnly:   false
  elasticsearch-certs:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  elasticsearch-master-certs
    Optional:    false
  kube-api-access-lz5sz:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  6m7s  default-scheduler  running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition
```

The pods are stuck in the Pending state. When I describe the pod, the error is: `running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition`

When I run Jaeger, both the collector and query pods are crashing:

```
observability                   jaeger-operator-fddbfdbcf-rztrq                             2/2   Running            0             2m44s
observability                   jaeger-prod-collector-568d654d-5h926                        0/1   CrashLoopBackOff   4 (29s ago)   2m20s
observability                   jaeger-prod-query-69fbfbb9c8-l2jzf                          1/2   CrashLoopBackOff   4 (22s ago)   2m11s
opentelemetry-operator-system   opentelemetry-operator-controller-manager-8d57b497f-4k9gx   2/2   Running            0             4m46s
```
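For what it's worth, the Jaeger collector and query pods typically crash-loop in this setup because their storage backend (the Elasticsearch that is stuck in Pending) is unreachable. One way to confirm, assuming the pod names above (the container name `jaeger-query` is an assumption; `kubectl logs --previous` shows the output of the last crashed container):

```shell
# Logs from the last crashed collector container
kubectl logs -n observability jaeger-prod-collector-568d654d-5h926 --previous

# The query pod has two containers, so name one explicitly
kubectl logs -n observability jaeger-prod-query-69fbfbb9c8-l2jzf -c jaeger-query --previous
```

If the logs show connection errors against the Elasticsearch service, fixing the Pending Elasticsearch pod should clear the crash loops.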

dshamanthreddy commented 1 year ago

If you are using Helm to install ES, what is the repo URL?

adityajoshi12 commented 1 year ago

Can you share the logs of the CrashLoopBackOff pods?

dshamanthreddy commented 1 year ago

Hey @adityajoshi12, I can't get logs:

```
% kubectl logs elasticsearch-master-0
Defaulted container "elasticsearch" out of: elasticsearch, configure-sysctl (init)
```

When I describe the pod:

```
Events:
  Type     Reason            Age  From               Message
  ----     ------            ---- ----               -------
  Warning  FailedScheduling  31s  default-scheduler  running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition
```

When I check the PVC:

```
kubectl get pvc
NAME                                          STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
elasticsearch-master-elasticsearch-master-0   Pending                                      gp2            21m
```

```
kubectl describe pvc elasticsearch-master-elasticsearch-master-0
Events:
  Type     Reason                Age  From                                                                                     Message
  ----     ------                ---- ----                                                                                     -------
  Normal   WaitForFirstConsumer  40m  persistentvolume-controller                                                              waiting for first consumer to be created before binding
  Warning  ProvisioningFailed    11m  ebs.csi.aws.com_ebs-csi-controller-884c7f59f-xfctr_eec96684-2a44-4f7a-9932-2b24ddcb09e8  failed to provision volume with StorageClass "gp2": rpc error: code = Internal desc = Could not create volume "pvc-49cb6ae6-0670-4c8f-a1ef-e71204cc3dac": could not create volume in EC2: UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: 4uZW7Y2k_eV6kmx7THDKiv7ukriKsutDROLZiP8cY-Xy4pg-N_kqXwT2BYlyoKu3OHBGYOHqYA7o7rwgoT3M0yI2AQROiS9U4NZKGk8Iv3L_CgoN_ZUVaiZN1uURaQqTD0POabuSCNwjoxc3SSFdKZZpyOLJcHMCCdspSkvwKnQLSKVpY63W5p53pWtGJkB-2R-HJ9GZIL1kcug-q6WC1uc454mnpmz-oOs1nfK4G09MLD0Lm1ukr9YddWQ4CslBYn8TVPy-lEAsHFl0N-L_KlYNHHkZoHI_ERHsMB_PQVQLbjRVXcuP7mHp_EyyX7OXRhX6qAPyZr_1gLvuGajXJVU4inqaw2o9RPxPIBab3Au1n5b2UPfNDklo7JeIPxCCsX6jl6wjEh3o44qsA0HvxAeiEY0WCy2047MjMWIHaIR0XTHgv5rYFFS_WxPyDg379rrsfbAmCmdbTcAXfoHx1sM3qsQCD1c9HGyf9IzFOM826YlqiSdyTh_9HaJAkpNn9_ca4MhVDTdb2yLfm7VxxMY3tgSZIURMBRhc9QM_XigGsam6F1j2RoED0G1AeeFNyQhHN6gJW601azE5deWmpIhWu22XuOC67JKbr8zFSj7jFlckM8uYbactXBBywr_RotQfWa5ovFCNcmOgvkaT7I7huZ5g status code: 403, request id: cc48f53f-b6d7-41fb-90c2-0d5b6b669842
  Warning  ProvisioningFailed    11m  ebs.csi.aws.com_ebs-csi-controller-884c7f59f-xfctr_eec96684-2a44-4f7a-9932-2b24ddcb09e8  failed to provision volume with StorageClass "gp2": rpc error: code = Internal desc = Could not create volume "pvc-49cb6ae6-0670-4c8f-a1ef-e71204cc3dac": could not create volume in EC2: UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: LbWlyTtjrE5JsUNZrxURE4FujrJdKGZ_nhOaiaYLeZnP2yiWlbVZ-Zhw3bSCO1-dVeSAhLZgFEH_E4uKBPFiOhpDQ6XA9K-ynr0J7_OeipuXZ_g0vmTmbDTFa8kwfBBJYr9UIGEkjn-oBuwVUXRCXtIKkmtyJd8OC9141Oqrza2KtB-FjAscK4h2lEPZhpZvhPRXUlnf1LgIfMSoApyyAByGfPR8hsGvuhtZgL-Qi1FQelVwZx8rNbmCSpfvWPb0ZxkLXZHhrS5jcmzBFpbTVkWIMt7De-t911EhJ7eYBlnCGEi6MsOL00sGHJkBlwm5wgWa-iKKdgQ0FKp4tNo6CkCtizkCGmEgfwaIefxlorWswIPRaGzQup6zzhEl4H2iQlqYmTk9c38bL7IXW8ysxB4HJ42Mi3wOD2bnIUurVI0iuBuUHRigpTXh9QxZ0b7PY6t_An876dpWRb4JFzWyqL8C4EMHWiojtKVwLRofSN9ryophMeFePiWdKjmwppPv8gUFpWPry9ZD9p2IZ0-Te_El2R3sYrH68Lz6XjKa_lc8FZxQ_GRjQ2g7Pbhl3016myyOxKIKzGhrUpddPulth_ntXE9CKcsEw-H9duC_HOQaTQFavZQOWjXc-rpK-RHskLuXKvxAscfds532NOh_iCGxhyccnA status code: 403, request id: 1b830a25-9f50-4a57-9b0f-1e9963912c15
  Warning  ProvisioningFailed    11m  ebs.csi.aws.com_ebs-csi-controller-884c7f59f-xfctr_eec96684-2a44-4f7a-9932-2b24ddcb09e8  failed to provision volume with StorageClass "gp2": rpc error: code = Internal desc = Could not create volume "pvc-49cb6ae6-0670-4c8f-a1ef-e71204cc3dac": could not create volume in EC2: UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: Vs3F3kocaEtejoASxVHvBdfXkEUJZH2O7KAZ6vqHGuFx4bLhhVoqAVMTqZNULNOdFq45787OwyMZ58QVHBfM5Z1sAOPpQ8x3zRgcEtre1zOvmJm5Oubjh5piGha2y6OXnI_51nZkCUXbVeI8W56S-yVJP3KFVjhdzJrEMucaEz7e61u-Frgo_SzoTg-s0o3x6n6bgCegbXuk-G12YFk1hSqaPqMl3ZxqmSAb-3SYSfm5ZJhOwg6Z85TZZ_0E30zRCGLOfKMKiY13Klu5kd-GafUg3qniCVcYRpECKb1GuxN4L8RVQhgPQkVTzLKkNw7qOgD25LeMhWzpzkSS3h5hC0xGHzqD9egpYvI3jqE5InK15BQ3ih-dLMNDpvLL_LOGBrm-ztp7Q59qplI5pcdioTK0RFfKs6jtVqPcTHcWvUvFK14Am0Phai56qbyAIM8lIhg2WNJiVVVpH6HalUaZgoHxZLiUOBAX49eq8QZz0Fa0mhRFZcjQkuhhs40MI_AS3r_nXMOkGXFxVt-S6kAw2MhyWVkysgqrArxaPMKewTXgbvgGPFNRMnN2UNbircrmuT5hWrRP-RRnjcSUQuJMzfEU0h5aiimfQwzRq-NPQwal9F6gqUeJ6ZMFofxsGacws2tqmmFklG6-F5ulHGxcM91Vyr3NmQ status code: 403, request id: 44c62642-4ee6-42ac-9dd4-35cf0db264cd
```
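The `UnauthorizedOperation` above suggests the IAM role used by the EBS CSI controller is not allowed to create EC2 volumes, so the PVC can never bind and the pod stays Pending. A sketch of the usual remedy, assuming `eksctl` is available; the cluster name `my-cluster` and the placeholder encoded message are assumptions, not values from this thread:

```shell
# Decode the failure message to see exactly which action and principal were
# denied (the caller needs the sts:DecodeAuthorizationMessage permission;
# paste one of the encoded blobs from the events above)
aws sts decode-authorization-message --encoded-message "<encoded-message-from-the-event>"

# Grant the EBS CSI controller the AWS-managed driver policy via IRSA
eksctl utils associate-iam-oidc-provider --cluster my-cluster --approve
eksctl create iamserviceaccount \
  --cluster my-cluster \
  --namespace kube-system \
  --name ebs-csi-controller-sa \
  --attach-policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy \
  --override-existing-serviceaccounts \
  --approve
```

After the role is in place, restarting the `ebs-csi-controller` deployment and deleting the Pending PVC usually lets provisioning retry successfully.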

User2707 commented 8 months ago

Hi @dshamanthreddy

I have the same issue with the volume mount pending. Were you able to solve it?