adityajoshi12 / opentelemetry-samples

Elasticsearch pods are pending on EKS cluster #1

Open dshamanthreddy opened 1 year ago

dshamanthreddy commented 1 year ago

Hello @adityajoshi12

My environment details:

EKS cluster version: 1.24
Cluster on AWS EKS
Nodegroup: 3
Instance family: t2.medium

I was trying to install via Helm:

helm repo add elastic
helm install elasticsearch elastic/elasticsearch --set replicas=1

`kubectl describe pod elasticsearch-master-0
Name: elasticsearch-master-0
Namespace: default
Priority: 0
Service Account: default
Node:
Labels: app=elasticsearch-master
        chart=elasticsearch
        controller-revision-hash=elasticsearch-master-6bfccdfb68
        release=elasticsearch
Annotations: eks.privileged
Status: Pending
IP:
IPs:
Controlled By: StatefulSet/elasticsearch-master
Init Containers:
  configure-sysctl:
    Image:
    Port:
    Host Port:
    Command: sysctl -w vm.max_map_count=262144
    Environment:
    Mounts:
      /var/run/secrets/ from kube-api-access-lz5sz (ro)
Containers:
  elasticsearch:
    Image:
    Ports: 9200/TCP, 9300/TCP
    Host Ports: 0/TCP, 0/TCP
    Limits:
      cpu: 1
      memory: 2Gi
    Requests:
      cpu: 1
      memory: 2Gi
    Readiness: exec [bash -c set -e
# Exit if ELASTIC_PASSWORD in unset
if [ -z "${ELASTIC_PASSWORD}" ]; then
  echo "ELASTIC_PASSWORD variable is missing, exiting"
  exit 1
fi
# If the node is starting up wait for the cluster to be ready (request params: "wait_for_status=green&timeout=1s" )
# Once it has started only check that the node itself is responding
# Disable nss cache to avoid filling dentry cache when calling curl
# This is required with Elasticsearch Docker using nss < 3.52
http () {
  local path="${1}"
  local args="${2}"
  set -- -XGET -s
  if [ "$args" != "" ]; then
    set -- "$@" $args
  fi
  set -- "$@" -u "elastic:${ELASTIC_PASSWORD}"
  curl --output /dev/null -k "$@" "${path}"
}
if [ -f "${START_FILE}" ]; then
  echo 'Elasticsearch is already running, lets check the node is healthy'
  HTTP_CODE=$(http "/" "-w %{http_code}")
  RC=$?
  if [[ ${RC} -ne 0 ]]; then
    echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} failed with RC ${RC}"
    exit ${RC}
  fi
  # ready if HTTP code 200, 503 is tolerable if ES version is 6.x
  if [[ ${HTTP_CODE} == "200" ]]; then
    exit 0
  elif [[ ${HTTP_CODE} == "503" && "8" == "6" ]]; then
    exit 0
  else
    echo "curl --output /dev/null -k -XGET -s -w '%{http_code}' \${BASIC_AUTH} failed with HTTP code ${HTTP_CODE}"
    exit 1
  fi
else
  echo 'Waiting for elasticsearch cluster to become ready (request params: "wait_for_status=green&timeout=1s" )'
  if http "/_cluster/health?wait_for_status=green&timeout=1s" "--fail" ; then
    touch ${START_FILE}
    exit 0
  else
    echo 'Cluster is not yet ready (request params: "wait_for_status=green&timeout=1s" )'
    exit 1
  fi
fi
] delay=10s timeout=5s period=10s #success=3 #failure=3
    Environment:
      elasticsearch-master-0 (
      cluster.initial_master_nodes: elasticsearch-master-0,
      node.roles: master,data,data_content,data_hot,data_warm,data_cold,ingest,ml,remote_cluster_client,transform,
      discovery.seed_hosts: elasticsearch-master-headless
      elasticsearch
      ELASTIC_PASSWORD: <set to the key 'password' in secret 'elasticsearch-master-credentials'> Optional: false
      true
      true
      true
      certificate
      /usr/share/elasticsearch/config/certs/tls.key
      /usr/share/elasticsearch/config/certs/tls.crt
      /usr/share/elasticsearch/config/certs/ca.crt
      /usr/share/elasticsearch/config/certs/tls.key
      /usr/share/elasticsearch/config/certs/tls.crt
      /usr/share/elasticsearch/config/certs/ca.crt
    Mounts:
      /usr/share/elasticsearch/config/certs from elasticsearch-certs (ro)
      /usr/share/elasticsearch/data from elasticsearch-master (rw)
      /var/run/secrets/ from kube-api-access-lz5sz (ro)
Volumes:
  elasticsearch-master:
    Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName: elasticsearch-master-elasticsearch-master-0
    ReadOnly: false
  elasticsearch-certs:
    Type: Secret (a volume populated by a Secret)
    SecretName: elasticsearch-master-certs
    Optional: false
  kube-api-access-lz5sz:
    Type: Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds: 3607
    ConfigMapName: kube-root-ca.crt
    ConfigMapOptional:
    DownwardAPI: true
QoS Class: Burstable
Node-Selectors:
Tolerations: op=Exists for 300s
             op=Exists for 300s
Events:
  Type Reason Age From Message
  Warning FailedScheduling 6m7s default-scheduler running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition`

The pods are in the Pending state. When I describe the pod, the error is: running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition

When I run Jaeger, both the collector and query pods are crashing:

observability jaeger-operator-fddbfdbcf-rztrq 2/2 Running 0 2m44s
observability jaeger-prod-collector-568d654d-5h926 0/1 CrashLoopBackOff 4 (29s ago) 2m20s
observability jaeger-prod-query-69fbfbb9c8-l2jzf 1/2 CrashLoopBackOff 4 (22s ago) 2m11s
opentelemetry-operator-system opentelemetry-operator-controller-manager-8d57b497f-4k9gx 2/2 Running 0 4m46s
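To make a listing like the one above easier to scan, here is a minimal Python sketch that picks out the pods that are not fully ready. The function name `unhealthy_pods` is hypothetical, and the sketch assumes the whitespace-separated columns shown above (namespace, name, ready count, status). Note that it would also flag pods in states such as `Completed`, which may be fine in other contexts.

```python
def unhealthy_pods(listing):
    """Return (namespace, name, ready, status) for pods that are not fully ready.

    Assumes each line looks like:
    <namespace> <pod-name> <ready>/<total> <status> ...
    """
    result = []
    for line in listing.strip().splitlines():
        parts = line.split()
        # Skip blank lines, short lines, and an optional header row.
        if len(parts) < 4 or parts[0] == "NAMESPACE":
            continue
        namespace, name, ready, status = parts[:4]
        ready_count, _, total = ready.partition("/")
        if status != "Running" or ready_count != total:
            result.append((namespace, name, ready, status))
    return result


LISTING = """
observability jaeger-operator-fddbfdbcf-rztrq 2/2 Running 0 2m44s
observability jaeger-prod-collector-568d654d-5h926 0/1 CrashLoopBackOff 4 (29s ago) 2m20s
observability jaeger-prod-query-69fbfbb9c8-l2jzf 1/2 CrashLoopBackOff 4 (22s ago) 2m11s
opentelemetry-operator-system opentelemetry-operator-controller-manager-8d57b497f-4k9gx 2/2 Running 0 4m46s
"""

for namespace, name, ready, status in unhealthy_pods(LISTING):
    print(f"{namespace}/{name}: {status} ({ready} ready)")
```

For a pod that keeps restarting, `kubectl logs <pod> -n <namespace> --previous` shows the output of the last terminated container, which is usually where the crash reason appears.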

dshamanthreddy commented 1 year ago

If you are using Helm to install ES, what is the repo URL?

adityajoshi12 commented 1 year ago

Can you share the logs of the pods in CrashLoopBackOff?

dshamanthreddy commented 1 year ago

Hey @adityajoshi12, I can't get the logs:

% kubectl logs elasticsearch-master-0
Defaulted container "elasticsearch" out of: elasticsearch, configure-sysctl (init)

When I describe the pod:

Events: Type Reason Age From Message

Warning FailedScheduling 31s default-scheduler running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition

When I check the PVC:

kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
elasticsearch-master-elasticsearch-master-0 Pending gp2 21m

kubectl describe pvc elasticsearch-master-elasticsearch-master-0

Events: Type Reason Age From Message

Normal WaitForFirstConsumer 40m persistentvolume-controller waiting for first consumer to be created before binding

Warning ProvisioningFailed 11m failed to provision volume with StorageClass "gp2": rpc error: code = Internal desc = Could not create volume "pvc-49cb6ae6-0670-4c8f-a1ef-e71204cc3dac": could not create volume in EC2: UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: 4uZW7Y2k_eV6kmx7THDKiv7ukriKsutDROLZiP8cY-Xy4pg-N_kqXwT2BYlyoKu3OHBGYOHqYA7o7rwgoT3M0yI2AQROiS9U4NZKGk8Iv3L_CgoN_ZUVaiZN1uURaQqTD0POabuSCNwjoxc3SSFdKZZpyOLJcHMCCdspSkvwKnQLSKVpY63W5p53pWtGJkB-2R-HJ9GZIL1kcug-q6WC1uc454mnpmz-oOs1nfK4G09MLD0Lm1ukr9YddWQ4CslBYn8TVPy-lEAsHFl0N-L_KlYNHHkZoHI_ERHsMB_PQVQLbjRVXcuP7mHp_EyyX7OXRhX6qAPyZr_1gLvuGajXJVU4inqaw2o9RPxPIBab3Au1n5b2UPfNDklo7JeIPxCCsX6jl6wjEh3o44qsA0HvxAeiEY0WCy2047MjMWIHaIR0XTHgv5rYFFS_WxPyDg379rrsfbAmCmdbTcAXfoHx1sM3qsQCD1c9HGyf9IzFOM826YlqiSdyTh_9HaJAkpNn9_ca4MhVDTdb2yLfm7VxxMY3tgSZIURMBRhc9QM_XigGsam6F1j2RoED0G1AeeFNyQhHN6gJW601azE5deWmpIhWu22XuOC67JKbr8zFSj7jFlckM8uYbactXBBywr_RotQfWa5ovFCNcmOgvkaT7I7huZ5g status code: 403, request id: cc48f53f-b6d7-41fb-90c2-0d5b6b669842

Warning ProvisioningFailed 11m failed to provision volume with StorageClass "gp2": rpc error: code = Internal desc = Could not create volume "pvc-49cb6ae6-0670-4c8f-a1ef-e71204cc3dac": could not create volume in EC2: UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: LbWlyTtjrE5JsUNZrxURE4FujrJdKGZ_nhOaiaYLeZnP2yiWlbVZ-Zhw3bSCO1-dVeSAhLZgFEH_E4uKBPFiOhpDQ6XA9K-ynr0J7_OeipuXZ_g0vmTmbDTFa8kwfBBJYr9UIGEkjn-oBuwVUXRCXtIKkmtyJd8OC9141Oqrza2KtB-FjAscK4h2lEPZhpZvhPRXUlnf1LgIfMSoApyyAByGfPR8hsGvuhtZgL-Qi1FQelVwZx8rNbmCSpfvWPb0ZxkLXZHhrS5jcmzBFpbTVkWIMt7De-t911EhJ7eYBlnCGEi6MsOL00sGHJkBlwm5wgWa-iKKdgQ0FKp4tNo6CkCtizkCGmEgfwaIefxlorWswIPRaGzQup6zzhEl4H2iQlqYmTk9c38bL7IXW8ysxB4HJ42Mi3wOD2bnIUurVI0iuBuUHRigpTXh9QxZ0b7PY6t_An876dpWRb4JFzWyqL8C4EMHWiojtKVwLRofSN9ryophMeFePiWdKjmwppPv8gUFpWPry9ZD9p2IZ0-Te_El2R3sYrH68Lz6XjKa_lc8FZxQ_GRjQ2g7Pbhl3016myyOxKIKzGhrUpddPulth_ntXE9CKcsEw-H9duC_HOQaTQFavZQOWjXc-rpK-RHskLuXKvxAscfds532NOh_iCGxhyccnA status code: 403, request id: 1b830a25-9f50-4a57-9b0f-1e9963912c15

Warning ProvisioningFailed 11m failed to provision volume with StorageClass "gp2": rpc error: code = Internal desc = Could not create volume "pvc-49cb6ae6-0670-4c8f-a1ef-e71204cc3dac": could not create volume in EC2: UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: Vs3F3kocaEtejoASxVHvBdfXkEUJZH2O7KAZ6vqHGuFx4bLhhVoqAVMTqZNULNOdFq45787OwyMZ58QVHBfM5Z1sAOPpQ8x3zRgcEtre1zOvmJm5Oubjh5piGha2y6OXnI_51nZkCUXbVeI8W56S-yVJP3KFVjhdzJrEMucaEz7e61u-Frgo_SzoTg-s0o3x6n6bgCegbXuk-G12YFk1hSqaPqMl3ZxqmSAb-3SYSfm5ZJhOwg6Z85TZZ_0E30zRCGLOfKMKiY13Klu5kd-GafUg3qniCVcYRpECKb1GuxN4L8RVQhgPQkVTzLKkNw7qOgD25LeMhWzpzkSS3h5hC0xGHzqD9egpYvI3jqE5InK15BQ3ih-dLMNDpvLL_LOGBrm-ztp7Q59qplI5pcdioTK0RFfKs6jtVqPcTHcWvUvFK14Am0Phai56qbyAIM8lIhg2WNJiVVVpH6HalUaZgoHxZLiUOBAX49eq8QZz0Fa0mhRFZcjQkuhhs40MI_AS3r_nXMOkGXFxVt-S6kAw2MhyWVkysgqrArxaPMKewTXgbvgGPFNRMnN2UNbircrmuT5hWrRP-RRnjcSUQuJMzfEU0h5aiimfQwzRq-NPQwal9F6gqUeJ6ZMFofxsGacws2tqmmFklG6-F5ulHGxcM91Vyr3NmQ status code: 403, request id: 44c62642-4ee6-42ac-9dd4-35cf0db264cd
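The `ProvisioningFailed` events above suggest the pod is Pending because the volume could not be created: EC2 returned `UnauthorizedOperation` (HTTP 403), so the IAM identity used by the volume provisioner probably lacks a permission such as `ec2:CreateVolume`. That reading is an inference from the event text, not something confirmed in this thread. The encoded failure message can be decoded with `aws sts decode-authorization-message --encoded-message <value>`, provided the caller is allowed `sts:DecodeAuthorizationMessage`. Below is a minimal Python sketch that pulls the relevant fields out of the event text and builds that command. The names `parse_provisioning_failures` and `decode_command` are hypothetical, and the pattern is written only against the message layout shown above.

```python
import re

# Matches one ProvisioningFailed message in the layout shown in the events above.
EVENT_RE = re.compile(
    r'Could not create volume "(?P<volume>[^"]+)".*?'
    r'(?P<error>[A-Za-z]+): You are not authorized to perform this operation\.\s*'
    r'Encoded authorization failure message: (?P<encoded>[\w-]+)\s+'
    r'status code: (?P<status>\d+), request id: (?P<request_id>[0-9a-f-]+)',
    re.DOTALL,
)


def parse_provisioning_failures(text):
    """Return one dict per failure with volume, error, encoded, status, request_id."""
    return [match.groupdict() for match in EVENT_RE.finditer(text)]


def decode_command(encoded):
    """Build the AWS CLI argument list that decodes one failure message."""
    return ["aws", "sts", "decode-authorization-message", "--encoded-message", encoded]


if __name__ == "__main__":
    import sys

    # Example: kubectl describe pvc <name> | python this_script.py
    for failure in parse_provisioning_failures(sys.stdin.read()):
        print(failure["volume"], failure["error"], failure["status"])
        print(" ".join(decode_command(failure["encoded"])))
```

The decoded message normally names the denied action and the principal that made the call, which shows which role or user needs the missing permission.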

User2707 commented 8 months ago

Hi @dshamanthreddy

I have the same issue with the volume mount staying pending. Were you able to solve it?