elastic / cloud-on-k8s

Elastic Cloud on Kubernetes
Other
56 stars 707 forks source link

Default Elasticsearch Installation stuck on "readiness probe failed" #6345

Closed melvin-suter closed 1 year ago

melvin-suter commented 1 year ago

Bug Report

What did you do?

I installed ECK without any issues. (Kibana for instance works fine.) I used this config to create an elasticsearch instance:

apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elastic
spec:
  version: 8.6.0
  nodeSets:
  - name: default
    count: 1
    config:
      node.store.allow_mmap: false
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
        storageClassName: nfs-rwo

What did you expect to see?

A running and ready elasticsearch pod.

What did you see instead? Under which circumstances?

The pod starts and is shown as "running", but I can't seem to get it to work. The logs of the pod get filled with this entry:

{"timestamp": "2023-01-19T22:50:37+00:00", "message": "readiness probe failed", "curl_rc": "7"}

If I open a shell inside the pod I cant curl 172.0.0.1:9200 .

Environment

I'm running a RKE2 on premise cluster. v1.24.8+rke2r1

naemono commented 1 year ago

@melvin-suter It would be incredibly hard to troubleshoot why Elasticsearch isn't functioning in your specific environment without much more details, but the one thing that does stand out is storageClassName: nfs-rwo. Are you using an NFS backend to run Elasticsearch? What storage system is backing this?

If you could try and start a test cluster with nothing that couldn't be shared publicly, and run https://github.com/elastic/eck-diagnostics against it, and upload them here, we could try and assist further.

naemono commented 1 year ago

Also look through these old issues that could be relevant:

https://github.com/elastic/cloud-on-k8s/issues/3877 https://github.com/elastic/cloud-on-k8s/issues/3701 https://github.com/elastic/cloud-on-k8s/issues/3293

Do any of the initContainers show any errors?

barkbay commented 1 year ago

A few things odd in the logs:

I am closing this due to inactivity. If you want to reopen, could you verify that the manifest you provided is the one actually deployed? Thanks.