jakubhajek / elasticsearch-docker-swarm

Elasticsearch Cluster on Docker swarm Cluster.

[o.e.d.z.UnicastZenPing ] [es-coordination] failed to resolve host [master3] #12

Open ravibhooshan opened 2 years ago

ravibhooshan commented 2 years ago

Hi, thanks for this repo. I am trying to build an ES 6.8.23 cluster on a 3-node Docker Swarm. I am following your code but cannot get it to run. I always get this error:

[2022-02-01T17:48:19,911][WARN ][o.e.d.z.ZenDiscovery ] [es-coordination] not enough master nodes discovered during pinging (found [[]], but needed [2]), pinging again
[2022-02-01T17:48:19,919][WARN ][o.e.d.z.UnicastZenPing ] [es-coordination] failed to resolve host [master1]

Below is my stack file. Can you please help me figure out this error?

version: "3.7"

services:
  coordination:
    image: XXXXXXXX.com:8444/elastic:main
    ulimits:
      memlock:
        soft: -1
        hard: -1
    healthcheck:
      test: curl -fs http://localhost:9200/_cat/health || exit 1
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 45s
    configs:

networks:
  esnet:
    driver: overlay
    attachable: true
    name: esnet
  proxy:
    driver: overlay
    name: proxy

volumes:
  esmaster1:
  esmaster2:
  esmaster3:

  esdata1:
  esdata2:
  esdata3:

configs:
  es-coordination:
    name: es-coordination
    file: es-config/es-coordination.yml
  es-master1:
    name: es-master1
    file: es-config/es-master1.yml
  es-master2:
    name: es-master2
    file: es-config/es-master2.yml
  es-master3:
    name: es-master3
    file: es-config/es-master3.yml

  es-data1:
    name: es-data1
    file: es-config/es-data1.yml
  es-data2:
    name: es-data2
    file: es-config/es-data2.yml
  es-data3:
    name: es-data3
    file: es-config/es-data3.yml
  es-data4:
    name: es-data4
    file: es-config/es-data4.yml

Sandeepbharmoria commented 2 years ago

Yes, I can confirm the same issue on my side. The Dockerfile had issues, which I fixed, but on the esnet network the master nodes are still unable to find each other using the names in the yml files.

gunman808 commented 2 months ago

Isn't it a bad idea to health-check the state of the cluster rather than the container itself? With this check the cluster can never start, because in Docker Swarm the name of a container only becomes resolvable once the container is in the "healthy" state. The container will never be healthy because the cluster will never be green, and the cluster will never be green because the required Elasticsearch nodes cannot be resolved. It's a deadlock. The cluster works fine if you remove the health check. It is not easy to come up with a good health check for this case.
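
One possible middle ground between the cluster-level check and no check at all is a check that only asks whether the node's own HTTP port answers. The fragment below is a minimal sketch, not a tested fix for this repo. It assumes curl is present in the image (the original check already relies on it) and that Elasticsearch listens on port 9200 as in the stack file above. Dropping `-f` means curl exits 0 as soon as the node returns any HTTP response, whatever the status code, so the result no longer depends on a master having been elected.

```yaml
services:
  coordination:
    healthcheck:
      # Node-level check (assumption: curl exists in the image).
      # Without -f, curl exits 0 on any HTTP response, including 503,
      # and fails only if nothing answers on port 9200.
      test: curl -s -o /dev/null http://localhost:9200/ || exit 1
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 45s
```

The timing values are copied from the original stack file. The same `test` line would have to be applied to the master and data services as well, which are not shown in the stack file posted above.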