Closed hatdropper1977 closed 5 years ago
Some details:
The fact that I cannot name containers in swarm (this may be true of other orchestration frameworks) causes the following issues
+1 to this, I was expecting a configuration like this to work:
transport.host: 0.0.0.0
cluster.name: docker-test-cluster
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts:
- tasks.dev_elasticsearch
so far I see these logs instead:
[2017-08-01T04:56:18,540][INFO ][o.e.c.s.ClusterService ] [BsId4cc] removed {{BsId4cc}{BsId4cctQZOu9LWxcNzRXA}{Mk4RwWtaQCK7BYs_702PtQ}{10.0.0.2}{10.0.0.2:9300},}, added {{kAnIKWD}{kAnIKWD1QwSZV1iHDmK01w}{_nagtREAQPm5zM84DpJcog}{10.0.0.2}{10.0.0.2:9300},}, reason: zen-disco-elected-as-master ([1] nodes joined)[{kAnIKWD}{kAnIKWD1QwSZV1iHDmK01w}{_nagtREAQPm5zM84DpJcog}{10.0.0.2}{10.0.0.2:9300}]
[2017-08-01T04:56:18,541][WARN ][o.e.c.s.ClusterService ] [BsId4cc] failing [zen-disco-elected-as-master ([1] nodes joined)[{kAnIKWD}{kAnIKWD1QwSZV1iHDmK01w}{_nagtREAQPm5zM84DpJcog}{10.0.0.2}{10.0.0.2:9300}]]: failed to commit cluster state version [1]
org.elasticsearch.discovery.Discovery$FailedToCommitClusterStateException: unexpected error while preparing to publish
at org.elasticsearch.discovery.zen.PublishClusterStateAction.publish(PublishClusterStateAction.java:163) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.discovery.zen.ZenDiscovery.publish(ZenDiscovery.java:311) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.cluster.service.ClusterService.publishAndApplyChanges(ClusterService.java:741) [elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.cluster.service.ClusterService.runTasks(ClusterService.java:587) [elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.run(ClusterService.java:263) [elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) [elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) [elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:247) [elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:210) [elasticsearch-5.5.1.jar:5.5.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
Caused by: org.elasticsearch.discovery.Discovery$FailedToCommitClusterStateException: not enough masters to ack sent cluster state.[1] needed , have [0]
at org.elasticsearch.discovery.zen.PublishClusterStateAction$SendingController.<init>(PublishClusterStateAction.java:555) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.discovery.zen.PublishClusterStateAction$SendingController.<init>(PublishClusterStateAction.java:527) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.discovery.zen.PublishClusterStateAction.publish(PublishClusterStateAction.java:160) ~[elasticsearch-5.5.1.jar:5.5.1]
... 12 more
To reproduce: create the config file with the parameters at the top, then create the compose file service-compose.yml:
version: "3.3"
services:
elasticsearch:
image: elasticsearch:alpine
ports:
- "9200:9200"
- "9300:9300"
environment:
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
volumes:
- ./elasticsearch/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
networks:
- backend
deploy:
replicas: 3
networks:
backend:
deploy: docker stack deploy -c service-compose.yml dev
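To watch the failure repeat, the service logs can be tailed (a sketch, assuming the stack is deployed as dev so the service is named dev_elasticsearch):
docker service logs -f dev_elasticsearch
# look for "not enough masters to ack sent cluster state" repeating as tasks restart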
This is a problem with how ES detects local IPs: with a Docker Swarm VIP endpoint, ES picks up the VIP and uses it, which generates a conflict:
Caused by: java.lang.IllegalArgumentException: can't add node {H12KwKI}{H12KwKI7TUGB9qFBw7Uk_w}{ULNNEIrbS06Z5Z-CTDPWVQ}{10.0.0.2}{10.0.0.2:9300}{ml.enabled=true}, found existing node {DpGAJ1-}{DpGAJ1-qTdGqD27LRJJumA}{tLia8vwBRlWXMgVa-z5phw}{10.0.0.2}{10.0.0.2:9300}{ml.enabled=true} with same address
Different node IDs end up advertising the same IP. Interface detection inside the container gives:
eth0
inet 10.0.0.2 netmask:255.255.255.0 broadcast:0.0.0.0 scope:site
inet 10.0.0.4 netmask:255.255.255.0 broadcast:0.0.0.0 scope:site
hardware 02:42:0A:00:00:04
UP MULTICAST mtu:1450 index:25
This happens because Docker Swarm adds the VIP locally for routing purposes. Below is a setup that works for bringing up a cluster; the downside is that you have to point a load balancer at port 9200 on the manager nodes yourself to reach ES from outside.
version: '3.3'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:5.5.1
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=false
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
      - "discovery.zen.ping.unicast.hosts=tasks.elasticsearch"
    deploy:
      endpoint_mode: dnsrr
      replicas: 3
      resources:
        limits:
          memory: 1G
        reservations:
          memory: 512M
    volumes:
      - esdata:/usr/share/elasticsearch/data
    networks:
      - esnet
  kibana:
    image: docker.elastic.co/kibana/kibana:5.5.1
    environment:
      ELASTICSEARCH_URL: http://elasticsearch:9200
    networks:
      - esnet
    ports:
      - 5601:5601
volumes:
  esdata:
    driver: local
networks:
  esnet:
Hope this helps. I could not (in a very quick search) find a way to override ES's network configuration detection.
Edit: endpoint_mode needs compose file version 3.3, which means Docker 17.06.0+.
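A quick way to confirm the daemon is new enough (just a convenience check using standard docker CLI templating):
docker version --format '{{.Server.Version}}'   # should print 17.06.0 or later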
In summary, it would be a useful Elasticsearch feature if you could just auto-detect other Elasticsearch node containers on an overlay network.
@muresan - Real fast, can you explain what you mean by:
you will have to point a LB yourself at 9200 on the manager nodes to get to ES (from outside).
@hatdropper1977 it means that you have to front the 3 ES replicas with a load balancer that round-robins requests across the 3 instances. Because there is no VIP, you have 3 ES instances with 3 IPs. If you point clients at just one of them, all requests go to that single IP; to spread the load among the 3, you can for example put HAProxy or nginx in front to distribute requests to the 3 ES backends. This is an example: https://sematext.com/blog/2016/12/12/docker-elasticsearch-swarm/
@fcrisciani In the Sematext example... Will this work with X-Pack/TLS? Can the ES nodes communicate with each other using the 162.243.255.10.xip.io address? Or do they use the new, local, overlay addresses? Or the ephemeral container names? In other words, what do we put in the TLS cert's SAN? External clients will reach the nodes via the 162 address, so that part is obvious. But how do the intra-overlay ES nodes communicate with each other?
@hatdropper1977 the disclaimer is that I did not try that example myself. It was more to show you the idea of having a frontend container. I'm also not familiar with X-Pack etc., but I expect it to work if nginx is properly configured; in the end it only acts as a proxy. As for the communication between ES instances, that happens on the Docker overlay backend (which you can configure as encrypted).
@fcrisciani Thanks!
It appears that the Elasticsearch ecosystem (X-Pack security, etc.) doesn't play nicely with swarm, and if I want to use swarm I'll need to swim against the current a bit. The hack I mentioned in the original comment would work, as would the Sematext approach. I would prefer native support, but I see the reasons why that doesn't work.
@hatdropper1977 there is a way already: they are listed in DNS under tasks.<servicename>, tasks.<stack_servicename>, <servicename> or <stack_servicename>. Depending on the endpoint_mode you end up with different results. The problem is that with the VIP endpoint, ES will inspect the container, say "hey, there are 2 IPs on eth0, let's use the 1st", and that's the VIP; it then advertises it to the other members, which use it to talk to ... themselves over the swarm balancer. I found no way to tell ES to use the 2nd IP and no way to get info from the interface to identify the VIP; you can heuristically assume that the 1st available IP address is used for the VIP (but that is not 100% guaranteed), so you could say "let's use the greater IP address number". The lookups below show the difference.
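To see the difference between those DNS names, you can resolve them from inside one of the task containers. A sketch assuming the stack from above (stack dev, service elasticsearch) and that nslookup is available in the image (it is in the busybox/alpine base); the container name placeholder is hypothetical:
# one A record per task (the IPs ES should be using)
docker exec -it <any-es-container> nslookup tasks.dev_elasticsearch
# a single A record: the VIP (with the default endpoint_mode: vip)
docker exec -it <any-es-container> nslookup dev_elasticsearch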
useful Elasticsearch feature if you could just auto-detect other Elasticsearch node containers on an overlay network
@hatdropper1977 @muresan with this change in docker: https://github.com/docker/libnetwork/pull/1877 I was able to spawn a cluster and scale it up and down.
docker compose:
version: "3.3"
services:
elasticsearch:
image: elasticsearch:alpine
ports:
- "9200:9200"
- "9300:9300"
environment:
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
volumes:
- ./elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
networks:
- backend
deploy:
replicas: 3
kibana:
image: kibana
ports:
- "5601:5601"
networks:
- backend
networks:
backend:
attachable: true
elasticsearch.yml:
network.host: _eth0:ipv4_
cluster.name: docker-test-cluster
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.unicast.hosts:
- tasks.dev_elasticsearch
The change by itself simply moves the VIP onto the loopback interface, but the important thing is still to use network.host: _eth0:ipv4_. I have the feeling that Elasticsearch uses the lowest IP as the ID, so without that setting, with network.host: 0.0.0.0, in my case they were still all clashing on the same IP.
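One way to check what this buys you, sketched under the assumption that the patched libnetwork is in place, the stack above is running, and the ip utility is present in the image:
# with the patched libnetwork the VIP sits on lo, so eth0 carries only the task IP
docker exec -it <any-es-container> ip -4 addr show eth0
docker exec -it <any-es-container> ip -4 addr show lo
# and the address ES actually advertises for transport/http (9200 is published to the host above):
curl -s 'http://localhost:9200/_nodes/_local/transport,http?pretty' | grep publish_address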
@fcrisciani that is great! You can still have problems if you use multiple networks, because there's no guarantee that eth0 maps to the one you want. I've tested with 2 networks and got eth0 and eth2, and eth2 was the one I wanted. If Docker could rename the container interfaces to match the network names that would solve this, but moving the VIP to loopback is a big step forward.
Edit: the 2-network use case is when you use network.bind_host for the API front (9200) and network.publish_host for the cluster gossip/traffic (9300); see the sketch below.
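A minimal elasticsearch.yml sketch of that split, assuming the HTTP-facing network happens to land on eth1 and the cluster network on eth0 (as noted above, the interface-to-network mapping is not guaranteed); http.publish_host and transport.publish_host are the per-protocol variants of the publish setting:
network.bind_host: 0.0.0.0
# REST API (9200) advertised on the interface attached to the frontend network
http.publish_host: _eth1:ipv4_
# transport / cluster traffic (9300) advertised on the backend overlay
transport.publish_host: _eth0:ipv4_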
I've been playing around with getting ES to work on swarm and have come up with a little POC.
There are a few things here of note.
The coordinating node exposes port 9200. Since I've been testing this on a single-node swarm, only one node can expose port 9200 to the host, so I decided to make that a coordinating node so that it's small and acts kind of like a reverse proxy.
Using dnsrr: with dnsrr, swarm routes requests directly to the container IP as opposed to the virtual IP. I was noticing that with the virtual IPs, ES nodes were struggling to discover each other because of the multiple IPs per container.
The coordinating node is set as global so that each host in the swarm cluster exposes 9200 only once.
This does not solve anything regarding storage; it is just a POC for the networking and independent scaling of ES nodes as services in swarm.
I have only tested this locally on my laptop running docker 17.06-ce
how to use (obviously):
docker swarm init
docker stack deploy -c es.yml es
http://localhost:9200/_cat/nodes?v
docker service scale es_data=3
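A quick sanity check after scaling (a sketch; number_of_nodes should track the total task count across the three services):
curl -s 'http://localhost:9200/_cluster/health?pretty' | grep -E 'status|number_of_nodes'
docker service ls
The es.yml referenced above follows.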
version: "3.3"
services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:5.5.1
environment:
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
- "discovery.zen.minimum_master_nodes=2"
- "discovery.zen.ping.unicast.hosts=master"
- "node.master=false"
- "node.data=false"
- "node.ingest=false"
networks:
- esnet
ports:
- target: 9200
published: 9200
protocol: tcp
mode: host
deploy:
endpoint_mode: dnsrr
mode: 'global'
resources:
limits:
memory: 1G
ulimits:
memlock:
soft: -1
hard: -1
master:
image: docker.elastic.co/elasticsearch/elasticsearch:5.5.1
environment:
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
- "discovery.zen.minimum_master_nodes=2"
- "discovery.zen.ping.unicast.hosts=master"
- "node.master=true"
- "node.data=false"
- "node.ingest=false"
networks:
- esnet
deploy:
endpoint_mode: dnsrr
mode: 'replicated'
replicas: 3
resources:
limits:
memory: 1G
ulimits:
memlock:
soft: -1
hard: -1
data:
image: docker.elastic.co/elasticsearch/elasticsearch:5.5.1
environment:
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
- "discovery.zen.minimum_master_nodes=2"
- "discovery.zen.ping.unicast.hosts=master"
- "node.master=false"
- "node.data=true"
- "node.ingest=false"
networks:
- esnet
deploy:
endpoint_mode: dnsrr
mode: 'replicated'
replicas: 1
resources:
limits:
memory: 1G
ulimits:
memlock:
soft: -1
hard: -1
networks:
esnet:
driver: overlay
@fcrisciani That's great!!!
Any idea if X-Pack (specifically TLS) can play nicely with this fix or is this just for vanilla HTTP?
I'm using 3 services running on 3 nodes (1 manager and 2 workers) to avoid the VIP issue, but I'm facing another problem.
docker-stack-es.yml
version: '3.2'
services:
  elasticsearch1:
    image: cbb/elasticsearch:5.5.0
    environment:
      ES_JAVA_OPTS: '-Xms256m -Xmx256m'
      cluster.name: es-cluster
      node.name: es1
      network.bind_host: 0.0.0.0
      discovery.zen.minimum_master_nodes: 2
      discovery.zen.ping.unicast.hosts: tasks.elasticsearch2,tasks.elasticsearch3
      xpack.security.enabled: 'false'
      xpack.monitoring.enabled: 'false'
      xpack.watcher.enabled: 'false'
      xpack.ml.enabled: 'false'
      http.cors.enabled: 'true'
      http.cors.allow-origin: '*'
      logger.level: debug
    volumes:
      - $VPATH/data/elasticsearch:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
      - 9300:9300
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.no == 1
      resources:
        limits:
          memory: 1g
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
      nproc:
        soft: 65536
        hard: 65536
  elasticsearch2:
    image: cbb/elasticsearch:5.5.0
    environment:
      ES_JAVA_OPTS: '-Xms256m -Xmx256m'
      cluster.name: es-cluster
      node.name: es2
      network.bind_host: 0.0.0.0
      discovery.zen.minimum_master_nodes: 2
      discovery.zen.ping.unicast.hosts: tasks.elasticsearch1,tasks.elasticsearch3
      xpack.security.enabled: 'false'
      xpack.monitoring.enabled: 'false'
      xpack.watcher.enabled: 'false'
      xpack.ml.enabled: 'false'
      http.cors.enabled: 'true'
      http.cors.allow-origin: '*'
      logger.level: debug
    volumes:
      - $VPATH/data/elasticsearch:/usr/share/elasticsearch/data
    ports:
      - 9201:9200
      - 9301:9300
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.no == 2
      resources:
        limits:
          memory: 1g
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
      nproc:
        soft: 65536
        hard: 65536
  elasticsearch3:
    image: cbb/elasticsearch:5.5.0
    environment:
      ES_JAVA_OPTS: '-Xms256m -Xmx256m'
      cluster.name: es-cluster
      node.name: es3
      network.bind_host: 0.0.0.0
      discovery.zen.minimum_master_nodes: 2
      discovery.zen.ping.unicast.hosts: tasks.elasticsearch1,tasks.elasticsearch2
      xpack.security.enabled: 'false'
      xpack.monitoring.enabled: 'false'
      xpack.watcher.enabled: 'false'
      xpack.ml.enabled: 'false'
      http.cors.enabled: 'true'
      http.cors.allow-origin: '*'
      logger.level: debug
    volumes:
      - $VPATH/data/elasticsearch:/usr/share/elasticsearch/data
    ports:
      - 9202:9200
      - 9302:9300
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.labels.no == 3
      resources:
        limits:
          memory: 1g
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        soft: 65536
        hard: 65536
      nproc:
        soft: 65536
        hard: 65536
networks:
  default:
    external:
      name: myoverlay
cbb/elasticsearch:5.5.0 is just a docker tag of official docker.elastic.co/elasticsearch/elasticsearch:5.5.0.
docker stack deploy -c ./docker-stack-es.yml es. Everything goes well, and /_cat/nodes?v is OK too.
Notice that the cluster node IPs are still the VIPs even though I used tasks.ServiceName for zen discovery. The actual task IPs are 10.10.0.3, 10.10.0.5, 10.10.0.7 (see the inspect commands after the table below).
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.10.0.6 27 93 5 0.15 0.06 0.01 mdi * es3
10.10.0.2 38 93 2 0.02 0.01 0.00 mdi - es1
10.10.0.4 33 94 2 0.04 0.04 0.06 mdi - es2
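For reference, the VIP vs. task IP split can be checked from the swarm side; a sketch using the service/network names from the stack above (.Endpoint.VirtualIPs and .Containers are standard docker inspect fields):
docker service inspect es_elasticsearch1 --format '{{json .Endpoint.VirtualIPs}}'
docker network inspect myoverlay --format '{{json .Containers}}'   # task IPs of containers on this node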
But after several minutes, running docker exec -it container curl http://localhost:9200/_cat/nodes?v is sometimes OK and sometimes not.
node1:
docker@node1:/Users/cbb/Dropbox/docker/sh$ docker exec -it es_elasticsearch1.1.obxncvu85mchi8tv14hhwv7aw curl http://localhost:9200/_cat/nodes?v
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.10.0.4 29 92 0 0.00 0.03 0.14 mdi - es2
10.10.0.2 30 69 3 0.00 0.01 0.13 mdi - es1
10.10.0.6 28 91 2 0.00 0.02 0.12 mdi * es3
docker@node1:/Users/cbb/Dropbox/docker/sh$ docker exec -it es_elasticsearch1.1.obxncvu85mchi8tv14hhwv7aw curl http://localhost:9200/_cat/nodes?v
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.10.0.4 mdi - es2
10.10.0.2 41 92 22 0.23 0.49 0.30 mdi - es1
10.10.0.6 47 92 25 0.60 0.61 0.35 mdi * es3
docker@node1:/Users/cbb/Dropbox/docker/sh$ docker exec -it es_elasticsearch1.1.obxncvu85mchi8tv14hhwv7aw curl http://localhost:9200/_cat/nodes?v
{"error":{"root_cause":[{"type":"node_disconnected_exception","reason":"[es3][10.10.0.6:9300][cluster:monitor/state] disconnected"}],"type":"master_not_discovered_exception","reason":"NodeDisconnectedException[[es3][10.10.0.6:9300][cluster:monitor/state] disconnected]","caused_by":{"type":"node_disconnected_exception","reason":"[es3][10.10.0.6:9300][cluster:modocker@node1:/Users/cbb/Dropbox/docker/sh$ docker exec -it es_elasticsearch1.1.obxncvu85mchi8tv14hhwv7aw curl http://localhost:9200/_cat/nodes?v
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}
docker@node1:/Users/cbb/Dropbox/docker/sh$ docker exec -it es_elasticsearch1.1.obxncvu85mchi8tv14hhwv7aw curl http://localhost:9200/_cat/nodes?v
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.10.0.4 45 92 1 0.00 0.01 0.00 mdi - es2
10.10.0.2 48 94 1 0.00 0.01 0.00 mdi - es1
10.10.0.6 40 93 1 0.03 0.03 0.00 mdi * es3
10.10.0.4 mdi - es2
10.10.0.2 41 92 22 0.23 0.49 0.30 mdi - es1
10.10.0.6 47 92 25 0.60 0.61 0.35 mdi * es3
there are some blanks after 10.10.0.4
node2:
docker@node2:~$ docker exec -it es_elasticsearch2.1.uqeo3p66yt5pcm7ytsgfn5rqa curl http://localhost:9200/_cat/nodes?v
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.10.0.6 28 91 0 0.00 0.02 0.11 mdi * es3
10.10.0.4 29 93 0 0.00 0.02 0.13 mdi - es2
10.10.0.2 31 69 1 0.00 0.01 0.12 mdi - es1
docker@node2:~$ docker exec -it es_elasticsearch2.1.uqeo3p66yt5pcm7ytsgfn5rqa curl http://localhost:9200/_cat/nodes?v
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.10.0.6 50 92 2 0.08 0.40 0.31 mdi * es3
10.10.0.4 42 93 18 0.12 0.36 0.28 mdi - es2
10.10.0.2 mdi - es1
docker@node2:~$ docker exec -it es_elasticsearch2.1.uqeo3p66yt5pcm7ytsgfn5rqa curl http://localhost:9200/_cat/nodes?v
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.10.0.6 mdi * es3
10.10.0.4 39 93 0 0.08 0.04 0.03 mdi - es2
10.10.0.2 mdi - es1
there are some blanks after 10.10.0.6 and 10.10.0.2.
node3:
docker@node3:~$ docker exec -it es_elasticsearch3.1.axcjyne2xda88t5y9owxgs2oz curl http://localhost:9200/_cat/nodes?v
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.10.0.6 36 91 13 0.01 0.02 0.08 mdi * es3
10.10.0.2 35 69 1 0.01 0.01 0.08 mdi - es1
10.10.0.4 33 93 0 0.05 0.02 0.09 mdi - es2
docker@node3:~$ docker exec -it es_elasticsearch3.1.axcjyne2xda88t5y9owxgs2oz curl http://localhost:9200/_cat/nodes?v
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":null}],"type":"master_not_discovered_exception","reason":null},"status":503}docker@node3:~$
docker@node3:~$ docker exec -it es_elasticsearch3.1.axcjyne2xda88t5y9owxgs2oz curl http://localhost:9200/_cat/nodes?v
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.10.0.6 47 93 1 0.00 0.00 0.00 mdi * es3
10.10.0.2 34 93 1 0.08 0.04 0.02 mdi - es1
10.10.0.4 29 93 1 0.11 0.08 0.07 mdi - es2
Another logstash service using the es output plugin on the same overlay network has some repeated logs like this:
logstash_logstash.0.96vmfjsm9w9g@node1 | [2017-08-04T08:44:06,387][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got an error. {:url=>#<Java::JavaNet::URI:0xe845a46>, :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::BadResponseCodeError, :error=>"Got response code '503' contacting Elasticsearch at URL 'http://10.10.0.4:9200/'"}
logstash_logstash.0.96vmfjsm9w9g@node1 | [2017-08-04T08:44:07,357][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://10.10.0.4:9200/, :path=>"/"}
logstash_logstash.0.96vmfjsm9w9g@node1 | [2017-08-04T08:44:07,362][WARN ][logstash.outputs.elasticsearch] Attempted to resurrect connection to dead ES instance, but got an error. {:url=>#<Java::JavaNet::URI:0xe845a46>, :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::BadResponseCodeError, :error=>"Got response code '503' contacting Elasticsearch at URL 'http://10.10.0.4:9200/'"}
logstash_logstash.0.96vmfjsm9w9g@node1 | [2017-08-04T08:44:11,389][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://10.10.0.4:9200/, :path=>"/"}
logstash_logstash.0.96vmfjsm9w9g@node1 | [2017-08-04T08:44:11,396][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>#<Java::JavaNet::URI:0xe845a46>}
logstash_logstash.0.96vmfjsm9w9g@node1 | [2017-08-04T09:19:21,735][WARN ][logstash.outputs.elasticsearch] Marking url as dead. Last error: [LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError] Elasticsearch Unreachable: [http://10.10.0.4:9200/][Manticore::SocketTimeout] Read timed out {:url=>http://10.10.0.4:9200/, :error_message=>"Elasticsearch Unreachable: [http://10.10.0.4:9200/][Manticore::SocketTimeout] Read timed out", :error_class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError"}
logstash_logstash.0.96vmfjsm9w9g@node1 | [2017-08-04T09:19:21,735][WARN ][logstash.outputs.elasticsearch] Error while performing sniffing {:error_message=>"Elasticsearch Unreachable: [http://10.10.0.4:9200/][Manticore::SocketTimeout] Read timed out", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError", :backtrace=>["/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.3.6-java/lib/logstash/outputs/elasticsearch/http_client/pool.rb:271:in `perform_request_to_url'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.3.6-java/lib/logstash/outputs/elasticsearch/http_client/pool.rb:269:in `perform_request_to_url'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.3.6-java/lib/logstash/outputs/elasticsearch/http_client/pool.rb:257:in `perform_request'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.3.6-java/lib/logstash/outputs/elasticsearch/http_client/pool.rb:347:in `with_connection'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.3.6-java/lib/logstash/outputs/elasticsearch/http_client/pool.rb:256:in `perform_request'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.3.6-java/lib/logstash/outputs/elasticsearch/http_client/pool.rb:157:in `check_sniff'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.3.6-java/lib/logstash/outputs/elasticsearch/http_client/pool.rb:150:in `sniff!'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.3.6-java/lib/logstash/outputs/elasticsearch/http_client/pool.rb:139:in `start_sniffer'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.3.6-java/lib/logstash/outputs/elasticsearch/http_client/pool.rb:121:in `until_stopped'", "/usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-7.3.6-java/lib/logstash/outputs/elasticsearch/http_client/pool.rb:137:in `start_sniffer'"]}
logstash_logstash.0.96vmfjsm9w9g@node1 | [2017-08-04T09:19:21,750][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://10.10.0.4:9200/, :path=>"/"}
logstash_logstash.0.96vmfjsm9w9g@node1 | [2017-08-04T09:19:22,196][INFO ][logstash.outputs.elasticsearch] Running health check to see if an Elasticsearch connection is working {:healthcheck_url=>http://10.10.0.4:9200/, :path=>"/"}
logstash_logstash.0.96vmfjsm9w9g@node1 | [2017-08-04T09:19:22,200][WARN ][logstash.outputs.elasticsearch] Restored connection to ES instance {:url=>#<Java::JavaNet::URI:0xe845a46>}
@realcbb switch to using endpoint_mode: dnsrr because you do not need the swarm load balancer for 1 task; then the proper IPs should show.
@muresan Yes, I could use dnsrr mode. But VIP mode is not supposed to be the problem in my case, right?
@realcbb VIP is the problem: docker adds it to the same network interface, and ES sees it and uses it as the advertised outgoing IP. There's a patch a few comments up from @fcrisciani that moves the IP to loopback, which solves this problem. Well, VIP mode causes the VIP IPs to show up; I'm not sure what causes the rest. Also, since your YAML was quite long, here's a way to shorten it using anchors: https://gist.github.com/muresan/c2b21e0e2d5cc68bc1bce43c6e69e957
Thanks. The VIP IPs are indeed because of VIP mode. I mean that even though discovery.zen.ping.unicast.hosts is set with tasks.serviceName, ES still uses the VIP as the advertised host, so that setting in fact does not matter.
And even if ES uses the VIP as the advertised host, why does the cluster get the error in my case?
@realcbb I see you have debug enabled (logger.level: debug); maybe you can find more info in the logs (docker service logs <servicename>).
I removed the ES stack, deleted the contents of the ES data folder on each node, and then redeployed the stack. Everything went well, as before.
ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
10.10.0.6 28 81 1 0.02 0.06 0.07 mdi - es3
10.10.0.2 42 74 4 0.00 0.02 0.02 mdi * es1
10.10.0.4 28 81 1 0.00 0.00 0.00 mdi - es2
After some minutes, I caught some bad ES logs just when I ran docker stack deploy ./docker-stack-logstash logstash.
These logs might be too long...
node1:
[2017-08-04T15:10:37,013][DEBUG][o.e.c.s.ClusterService ] [es1] processing [create-index-template [logstash], cause [api]]: execute
[2017-08-04T15:10:37,029][DEBUG][o.e.i.IndicesService ] [es1] creating Index [[uQngaeCgT2iu0sWTxdkzPg/UPnWk2rxSxGjZ7dN3DvEOA]], shards [1]/[0] - reason [create index]
[2017-08-04T15:10:37,067][DEBUG][o.e.i.s.IndexStore ] [es1] [uQngaeCgT2iu0sWTxdkzPg] using index.store.throttle.type [NONE], with index.store.throttle.max_bytes_per_sec [null]
[2017-08-04T15:10:37,173][DEBUG][o.e.i.m.MapperService ] [es1] [uQngaeCgT2iu0sWTxdkzPg] using dynamic[true]
[2017-08-04T15:10:37,294][WARN ][o.e.d.i.m.TypeParsers ] field [include_in_all] is deprecated, as [_all] is deprecated, and will be disallowed in 6.0, use [copy_to] instead.
[2017-08-04T15:10:37,381][WARN ][o.e.d.i.m.TypeParsers ] field [include_in_all] is deprecated, as [_all] is deprecated, and will be disallowed in 6.0, use [copy_to] instead.
[2017-08-04T15:10:37,540][DEBUG][o.e.i.IndicesService ] [es1] [uQngaeCgT2iu0sWTxdkzPg] closing ... (reason [NO_LONGER_ASSIGNED])
[2017-08-04T15:10:37,540][DEBUG][o.e.i.IndicesService ] [es1] [uQngaeCgT2iu0sWTxdkzPg/UPnWk2rxSxGjZ7dN3DvEOA] closing index service (reason [NO_LONGER_ASSIGNED][ created for parsing template mapping])
[2017-08-04T15:10:37,540][DEBUG][o.e.i.c.b.BitsetFilterCache] [es1] [uQngaeCgT2iu0sWTxdkzPg] clearing all bitsets because [close]
[2017-08-04T15:10:37,544][DEBUG][o.e.i.c.q.IndexQueryCache] [es1] [uQngaeCgT2iu0sWTxdkzPg] full cache clear, reason [close]
[2017-08-04T15:10:37,545][DEBUG][o.e.i.c.b.BitsetFilterCache] [es1] [uQngaeCgT2iu0sWTxdkzPg] clearing all bitsets because [close]
[2017-08-04T15:10:37,550][DEBUG][o.e.i.IndicesService ] [es1] [uQngaeCgT2iu0sWTxdkzPg/UPnWk2rxSxGjZ7dN3DvEOA] closed... (reason [NO_LONGER_ASSIGNED][ created for parsing template mapping])
[2017-08-04T15:10:37,550][DEBUG][o.e.c.s.ClusterService ] [es1] cluster state updated, version [5], source [create-index-template [logstash], cause [api]]
[2017-08-04T15:10:37,550][DEBUG][o.e.c.s.ClusterService ] [es1] publishing cluster state version [5]
[2017-08-04T15:11:07,556][DEBUG][o.e.d.z.ZenDiscovery ] [es1] failed to publish cluster state version [5] (not enough nodes acknowledged, min master nodes [2])
[2017-08-04T15:11:07,559][WARN ][o.e.c.s.ClusterService ] [es1] failing [create-index-template [logstash], cause [api]]: failed to commit cluster state version [5]
org.elasticsearch.discovery.Discovery$FailedToCommitClusterStateException: timed out while waiting for enough masters to ack sent cluster state. [1] left
at org.elasticsearch.discovery.zen.PublishClusterStateAction$SendingController.waitForCommit(PublishClusterStateAction.java:574) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.PublishClusterStateAction.innerPublish(PublishClusterStateAction.java:202) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.PublishClusterStateAction.publish(PublishClusterStateAction.java:167) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.ZenDiscovery.publish(ZenDiscovery.java:311) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService.publishAndApplyChanges(ClusterService.java:741) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService.runTasks(ClusterService.java:587) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.run(ClusterService.java:263) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:247) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:210) [elasticsearch-5.5.0.jar:5.5.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
[2017-08-04T15:11:07,569][DEBUG][o.e.a.a.i.t.p.TransportPutIndexTemplateAction] [es1] failed to put template [logstash]
org.elasticsearch.discovery.Discovery$FailedToCommitClusterStateException: timed out while waiting for enough masters to ack sent cluster state. [1] left
at org.elasticsearch.discovery.zen.PublishClusterStateAction$SendingController.waitForCommit(PublishClusterStateAction.java:574) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.PublishClusterStateAction.innerPublish(PublishClusterStateAction.java:202) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.PublishClusterStateAction.publish(PublishClusterStateAction.java:167) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.ZenDiscovery.publish(ZenDiscovery.java:311) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService.publishAndApplyChanges(ClusterService.java:741) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService.runTasks(ClusterService.java:587) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.run(ClusterService.java:263) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:247) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:210) [elasticsearch-5.5.0.jar:5.5.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
[2017-08-04T15:11:07,569][DEBUG][o.e.a.a.i.t.p.TransportPutIndexTemplateAction] [es1] master could not publish cluster state or stepped down before publishing action [indices:admin/template/put], scheduling a retry
org.elasticsearch.discovery.Discovery$FailedToCommitClusterStateException: timed out while waiting for enough masters to ack sent cluster state. [1] left
at org.elasticsearch.discovery.zen.PublishClusterStateAction$SendingController.waitForCommit(PublishClusterStateAction.java:574) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.PublishClusterStateAction.innerPublish(PublishClusterStateAction.java:202) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.PublishClusterStateAction.publish(PublishClusterStateAction.java:167) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.ZenDiscovery.publish(ZenDiscovery.java:311) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService.publishAndApplyChanges(ClusterService.java:741) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService.runTasks(ClusterService.java:587) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.run(ClusterService.java:263) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:247) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:210) [elasticsearch-5.5.0.jar:5.5.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
[2017-08-04T15:11:07,579][DEBUG][o.e.a.a.i.t.p.TransportPutIndexTemplateAction] [es1] timed out while retrying [indices:admin/template/put] after failure (timeout [30s])
org.elasticsearch.discovery.Discovery$FailedToCommitClusterStateException: timed out while waiting for enough masters to ack sent cluster state. [1] left
at org.elasticsearch.discovery.zen.PublishClusterStateAction$SendingController.waitForCommit(PublishClusterStateAction.java:574) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.PublishClusterStateAction.innerPublish(PublishClusterStateAction.java:202) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.PublishClusterStateAction.publish(PublishClusterStateAction.java:167) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.ZenDiscovery.publish(ZenDiscovery.java:311) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService.publishAndApplyChanges(ClusterService.java:741) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService.runTasks(ClusterService.java:587) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.run(ClusterService.java:263) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:247) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:210) [elasticsearch-5.5.0.jar:5.5.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
[2017-08-04T15:11:07,587][DEBUG][o.e.c.s.ClusterService ] [es1] processing [create-index-template [logstash], cause [api]]: took [30.5s] done applying updated cluster_state (version: 5, uuid: 44Lg-3eWQw6oN1lF6kfbBQ)
[2017-08-04T15:11:07,588][WARN ][o.e.c.s.ClusterService ] [es1] cluster state update task [create-index-template [logstash], cause [api]] took [30.5s] above the warn threshold of 30s
[2017-08-04T15:11:07,588][DEBUG][o.e.c.s.ClusterService ] [es1] processing [zen-disco-failed-to-publish]: execute
[2017-08-04T15:11:07,588][WARN ][o.e.d.z.ZenDiscovery ] [es1] zen-disco-failed-to-publish, current nodes: nodes:
{es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300}
{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300}, local, master
{es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300}
[2017-08-04T15:11:07,589][DEBUG][o.e.c.s.ClusterService ] [es1] cluster state updated, version [4], source [zen-disco-failed-to-publish]
[2017-08-04T15:11:07,589][DEBUG][o.e.c.s.ClusterService ] [es1] applying cluster state version 4
[2017-08-04T15:11:07,589][DEBUG][o.e.c.s.ClusterService ] [es1] set local cluster state to version 4
[2017-08-04T15:11:07,590][DEBUG][o.e.l.LicenseService ] [es1] previous [{"uid":"4c1ea454-031b-4e1e-b50d-676cfc012f3b","type":"trial","issue_date_in_millis":1501856084707,"expiry_date_in_millis":1504448084707,"max_nodes":1000,"issued_to":"es-cluster","issuer":"elasticsearch","signature":"/////QAAAPCynqArHS76IEhLjg3dxaWbsDzKiSuaTSTaaq/ecm9rpGnvztb1ERevKoo2hnRTeuo074GopHnZNWoR80gyrvZlbXCxzq8YTt+zbs+ld5OxOZU+tz264/0dTZGpm4bAgx4mb7hPeKVPYXZ/WH6t088uGgJh8Y84T376tXpGlIHBpGEoZ/A0gToEBPCBBz5wqs2itiioE8Of+S/U17Iy9J24bgSV1UGq/dAS2vGxtwmDloQ+vq5NTkXKkegGGm5Bb5wbkxsS5nIJq9Y9pdJmFYSE2zmdNz52OZOm0UVf1gW7T8/JptXAkVmEQCbGMkcz7BA=","start_date_in_millis":-1}]
[2017-08-04T15:11:07,593][DEBUG][o.e.l.LicenseService ] [es1] current [{"uid":"4c1ea454-031b-4e1e-b50d-676cfc012f3b","type":"trial","issue_date_in_millis":1501856084707,"expiry_date_in_millis":1504448084707,"max_nodes":1000,"issued_to":"es-cluster","issuer":"elasticsearch","signature":"/////QAAAPCynqArHS76IEhLjg3dxaWbsDzKiSuaTSTaaq/ecm9rpGnvztb1ERevKoo2hnRTeuo074GopHnZNWoR80gyrvZlbXCxzq8YTt+zbs+ld5OxOZU+tz264/0dTZGpm4bAgx4mb7hPeKVPYXZ/WH6t088uGgJh8Y84T376tXpGlIHBpGEoZ/A0gToEBPCBBz5wqs2itiioE8Of+S/U17Iy9J24bgSV1UGq/dAS2vGxtwmDloQ+vq5NTkXKkegGGm5Bb5wbkxsS5nIJq9Y9pdJmFYSE2zmdNz52OZOm0UVf1gW7T8/JptXAkVmEQCbGMkcz7BA=","start_date_in_millis":-1}]
[2017-08-04T15:11:07,596][DEBUG][o.e.c.s.ClusterService ] [es1] processing [zen-disco-failed-to-publish]: took [7ms] done applying updated cluster_state (version: 4, uuid: wa16gWAEQ76p_bCu7rdugQ)
[2017-08-04T15:11:07,600][DEBUG][o.e.c.s.ClusterService ] [es1] processing [create-index-template [logstash], cause [api]]: execute
[2017-08-04T15:11:07,605][DEBUG][o.e.c.s.ClusterService ] [es1] failing [create-index-template [logstash], cause [api]]: local node is no longer master
[2017-08-04T15:11:07,611][DEBUG][o.e.a.a.i.t.p.TransportPutIndexTemplateAction] [es1] failed to put template [logstash]
org.elasticsearch.cluster.NotMasterException: no longer master. source: [create-index-template [logstash], cause [api]]
[2017-08-04T15:11:07,615][DEBUG][o.e.a.a.i.t.p.TransportPutIndexTemplateAction] [es1] master could not publish cluster state or stepped down before publishing action [indices:admin/template/put], scheduling a retry
org.elasticsearch.cluster.NotMasterException: no longer master. source: [create-index-template [logstash], cause [api]]
[2017-08-04T15:11:07,627][DEBUG][o.e.c.s.ClusterService ] [es1] processing [create-index-template [logstash], cause [api]]: execute
[2017-08-04T15:11:07,627][DEBUG][o.e.c.s.ClusterService ] [es1] failing [create-index-template [logstash], cause [api]]: local node is no longer master
[2017-08-04T15:11:07,627][DEBUG][o.e.a.a.i.t.p.TransportPutIndexTemplateAction] [es1] failed to put template [logstash]
org.elasticsearch.cluster.NotMasterException: no longer master. source: [create-index-template [logstash], cause [api]]
[2017-08-04T15:11:07,627][DEBUG][o.e.a.a.i.t.p.TransportPutIndexTemplateAction] [es1] master could not publish cluster state or stepped down before publishing action [indices:admin/template/put], scheduling a retry
org.elasticsearch.cluster.NotMasterException: no longer master. source: [create-index-template [logstash], cause [api]]
[2017-08-04T15:11:08,448][DEBUG][o.e.c.s.ClusterService ] [es1] processing [master ping (from: {es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300})]: execute
[2017-08-04T15:11:08,448][DEBUG][o.e.c.s.ClusterService ] [es1] failing [master ping (from: {es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300})]: local node is no longer master
[2017-08-04T15:11:08,513][DEBUG][o.e.c.s.ClusterService ] [es1] processing [master ping (from: {es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300})]: execute
[2017-08-04T15:11:08,513][DEBUG][o.e.c.s.ClusterService ] [es1] failing [master ping (from: {es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300})]: local node is no longer master
[2017-08-04T15:11:08,522][DEBUG][o.e.a.a.i.t.p.TransportPutIndexTemplateAction] [es1] timed out while retrying [indices:admin/template/put] after failure (timeout [30s])
org.elasticsearch.cluster.NotMasterException: no longer master. source: [create-index-template [logstash], cause [api]]
[2017-08-04T15:11:09,278][DEBUG][o.e.a.a.i.t.p.TransportPutIndexTemplateAction] [es1] timed out while retrying [indices:admin/template/put] after failure (timeout [30s])
org.elasticsearch.cluster.NotMasterException: no longer master. source: [create-index-template [logstash], cause [api]]
[2017-08-04T15:11:10,606][DEBUG][o.e.d.z.ZenDiscovery ] [es1] filtered ping responses: (ignore_non_masters [false])
--> ping_response{node [{es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300}], id[29], master [null],cluster_state_version [4], cluster_name[es-cluster]}
--> ping_response{node [{es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300}], id[29], master [null],cluster_state_version [4], cluster_name[es-cluster]}
--> ping_response{node [{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300}], id[36], master [null],cluster_state_version [4], cluster_name[es-cluster]}
[2017-08-04T15:11:10,609][DEBUG][o.e.d.z.ZenDiscovery ] [es1] elected as master, waiting for incoming joins ([1] needed)
[2017-08-04T15:11:11,513][DEBUG][o.e.c.s.ClusterService ] [es1] processing [zen-disco-elected-as-master ([1] nodes joined)[{es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300}]]: execute
[2017-08-04T15:11:11,513][DEBUG][o.e.d.z.NodeJoinController] [es1] received a join request for an existing node [{es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300}]
[2017-08-04T15:11:11,513][DEBUG][o.e.c.s.ClusterService ] [es1] cluster state updated, version [5], source [zen-disco-elected-as-master ([1] nodes joined)[{es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300}]]
[2017-08-04T15:11:11,513][INFO ][o.e.c.s.ClusterService ] [es1] new_master {es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300}, reason: zen-disco-elected-as-master ([1] nodes joined)[{es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300}]
[2017-08-04T15:11:11,513][DEBUG][o.e.c.s.ClusterService ] [es1] publishing cluster state version [5]
[2017-08-04T15:11:29,277][DEBUG][o.e.a.a.c.s.TransportClusterStateAction] [es1] no known master node, scheduling a retry
[2017-08-04T15:11:41,515][DEBUG][o.e.d.z.ZenDiscovery ] [es1] failed to publish cluster state version [5] (not enough nodes acknowledged, min master nodes [2])
[2017-08-04T15:11:41,515][WARN ][o.e.c.s.ClusterService ] [es1] failing [zen-disco-elected-as-master ([1] nodes joined)[{es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300}]]: failed to commit cluster state version [5]
org.elasticsearch.discovery.Discovery$FailedToCommitClusterStateException: timed out while waiting for enough masters to ack sent cluster state. [1] left
at org.elasticsearch.discovery.zen.PublishClusterStateAction$SendingController.waitForCommit(PublishClusterStateAction.java:574) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.PublishClusterStateAction.innerPublish(PublishClusterStateAction.java:202) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.PublishClusterStateAction.publish(PublishClusterStateAction.java:167) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.ZenDiscovery.publish(ZenDiscovery.java:311) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService.publishAndApplyChanges(ClusterService.java:741) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService.runTasks(ClusterService.java:587) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.run(ClusterService.java:263) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:247) [elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:210) [elasticsearch-5.5.0.jar:5.5.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
[2017-08-04T15:11:41,524][DEBUG][o.e.c.s.ClusterService ] [es1] processing [zen-disco-elected-as-master ([1] nodes joined)[{es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300}]]: took [30s] done applying updated cluster_state (version: 5, uuid: 31UI45IzQnGX2ypMqCFchw)
[2017-08-04T15:11:41,526][WARN ][o.e.c.s.ClusterService ] [es1] cluster state update task [zen-disco-elected-as-master ([1] nodes joined)[{es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300}]] took [30s] above the warn threshold of 30s
[2017-08-04T15:11:41,527][DEBUG][o.e.c.s.ClusterService ] [es1] processing [zen-disco-failed-to-publish]: execute
[2017-08-04T15:11:41,527][WARN ][o.e.d.z.ZenDiscovery ] [es1] zen-disco-failed-to-publish, current nodes: nodes:
{es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300}
{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300}, local
{es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300}
[2017-08-04T15:11:41,528][DEBUG][o.e.c.s.ClusterService ] [es1] processing [zen-disco-failed-to-publish]: took [0s] no change in cluster_state
[2017-08-04T15:11:41,528][DEBUG][o.e.c.s.ClusterService ] [es1] processing [zen-disco-node-join[{es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300}]]: execute
[2017-08-04T15:11:41,545][DEBUG][o.e.c.s.ClusterService ] [es1] processing [zen-disco-node-join[{es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300}]]: took [16ms] no change in cluster_state
[2017-08-04T15:11:44,517][DEBUG][o.e.d.z.ZenDiscovery ] [es1] filtered ping responses: (ignore_non_masters [false])
--> ping_response{node [{es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300}], id[41], master [null],cluster_state_version [4], cluster_name[es-cluster]}
--> ping_response{node [{es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300}], id[44], master [null],cluster_state_version [4], cluster_name[es-cluster]}
--> ping_response{node [{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300}], id[49], master [null],cluster_state_version [4], cluster_name[es-cluster]}
[2017-08-04T15:11:44,518][DEBUG][o.e.d.z.ZenDiscovery ] [es1] elected as master, waiting for incoming joins ([1] needed)
[2017-08-04T15:11:44,519][DEBUG][o.e.c.s.ClusterService ] [es1] processing [zen-disco-elected-as-master ([1] nodes joined)[{es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300}]]: execute
[2017-08-04T15:11:44,520][DEBUG][o.e.d.z.NodeJoinController] [es1] received a join request for an existing node [{es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300}]
[2017-08-04T15:11:44,522][DEBUG][o.e.c.s.ClusterService ] [es1] cluster state updated, version [5], source [zen-disco-elected-as-master ([1] nodes joined)[{es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300}]]
[2017-08-04T15:11:44,523][INFO ][o.e.c.s.ClusterService ] [es1] new_master {es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300}, reason: zen-disco-elected-as-master ([1] nodes joined)[{es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300}]
[2017-08-04T15:11:44,523][DEBUG][o.e.c.s.ClusterService ] [es1] publishing cluster state version [5]
[2017-08-04T15:11:53,984][DEBUG][o.e.a.a.c.s.TransportClusterStateAction] [es1] no known master node, scheduling a retry
node2:
[2017-08-04T15:11:07,510][WARN ][r.suppressed ] path: /_template/logstash, params: {name=logstash}
org.elasticsearch.transport.RemoteTransportException: [es1][10.10.0.3:9300][indices:admin/template/put]
Caused by: org.elasticsearch.discovery.MasterNotDiscoveredException: FailedToCommitClusterStateException[timed out while waiting for enough masters to ack sent cluster state. [1] left]
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.onTimeout(TransportMasterNodeAction.java:209) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:311) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.ClusterStateObserver.waitForNextChange(ClusterStateObserver.java:139) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.ClusterStateObserver.waitForNextChange(ClusterStateObserver.java:111) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction.retry(TransportMasterNodeAction.java:194) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction.access$500(TransportMasterNodeAction.java:107) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$1.onFailure(TransportMasterNodeAction.java:157) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.action.admin.indices.template.put.TransportPutIndexTemplateAction$1.onFailure(TransportPutIndexTemplateAction.java:101) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.metadata.MetaDataIndexTemplateService$2.onFailure(MetaDataIndexTemplateService.java:163) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService$SafeClusterStateTaskListener.onFailure(ClusterService.java:952) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService$TaskOutputs.lambda$publishingFailed$0(ClusterService.java:865) ~[elasticsearch-5.5.0.jar:5.5.0]
at java.util.ArrayList.forEach(ArrayList.java:1249) ~[?:1.8.0_131]
at org.elasticsearch.cluster.service.ClusterService$TaskOutputs.publishingFailed(ClusterService.java:865) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService.publishAndApplyChanges(ClusterService.java:751) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService.runTasks(ClusterService.java:587) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.run(ClusterService.java:263) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:247) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:210) ~[elasticsearch-5.5.0.jar:5.5.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
Caused by: org.elasticsearch.discovery.Discovery$FailedToCommitClusterStateException: timed out while waiting for enough masters to ack sent cluster state. [1] left
at org.elasticsearch.discovery.zen.PublishClusterStateAction$SendingController.waitForCommit(PublishClusterStateAction.java:574) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.PublishClusterStateAction.innerPublish(PublishClusterStateAction.java:202) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.PublishClusterStateAction.publish(PublishClusterStateAction.java:167) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.discovery.zen.ZenDiscovery.publish(ZenDiscovery.java:311) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService.publishAndApplyChanges(ClusterService.java:741) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService.runTasks(ClusterService.java:587) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService$ClusterServiceTaskBatcher.run(ClusterService.java:263) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:247) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:210) ~[elasticsearch-5.5.0.jar:5.5.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_131]
[2017-08-04T15:11:08,433][DEBUG][o.e.d.z.MasterFaultDetection] [es2] [master] pinging a master {es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300} that is no longer a master
[2017-08-04T15:11:08,435][INFO ][o.e.d.z.ZenDiscovery ] [es2] master_left [{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300}], reason [no longer master]
org.elasticsearch.transport.RemoteTransportException: [es1][10.10.0.3:9300][internal:discovery/zen/fd/master_ping]
Caused by: org.elasticsearch.cluster.NotMasterException: local node is not master
[2017-08-04T15:11:08,441][DEBUG][o.e.c.s.ClusterService ] [es2] processing [master_failed ({es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300})]: execute
[2017-08-04T15:11:08,442][WARN ][o.e.d.z.ZenDiscovery ] [es2] master left (reason = no longer master), current nodes: nodes:
{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300}, master
{es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300}
{es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300}, local
[2017-08-04T15:11:08,442][DEBUG][o.e.d.z.MasterFaultDetection] [es2] [master] stopping fault detection against master [{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300}], reason [master failure, no longer master]
[2017-08-04T15:11:08,443][DEBUG][o.e.c.s.ClusterService ] [es2] cluster state updated, version [4], source [master_failed ({es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300})]
[2017-08-04T15:11:08,448][DEBUG][o.e.c.s.ClusterService ] [es2] applying cluster state version 4
[2017-08-04T15:11:08,460][DEBUG][o.e.c.s.ClusterService ] [es2] set local cluster state to version 4
[2017-08-04T15:11:08,464][DEBUG][o.e.l.LicenseService ] [es2] previous [{"uid":"4c1ea454-031b-4e1e-b50d-676cfc012f3b","type":"trial","issue_date_in_millis":1501856084707,"expiry_date_in_millis":1504448084707,"max_nodes":1000,"issued_to":"es-cluster","issuer":"elasticsearch","signature":"/////QAAAPCynqArHS76IEhLjg3dxaWbsDzKiSuaTSTaaq/ecm9rpGnvztb1ERevKoo2hnRTeuo074GopHnZNWoR80gyrvZlbXCxzq8YTt+zbs+ld5OxOZU+tz264/0dTZGpm4bAgx4mb7hPeKVPYXZ/WH6t088uGgJh8Y84T376tXpGlIHBpGEoZ/A0gToEBPCBBz5wqs2itiioE8Of+S/U17Iy9J24bgSV1UGq/dAS2vGxtwmDloQ+vq5NTkXKkegGGm5Bb5wbkxsS5nIJq9Y9pdJmFYSE2zmdNz52OZOm0UVf1gW7T8/JptXAkVmEQCbGMkcz7BA=","start_date_in_millis":-1}]
[2017-08-04T15:11:08,468][DEBUG][o.e.l.LicenseService ] [es2] current [{"uid":"4c1ea454-031b-4e1e-b50d-676cfc012f3b","type":"trial","issue_date_in_millis":1501856084707,"expiry_date_in_millis":1504448084707,"max_nodes":1000,"issued_to":"es-cluster","issuer":"elasticsearch","signature":"/////QAAAPCynqArHS76IEhLjg3dxaWbsDzKiSuaTSTaaq/ecm9rpGnvztb1ERevKoo2hnRTeuo074GopHnZNWoR80gyrvZlbXCxzq8YTt+zbs+ld5OxOZU+tz264/0dTZGpm4bAgx4mb7hPeKVPYXZ/WH6t088uGgJh8Y84T376tXpGlIHBpGEoZ/A0gToEBPCBBz5wqs2itiioE8Of+S/U17Iy9J24bgSV1UGq/dAS2vGxtwmDloQ+vq5NTkXKkegGGm5Bb5wbkxsS5nIJq9Y9pdJmFYSE2zmdNz52OZOm0UVf1gW7T8/JptXAkVmEQCbGMkcz7BA=","start_date_in_millis":-1}]
[2017-08-04T15:11:08,468][DEBUG][o.e.c.s.ClusterService ] [es2] processing [master_failed ({es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300})]: took [27ms] done applying updated cluster_state (version: 4, uuid: wa16gWAEQ76p_bCu7rdugQ)
[2017-08-04T15:11:09,194][WARN ][r.suppressed ] path: /_template/logstash, params: {name=logstash}
org.elasticsearch.transport.RemoteTransportException: [es1][10.10.0.3:9300][indices:admin/template/put]
Caused by: org.elasticsearch.discovery.MasterNotDiscoveredException: NotMasterException[no longer master. source: [create-index-template [logstash], cause [api]]]
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.onTimeout(TransportMasterNodeAction.java:209) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:311) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:238) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService$NotifyTimeout.run(ClusterService.java:1056) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) ~[elasticsearch-5.5.0.jar:5.5.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
Caused by: org.elasticsearch.cluster.NotMasterException: no longer master. source: [create-index-template [logstash], cause [api]]
[2017-08-04T15:11:11,469][DEBUG][o.e.d.z.ZenDiscovery ] [es2] filtered ping responses: (ignore_non_masters [false])
--> ping_response{node [{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300}], id[35], master [null],cluster_state_version [4], cluster_name[es-cluster]}
--> ping_response{node [{es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300}], id[32], master [null],cluster_state_version [4], cluster_name[es-cluster]}
--> ping_response{node [{es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300}], id[31], master [null],cluster_state_version [4], cluster_name[es-cluster]}
[2017-08-04T15:11:11,475][DEBUG][o.e.c.s.ClusterService ] [es2] processing [zen-disco-election-stop [{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300} elected]]: execute
[2017-08-04T15:11:11,476][DEBUG][o.e.c.s.ClusterService ] [es2] processing [zen-disco-election-stop [{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300} elected]]: took [0s] no change in cluster_state
node3:
[2017-08-04T15:11:08,426][DEBUG][o.e.d.z.MasterFaultDetection] [es3] [master] pinging a master {es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300} that is no longer a master
[2017-08-04T15:11:08,427][DEBUG][o.e.d.z.MasterFaultDetection] [es3] [master] stopping fault detection against master [{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300}], reason [master failure, no longer master]
[2017-08-04T15:11:08,435][INFO ][o.e.d.z.ZenDiscovery ] [es3] master_left [{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300}], reason [no longer master]
org.elasticsearch.transport.RemoteTransportException: [es1][10.10.0.3:9300][internal:discovery/zen/fd/master_ping]
Caused by: org.elasticsearch.cluster.NotMasterException: local node is not master
[2017-08-04T15:11:08,447][DEBUG][o.e.c.s.ClusterService ] [es3] processing [master_failed ({es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300})]: execute
[2017-08-04T15:11:08,450][WARN ][o.e.d.z.ZenDiscovery ] [es3] master left (reason = no longer master), current nodes: nodes:
{es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300}
{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300}, master
{es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300}, local
[2017-08-04T15:11:08,457][DEBUG][o.e.c.s.ClusterService ] [es3] cluster state updated, version [4], source [master_failed ({es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300})]
[2017-08-04T15:11:08,462][DEBUG][o.e.c.s.ClusterService ] [es3] applying cluster state version 4
[2017-08-04T15:11:08,464][DEBUG][o.e.c.s.ClusterService ] [es3] set local cluster state to version 4
[2017-08-04T15:11:08,469][DEBUG][o.e.l.LicenseService ] [es3] previous [{"uid":"4c1ea454-031b-4e1e-b50d-676cfc012f3b","type":"trial","issue_date_in_millis":1501856084707,"expiry_date_in_millis":1504448084707,"max_nodes":1000,"issued_to":"es-cluster","issuer":"elasticsearch","signature":"/////QAAAPCynqArHS76IEhLjg3dxaWbsDzKiSuaTSTaaq/ecm9rpGnvztb1ERevKoo2hnRTeuo074GopHnZNWoR80gyrvZlbXCxzq8YTt+zbs+ld5OxOZU+tz264/0dTZGpm4bAgx4mb7hPeKVPYXZ/WH6t088uGgJh8Y84T376tXpGlIHBpGEoZ/A0gToEBPCBBz5wqs2itiioE8Of+S/U17Iy9J24bgSV1UGq/dAS2vGxtwmDloQ+vq5NTkXKkegGGm5Bb5wbkxsS5nIJq9Y9pdJmFYSE2zmdNz52OZOm0UVf1gW7T8/JptXAkVmEQCbGMkcz7BA=","start_date_in_millis":-1}]
[2017-08-04T15:11:08,477][DEBUG][o.e.l.LicenseService ] [es3] current [{"uid":"4c1ea454-031b-4e1e-b50d-676cfc012f3b","type":"trial","issue_date_in_millis":1501856084707,"expiry_date_in_millis":1504448084707,"max_nodes":1000,"issued_to":"es-cluster","issuer":"elasticsearch","signature":"/////QAAAPCynqArHS76IEhLjg3dxaWbsDzKiSuaTSTaaq/ecm9rpGnvztb1ERevKoo2hnRTeuo074GopHnZNWoR80gyrvZlbXCxzq8YTt+zbs+ld5OxOZU+tz264/0dTZGpm4bAgx4mb7hPeKVPYXZ/WH6t088uGgJh8Y84T376tXpGlIHBpGEoZ/A0gToEBPCBBz5wqs2itiioE8Of+S/U17Iy9J24bgSV1UGq/dAS2vGxtwmDloQ+vq5NTkXKkegGGm5Bb5wbkxsS5nIJq9Y9pdJmFYSE2zmdNz52OZOm0UVf1gW7T8/JptXAkVmEQCbGMkcz7BA=","start_date_in_millis":-1}]
[2017-08-04T15:11:08,481][DEBUG][o.e.c.s.ClusterService ] [es3] processing [master_failed ({es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300})]: took [30ms] done applying updated cluster_state (version: 4, uuid: wa16gWAEQ76p_bCu7rdugQ)
[2017-08-04T15:11:08,503][WARN ][r.suppressed ] path: /_template/logstash, params: {name=logstash}
org.elasticsearch.transport.RemoteTransportException: [es1][10.10.0.3:9300][indices:admin/template/put]
Caused by: org.elasticsearch.discovery.MasterNotDiscoveredException: NotMasterException[no longer master. source: [create-index-template [logstash], cause [api]]]
at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$4.onTimeout(TransportMasterNodeAction.java:209) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.ClusterStateObserver$ContextPreservingListener.onTimeout(ClusterStateObserver.java:311) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:238) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.cluster.service.ClusterService$NotifyTimeout.run(ClusterService.java:1056) ~[elasticsearch-5.5.0.jar:5.5.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) ~[elasticsearch-5.5.0.jar:5.5.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_131]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_131]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
Caused by: org.elasticsearch.cluster.NotMasterException: no longer master. source: [create-index-template [logstash], cause [api]]
[2017-08-04T15:11:11,478][DEBUG][o.e.d.z.ZenDiscovery ] [es3] filtered ping responses: (ignore_non_masters [false])
--> ping_response{node [{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300}], id[33], master [null],cluster_state_version [4], cluster_name[es-cluster]}
--> ping_response{node [{es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300}], id[29], master [null],cluster_state_version [4], cluster_name[es-cluster]}
--> ping_response{node [{es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300}], id[33], master [null],cluster_state_version [4], cluster_name[es-cluster]}
[2017-08-04T15:11:11,479][DEBUG][o.e.c.s.ClusterService ] [es3] processing [zen-disco-election-stop [{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300} elected]]: execute
[2017-08-04T15:11:11,479][DEBUG][o.e.c.s.ClusterService ] [es3] processing [zen-disco-election-stop [{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300} elected]]: took [0s] no change in cluster_state
[2017-08-04T15:11:41,494][INFO ][o.e.d.z.ZenDiscovery ] [es3] failed to send join request to master [{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300}], reason [RemoteTransportException[[es1][10.10.0.3:9300][internal:discovery/zen/join]]; nested: FailedToCommitClusterStateException[timed out while waiting for enough masters to ack sent cluster state. [1] left]; ]
[2017-08-04T15:11:41,496][DEBUG][o.e.c.s.ClusterService ] [es3] processing [finalize_join ({es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300})]: execute
[2017-08-04T15:11:41,496][DEBUG][o.e.c.s.ClusterService ] [es3] processing [finalize_join ({es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300})]: took [0s] no change in cluster_state
[2017-08-04T15:11:44,497][DEBUG][o.e.d.z.ZenDiscovery ] [es3] filtered ping responses: (ignore_non_masters [false])
--> ping_response{node [{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300}], id[48], master [null],cluster_state_version [4], cluster_name[es-cluster]}
--> ping_response{node [{es2}{98BIV3ZuQOeOtfVj_2KJcQ}{TFiTKrACQLKJUNKnW1prCQ}{10.10.0.4}{10.10.0.4:9300}], id[43], master [null],cluster_state_version [4], cluster_name[es-cluster]}
--> ping_response{node [{es3}{eeeJqvs-SYalWKLi59ltPQ}{JEKCM_ecREuvypXt4vmHEQ}{10.10.0.6}{10.10.0.6:9300}], id[46], master [null],cluster_state_version [4], cluster_name[es-cluster]}
[2017-08-04T15:11:44,499][DEBUG][o.e.c.s.ClusterService ] [es3] processing [zen-disco-election-stop [{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300} elected]]: execute
[2017-08-04T15:11:44,501][DEBUG][o.e.c.s.ClusterService ] [es3] processing [zen-disco-election-stop [{es1}{5xCwug_7QduQ15lhb9Gmhw}{zxhvNLd4QpW6nPaw61eUGg}{10.10.0.2}{10.10.0.2:9300} elected]]: took [0s] no change in cluster_state
Hi everybody! I can't deploy an Elasticsearch cluster. Tools: Docker version 17.06.1-ce, build 874a737
version: '3.3'
services:
  elasticsearch:
    image: elasticsearch:alpine
    ports:
      - '9200:9200'
      - '9300:9300'
    command: [ elasticsearch, -E, network.host=0.0.0.0, -E, discovery.zen.ping.unicast.hosts=elasticsearch, -E, discovery.zen.minimum_master_nodes=1, -E, cluster.name=mycluster ]
    networks:
      - esnet1
    environment:
      ES_JAVA_OPTS: "-Xmx512m -Xms512m"
    deploy:
      mode: replicated
      replicas: 2
      #endpoint_mode: dnsrr
      resources:
        limits:
          cpus: '2'
          memory: 1024M
        reservations:
          cpus: '0.50'
          memory: 512M
networks:
  esnet1:
Service log:
[2017-08-18T21:51:41,343][INFO ][o.e.n.Node ] [] initializing ...
[2017-08-18T21:51:41,448][INFO ][o.e.e.NodeEnvironment ] [KBs18kt] using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/mapper/server2--vg-root)]], net usable_space [1.6tb], net total_space [1.7tb], spins? [possibly], types [ext4]
[2017-08-18T21:51:41,448][INFO ][o.e.e.NodeEnvironment ] [KBs18kt] heap size [494.9mb], compressed ordinary object pointers [true]
[2017-08-18T21:51:41,449][INFO ][o.e.n.Node ] node name [KBs18kt] derived from node ID [KBs18ktkTqCla61SlehVPA]; set [node.name] to override
[2017-08-18T21:51:41,449][INFO ][o.e.n.Node ] version[5.5.1], pid[1], build[19c13d0/2017-07-18T20:44:24.823Z], OS[Linux/4.4.0-91-generic/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_131/25.131-b11]
[2017-08-18T21:51:41,450][INFO ][o.e.n.Node ] JVM arguments [-Xms2g, -Xmx2g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -Djdk.io.permissionsUseCanonicalPath=true, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Dlog4j.skipJansi=true, -XX:+HeapDumpOnOutOfMemoryError, -Xmx512m, -Xms512m, -Des.path.home=/usr/share/elasticsearch]
[2017-08-18T21:51:42,749][INFO ][o.e.p.PluginsService ] [KBs18kt] loaded module [aggs-matrix-stats]
[2017-08-18T21:51:42,749][INFO ][o.e.p.PluginsService ] [KBs18kt] loaded module [ingest-common]
[2017-08-18T21:51:42,749][INFO ][o.e.p.PluginsService ] [KBs18kt] loaded module [lang-expression]
[2017-08-18T21:51:42,749][INFO ][o.e.p.PluginsService ] [KBs18kt] loaded module [lang-groovy]
[2017-08-18T21:51:42,749][INFO ][o.e.p.PluginsService ] [KBs18kt] loaded module [lang-mustache]
[2017-08-18T21:51:42,749][INFO ][o.e.p.PluginsService ] [KBs18kt] loaded module [lang-painless]
[2017-08-18T21:51:42,749][INFO ][o.e.p.PluginsService ] [KBs18kt] loaded module [parent-join]
[2017-08-18T21:51:42,749][INFO ][o.e.p.PluginsService ] [KBs18kt] loaded module [percolator]
[2017-08-18T21:51:42,749][INFO ][o.e.p.PluginsService ] [KBs18kt] loaded module [reindex]
[2017-08-18T21:51:42,750][INFO ][o.e.p.PluginsService ] [KBs18kt] loaded module [transport-netty3]
[2017-08-18T21:51:42,750][INFO ][o.e.p.PluginsService ] [KBs18kt] loaded module [transport-netty4]
[2017-08-18T21:51:42,750][INFO ][o.e.p.PluginsService ] [KBs18kt] no plugins loaded
[2017-08-18T21:51:44,869][INFO ][o.e.d.DiscoveryModule ] [KBs18kt] using discovery type [zen]
[2017-08-18T21:51:45,900][INFO ][o.e.n.Node ] initialized
[2017-08-18T21:51:45,901][INFO ][o.e.n.Node ] [KBs18kt] starting ...
[2017-08-18T21:51:46,013][INFO ][o.e.t.TransportService ] [KBs18kt] publish_address {10.0.4.2:9300}, bound_addresses {0.0.0.0:9300}
[2017-08-18T21:51:46,020][INFO ][o.e.b.BootstrapChecks ] [KBs18kt] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-08-18T21:51:49,056][INFO ][o.e.c.s.ClusterService ] [KBs18kt] new_master {KBs18kt}{KBs18ktkTqCla61SlehVPA}{7wr2u2qjT8i1pyugtDmviA}{10.0.4.2}{10.0.4.2:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-08-18T21:51:49,135][INFO ][o.e.h.n.Netty4HttpServerTransport] [KBs18kt] publish_address {10.0.4.2:9200}, bound_addresses {0.0.0.0:9200}
[2017-08-18T21:51:49,135][INFO ][o.e.n.Node ] [KBs18kt] started
[2017-08-18T21:51:49,188][INFO ][o.e.g.GatewayService ] [KBs18kt] recovered [0] indices into cluster_state
[2017-08-18T21:51:40,759][INFO ][o.e.n.Node ] [] initializing ...
[2017-08-18T21:51:40,862][INFO ][o.e.e.NodeEnvironment ] [xKEFl_q] using [1] data paths, mounts [[/usr/share/elasticsearch/data (/dev/sda1)]], net usable_space [78.8gb], net total_space [101.5gb], spins? [possibly], types [ext4]
[2017-08-18T21:51:40,863][INFO ][o.e.e.NodeEnvironment ] [xKEFl_q] heap size [494.9mb], compressed ordinary object pointers [true]
[2017-08-18T21:51:40,865][INFO ][o.e.n.Node ] node name [xKEFl_q] derived from node ID [xKEFl_q-Q7a4IKiF2NrXJw]; set [node.name] to override
[2017-08-18T21:51:40,865][INFO ][o.e.n.Node ] version[5.5.1], pid[1], build[19c13d0/2017-07-18T20:44:24.823Z], OS[Linux/4.8.0-53-generic/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/1.8.0_131/25.131-b11]
[2017-08-18T21:51:40,866][INFO ][o.e.n.Node ] JVM arguments [-Xms2g, -Xmx2g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -XX:+AlwaysPreTouch, -Xss1m, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djna.nosys=true, -Djdk.io.permissionsUseCanonicalPath=true, -Dio.netty.noUnsafe=true, -Dio.netty.noKeySetOptimization=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Dlog4j.shutdownHookEnabled=false, -Dlog4j2.disable.jmx=true, -Dlog4j.skipJansi=true, -XX:+HeapDumpOnOutOfMemoryError, -Xmx512m, -Xms512m, -Des.path.home=/usr/share/elasticsearch]
[2017-08-18T21:51:42,763][INFO ][o.e.p.PluginsService ] [xKEFl_q] loaded module [aggs-matrix-stats]
[2017-08-18T21:51:42,763][INFO ][o.e.p.PluginsService ] [xKEFl_q] loaded module [ingest-common]
[2017-08-18T21:51:42,763][INFO ][o.e.p.PluginsService ] [xKEFl_q] loaded module [lang-expression]
[2017-08-18T21:51:42,763][INFO ][o.e.p.PluginsService ] [xKEFl_q] loaded module [lang-groovy]
[2017-08-18T21:51:42,763][INFO ][o.e.p.PluginsService ] [xKEFl_q] loaded module [lang-mustache]
[2017-08-18T21:51:42,763][INFO ][o.e.p.PluginsService ] [xKEFl_q] loaded module [lang-painless]
[2017-08-18T21:51:42,763][INFO ][o.e.p.PluginsService ] [xKEFl_q] loaded module [parent-join]
[2017-08-18T21:51:42,764][INFO ][o.e.p.PluginsService ] [xKEFl_q] loaded module [percolator]
[2017-08-18T21:51:42,764][INFO ][o.e.p.PluginsService ] [xKEFl_q] loaded module [reindex]
[2017-08-18T21:51:42,764][INFO ][o.e.p.PluginsService ] [xKEFl_q] loaded module [transport-netty3]
[2017-08-18T21:51:42,764][INFO ][o.e.p.PluginsService ] [xKEFl_q] loaded module [transport-netty4]
[2017-08-18T21:51:42,764][INFO ][o.e.p.PluginsService ] [xKEFl_q] no plugins loaded
[2017-08-18T21:51:45,470][INFO ][o.e.d.DiscoveryModule ] [xKEFl_q] using discovery type [zen]
[2017-08-18T21:51:46,902][INFO ][o.e.n.Node ] initialized
[2017-08-18T21:51:46,902][INFO ][o.e.n.Node ] [xKEFl_q] starting ...
[2017-08-18T21:51:47,060][INFO ][o.e.t.TransportService ] [xKEFl_q] publish_address {10.0.4.2:9300}, bound_addresses {0.0.0.0:9300}
[2017-08-18T21:51:47,072][INFO ][o.e.b.BootstrapChecks ] [xKEFl_q] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
[2017-08-18T21:51:50,133][INFO ][o.e.c.s.ClusterService ] [xKEFl_q] new_master {xKEFl_q}{xKEFl_q-Q7a4IKiF2NrXJw}{CnefaO9nTFaGjpg9EzI5xA}{10.0.4.2}{10.0.4.2:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
[2017-08-18T21:51:50,153][INFO ][o.e.h.n.Netty4HttpServerTransport] [xKEFl_q] publish_address {10.0.4.2:9200}, bound_addresses {0.0.0.0:9200}
[2017-08-18T21:51:50,154][INFO ][o.e.n.Node ] [xKEFl_q] started
[2017-08-18T21:51:50,163][INFO ][o.e.g.GatewayService ] [xKEFl_q] recovered [0] indices into cluster_state
Two instances of elasticsearch do not see each other. What am I doing wrong?
@IvanBiv the VIP-on-loopback fix is not in the 17.06 train. Being a new feature, it will be in the next release.
@fcrisciani I did not find a discussion of this problem at https://github.com/docker/docker-ce. How soon can we expect the release that adds this feature?
@muresan, @fcrisciani I have implemented @muresan's dnsrr mode by following your post.
But the problem I have is that my requirement is "discovery.zen.minimum_master_nodes=2".
Since I have defined the minimum master nodes as 2, even after 2 replica nodes have started they can't identify each other. ZenDiscovery (org.elasticsearch.discovery.MasterNotDiscoveredException) on each node repeatedly tries to ping the 2nd master and fails. Maybe due to network issues?
After adding network.host: 0.0.0.0 both nodes join the cluster. Is this the right way of doing it?
Even then I have issues like, for all existing indices, a DanglingIndicesState warning: can not be imported as a dangling index, as index with same name already exists in cluster metadata
Please give your suggestions @muresan, @fcrisciani
@erdarun yes, you need network.host: 0.0.0.0 or something similar, because the default is network.host: _local_, which binds to loopback.
DanglingIndicesState - I'm assuming this comes from volumes left over from previous stacks being created/deleted. You should not see it after a clean deploy (remove all volumes from all swarm nodes), unless you can ensure that the same container will always get the same volume.
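For what it's worth, a minimal clean-up sketch under that assumption (the stack name dev is just a placeholder; run it on every swarm node and review the volume list before deleting anything):
# on every swarm node: remove the stack, then delete the now-dangling volumes
sudo docker stack rm dev
sudo docker volume ls -qf dangling=true    # review this list first
sudo docker volume rm $(sudo docker volume ls -qf dangling=true)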
@IvanBiv most likely it will be the 17.09 train, which arrives in September; I will confirm once I have more details.
@fcrisciani thank you, that is good news! If you have any news please write here; an easy way to run an ES cluster is very much wanted!
@IvanBiv just wanted to confirm that docker 17.09.0-ce-rc1, available on the testing channel, already contains the fix and allows deploying the elasticsearch cluster as a docker swarm service.
I tested the example here again: https://github.com/elastic/elasticsearch-docker/issues/91#issuecomment-319698631
If you try the same example, deploy it with: sudo docker stack deploy -c compose.yml dev
The stack name has to be dev to match tasks.dev_elasticsearch.
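As a quick sanity check (just a sketch; <container-id> is a placeholder for any of the running elasticsearch task containers), you can confirm that the swarm DNS name the config points at actually resolves:
# list the elasticsearch tasks of the dev stack, then resolve the tasks.* name from inside one of them
sudo docker ps --filter name=dev_elasticsearch -q
sudo docker exec -it <container-id> nslookup tasks.dev_elasticsearch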
@fcrisciani, Thanks! It works!
@fcrisciani Thanks a lot for the update. The example seems to work for me locally, but once in swarm mode my elasticsearch service fails with the error "invalid mount config for type "bind": bind source path does not exist".
I am curious whether swarm mode works for you, @IvanBiv?
@shawnpanda I think you need to update Docker on the hosts. I have ES running as a distributed cluster.
@fcrisciani If I use another stack name it doesn't work. Working:
version: "3.3"
services:
elasticsearch:
image: elasticsearch:alpine
command: [ elasticsearch, -E, "network.host=_eth0:ipv4_", -E, discovery.zen.ping.unicast.hosts=tasks.dev_elasticsearch, -E, discovery.zen.minimum_master_nodes=2, -E, cluster.name=myclustersss ]
ports:
- "9200:9200"
- "9300:9300"
environment:
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
#volumes:
# - ./elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
networks:
- backend
deploy:
replicas: 3
kibana:
image: kibana
ports:
- "5601:5601"
networks:
- backend
networks:
backend:
attachable: true
sudo docker stack deploy -c compose.yml dev
Not working:
version: "3.3"
services:
elasticsearch:
image: elasticsearch:alpine
command: [ elasticsearch, -E, "network.host=_eth0:ipv4_", -E, discovery.zen.ping.unicast.hosts=tasks.sss_elasticsearch, -E, discovery.zen.minimum_master_nodes=2, -E, cluster.name=myclustersss ]
ports:
- "9200:9200"
- "9300:9300"
environment:
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
#volumes:
# - ./elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
networks:
- backend
deploy:
replicas: 3
kibana:
image: kibana
ports:
- "5601:5601"
networks:
- backend
networks:
backend:
attachable: true
sudo docker stack deploy -c compose.yml sss
@shawnpanda most likely the problem you are hitting is that you need the elasticsearch.yml configuration on all the nodes; as you can see in the compose file, for convenience I'm mounting it as a volume, so for the task to spawn correctly that file needs to be present. The other option is to do like @IvanBiv and pass the config as arguments in the launch command.
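As an alternative sketch (not what this thread uses): compose file format 3.3 also supports swarm configs, which distribute the file to whichever node runs the task, so no bind mount is needed on every host. The config name es_config and the local path are assumptions for illustration:
version: "3.3"
services:
  elasticsearch:
    image: elasticsearch:alpine
    configs:
      # swarm copies the config into the container at this path on whatever node runs the task
      - source: es_config
        target: /usr/share/elasticsearch/config/elasticsearch.yml
configs:
  es_config:
    file: ./elasticsearch.yml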
@IvanBiv I replicated the same test, and the reason it is not working is that the overlay interface is no longer eth0 but eth2. What I did is the following: docker inspect <id container elasticsearch> -f {{.NetworkSettings.Networks.sss_backend}}
the output is like:
{0xc42042e580 [] [4c72cefee3a3] 47w50i4m9m0kd3m1wo2dawwi5 8e0ebe6191b6440e2db2caddcbae0af2e56f4bd5bd71c64510071e3bdf56e7a3 10.0.0.3 24 0 02:42:0a:00:00:03 map[]}
You want to select the interface inside the container that has this IP and MAC: 10.0.0.3 or 02:42:0a:00:00:03. Today we don't have a real way to select the interface name or to guarantee the interface creation order; maybe that would be the next step to make things better.
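For example, a hedged way to pull just the address out and then look for the matching interface (the container ID is a placeholder; this assumes the busybox ifconfig applet in the alpine image):
# print only the overlay IP and MAC of the container on the sss_backend network
sudo docker inspect <id container elasticsearch> -f '{{.NetworkSettings.Networks.sss_backend.IPAddress}} {{.NetworkSettings.Networks.sss_backend.MacAddress}}'
# then, inside the container, find which ethN interface carries that address
sudo docker exec -it <id container elasticsearch> ifconfig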
@fcrisciani Thanks, but I didn't understand: can I use another stack name, or only "dev"?
@IvanBiv yes, try to change "network.host=_eth0:ipv4_" to "network.host=_eth2:ipv4_" and verify that eth2 is actually the interface of the overlay network sss_backend. If so, that will work.
Interfaces are created from the networks and are ordered lexicographically, so sss is now the last one :) for this reason what used to be eth0 is now eth2; if you deployed it as aaa it would be eth0 again.
@fcrisciani ok, I understood you. I will try.
@fcrisciani @IvanBiv you don't need to use discovery.zen.ping.unicast.hosts=tasks.dev_elasticsearch; it will work with just tasks.<servicename>, no need to add the stack name. That keeps the compose file independent of the name of the stack. Now the only problem remaining is the order of the interfaces. Hint: https://github.com/docker/libnetwork/issues/1888 :)
@fcrisciani Hi, I tested your compose (I changed only a few things to fit my needs) and everything works OK. But... if I insert the healthcheck block it stops working:
version: "3.3"
services:
elasticsearch:
image: elasticsearch:alpine
command: [ elasticsearch, -E, "network.host=_eth0:ipv4_", -E, discovery.zen.ping.unicast.hosts=tasks.elasticsearch, -E, discovery.zen.minimum_master_nodes=2, -E, cluster.name=es-cluster ]
ports:
- "9200:9200"
- "9300:9300"
environment:
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
networks:
- backend
healthcheck:
test: ping -c1 localhost >/dev/null 2>&1 || exit 1
interval: 1m
timeout: 10s
retries: 3
deploy:
mode: global
networks:
backend:
attachable: true
Container log:
...
dev_elasticsearch.0.m1g998oi5rvo@docker3 | [2017-10-02T22:16:21,176][INFO ][o.e.n.Node ] initialized
dev_elasticsearch.0.m1g998oi5rvo@docker3 | [2017-10-02T22:16:21,176][INFO ][o.e.n.Node ] [qJKdehi] starting ...
dev_elasticsearch.0.m1g998oi5rvo@docker3 | [2017-10-02T22:16:21,254][INFO ][o.e.t.TransportService ] [qJKdehi] publish_address {10.0.0.5:9300}, bound_addresses {10.0.0.5:9300}
dev_elasticsearch.0.m1g998oi5rvo@docker3 | [2017-10-02T22:16:21,260][INFO ][o.e.b.BootstrapChecks ] [qJKdehi] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
dev_elasticsearch.0.m1g998oi5rvo@docker3 | [2017-10-02T22:16:21,280][WARN ][o.e.d.z.UnicastZenPing ] [qJKdehi] failed to resolve host [tasks.elasticsearch]
dev_elasticsearch.0.m1g998oi5rvo@docker3 | java.net.UnknownHostException: tasks.elasticsearch: Name does not resolve
dev_elasticsearch.0.m1g998oi5rvo@docker3 | at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method) ~[?:1.8.0_131]
dev_elasticsearch.0.m1g998oi5rvo@docker3 | at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928) ~[?:1.8.0_131]
dev_elasticsearch.0.m1g998oi5rvo@docker3 | at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323) ~[?:1.8.0_131]
dev_elasticsearch.0.m1g998oi5rvo@docker3 | at java.net.InetAddress.getAllByName0(InetAddress.java:1276) ~[?:1.8.0_131]
dev_elasticsearch.0.m1g998oi5rvo@docker3 | at java.net.InetAddress.getAllByName(InetAddress.java:1192) ~[?:1.8.0_131]
dev_elasticsearch.0.m1g998oi5rvo@docker3 | at java.net.InetAddress.getAllByName(InetAddress.java:1126) ~[?:1.8.0_131]
dev_elasticsearch.0.m1g998oi5rvo@docker3 | at org.elasticsearch.transport.TcpTransport.parse(TcpTransport.java:921) ~[elasticsearch-5.5.2.jar:5.5.2]
dev_elasticsearch.0.m1g998oi5rvo@docker3 | at org.elasticsearch.transport.TcpTransport.addressesFromString(TcpTransport.java:876) ~[elasticsearch-5.5.2.jar:5.5.2]
dev_elasticsearch.0.m1g998oi5rvo@docker3 | at org.elasticsearch.transport.TransportService.addressesFromString(TransportService.java:691) ~[elasticsearch-5.5.2.jar:5.5.2]
dev_elasticsearch.0.m1g998oi5rvo@docker3 | at org.elasticsearch.discovery.zen.UnicastZenPing.lambda$null$0(UnicastZenPing.java:212) ~[elasticsearch-5.5.2.jar:5.5.2]
dev_elasticsearch.0.m1g998oi5rvo@docker3 | at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_131]
dev_elasticsearch.0.m1g998oi5rvo@docker3 | at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.5.2.jar:5.5.2]
dev_elasticsearch.0.m1g998oi5rvo@docker3 | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
dev_elasticsearch.0.m1g998oi5rvo@docker3 | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
dev_elasticsearch.0.m1g998oi5rvo@docker3 | at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
dev_elasticsearch.0.wjr53bkc7plt@docker2 | [2017-10-02T22:16:21,309][INFO ][o.e.n.Node ] initialized
dev_elasticsearch.0.wjr53bkc7plt@docker2 | [2017-10-02T22:16:21,310][INFO ][o.e.n.Node ] [c1rMKJf] starting ...
dev_elasticsearch.0.1b8miso3adcm@docker1 | [2017-10-02T22:16:21,385][INFO ][o.e.n.Node ] initialized
dev_elasticsearch.0.1b8miso3adcm@docker1 | [2017-10-02T22:16:21,385][INFO ][o.e.n.Node ] [gAPXef-] starting ...
dev_elasticsearch.0.wjr53bkc7plt@docker2 | [2017-10-02T22:16:21,401][INFO ][o.e.t.TransportService ] [c1rMKJf] publish_address {10.0.0.4:9300}, bound_addresses {10.0.0.4:9300}
dev_elasticsearch.0.wjr53bkc7plt@docker2 | [2017-10-02T22:16:21,407][INFO ][o.e.b.BootstrapChecks ] [c1rMKJf] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
dev_elasticsearch.0.wjr53bkc7plt@docker2 | [2017-10-02T22:16:21,427][WARN ][o.e.d.z.UnicastZenPing ] [c1rMKJf] failed to resolve host [tasks.elasticsearch]
dev_elasticsearch.0.wjr53bkc7plt@docker2 | java.net.UnknownHostException: tasks.elasticsearch: Name does not resolve
dev_elasticsearch.0.wjr53bkc7plt@docker2 | at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method) ~[?:1.8.0_131]
dev_elasticsearch.0.wjr53bkc7plt@docker2 | at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:928) ~[?:1.8.0_131]
dev_elasticsearch.0.wjr53bkc7plt@docker2 | at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1323) ~[?:1.8.0_131]
dev_elasticsearch.0.wjr53bkc7plt@docker2 | at java.net.InetAddress.getAllByName0(InetAddress.java:1276) ~[?:1.8.0_131]
dev_elasticsearch.0.wjr53bkc7plt@docker2 | at java.net.InetAddress.getAllByName(InetAddress.java:1192) ~[?:1.8.0_131]
dev_elasticsearch.0.wjr53bkc7plt@docker2 | at java.net.InetAddress.getAllByName(InetAddress.java:1126) ~[?:1.8.0_131]
dev_elasticsearch.0.wjr53bkc7plt@docker2 | at org.elasticsearch.transport.TcpTransport.parse(TcpTransport.java:921) ~[elasticsearch-5.5.2.jar:5.5.2]
dev_elasticsearch.0.wjr53bkc7plt@docker2 | at org.elasticsearch.transport.TcpTransport.addressesFromString(TcpTransport.java:876) ~[elasticsearch-5.5.2.jar:5.5.2]
dev_elasticsearch.0.wjr53bkc7plt@docker2 | at org.elasticsearch.transport.TransportService.addressesFromString(TransportService.java:691) ~[elasticsearch-5.5.2.jar:5.5.2]
dev_elasticsearch.0.wjr53bkc7plt@docker2 | at org.elasticsearch.discovery.zen.UnicastZenPing.lambda$null$0(UnicastZenPing.java:212) ~[elasticsearch-5.5.2.jar:5.5.2]
dev_elasticsearch.0.wjr53bkc7plt@docker2 | at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_131]
dev_elasticsearch.0.wjr53bkc7plt@docker2 | at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.5.2.jar:5.5.2]
dev_elasticsearch.0.wjr53bkc7plt@docker2 | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
dev_elasticsearch.0.wjr53bkc7plt@docker2 | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
dev_elasticsearch.0.wjr53bkc7plt@docker2 | at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
...
Without the healthcheck block the errors do not appear. Why? It looks similar to: https://forums.docker.com/t/healthcheck-differences-between-docker-compose-and-docker-engine-swarm/29126 , but the healthcheck should not fail in this case.
@kladiv can you try to use: discovery.zen.ping.unicast.hosts=tasks.dev_elasticsearch, where dev is the same as docker stack deploy -c compose.yml dev
@fcrisciani the same kind of error occurs:
...
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | [2017-10-02T23:22:27,915][WARN ][o.e.n.Node ] [g-8rhE5] timed out while waiting for initial discovery state - timeout: 30s
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | [2017-10-02T23:22:27,925][INFO ][o.e.h.n.Netty4HttpServerTransport] [g-8rhE5] publish_address {10.0.0.2:9200}, bound_addresses {0.0.0.0:9200}
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | [2017-10-02T23:22:27,925][INFO ][o.e.n.Node ] [g-8rhE5] started
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | [2017-10-02T23:22:27,965][WARN ][o.e.d.z.ZenDiscovery ] [g-8rhE5] not enough master nodes discovered during pinging (found [[Candidate{node={g-8rhE5}{g-8rhE5SQMmXKv5GmC7pHQ}{xfzLk5EgQCW7Ikm4-3nGdg}{10.0.0.3}{10.0.0.3:9300}, clusterStateVersion=-1}]], but needed [2]), pinging again
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | [2017-10-02T23:22:27,965][WARN ][o.e.d.z.UnicastZenPing ] [g-8rhE5] failed to resolve host [tasks.dev_elasticsearch]
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | java.net.UnknownHostException: tasks.dev_elasticsearch
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | at java.net.InetAddress.getAllByName0(InetAddress.java:1280) ~[?:1.8.0_131]
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | at java.net.InetAddress.getAllByName(InetAddress.java:1192) ~[?:1.8.0_131]
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | at java.net.InetAddress.getAllByName(InetAddress.java:1126) ~[?:1.8.0_131]
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | at org.elasticsearch.transport.TcpTransport.parse(TcpTransport.java:921) ~[elasticsearch-5.5.2.jar:5.5.2]
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | at org.elasticsearch.transport.TcpTransport.addressesFromString(TcpTransport.java:876) ~[elasticsearch-5.5.2.jar:5.5.2]
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | at org.elasticsearch.transport.TransportService.addressesFromString(TransportService.java:691) ~[elasticsearch-5.5.2.jar:5.5.2]
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | at org.elasticsearch.discovery.zen.UnicastZenPing.lambda$null$0(UnicastZenPing.java:212) ~[elasticsearch-5.5.2.jar:5.5.2]
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_131]
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:569) [elasticsearch-5.5.2.jar:5.5.2]
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_131]
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_131]
dev_elasticsearch.0.mcgbp2tz9k7m@docker1 | at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
...
@kladiv I just tested your compose file and it works for me. True, there are DNS resolution errors at the beginning where tasks.elasticsearch is not resolvable, but it becomes resolvable after ~20s or so and then the cluster forms. ES has a networkaddress.cache.negative.ttl=10 setting, so it will cache the fact that tasks.elasticsearch doesn't resolve for 10s; combined with the retry, it takes some time. But in the end:
vagrant@m01:~$ curl 192.168.124.100:9200/_cat/nodes
10.0.0.3 30 56 0 0.00 0.14 0.26 mdi - wI0sOw1
10.0.0.4 23 55 1 0.08 0.21 0.30 mdi - SupF8Tv
10.0.0.5 29 60 1 0.01 0.19 0.35 mdi - ZA3dlr1
10.0.0.8 32 55 2 0.19 0.23 0.30 mdi * cvslCwk
10.0.0.7 28 63 0 0.13 0.24 0.33 mdi - V5h5Vxq
10.0.0.6 21 60 0 0.00 0.16 0.31 mdi - t4TrjmZ
my test swarm has 3m+3w.
initially:
/usr/share/elasticsearch # ping tasks.elasticsearch
ping: bad address 'tasks.elasticsearch'
but then:
/usr/share/elasticsearch # ping tasks.elasticsearch
PING tasks.elasticsearch (10.0.0.7): 56 data bytes
64 bytes from 10.0.0.7: seq=0 ttl=64 time=0.056 ms
64 bytes from 10.0.0.7: seq=1 ttl=64 time=0.105 ms
^C
--- tasks.elasticsearch ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.056/0.080/0.105 ms
/usr/share/elasticsearch # dig elasticsearch
....
;; QUESTION SECTION:
;tasks.elasticsearch. IN A
;; ANSWER SECTION:
tasks.elasticsearch. 600 IN A 10.0.0.8
tasks.elasticsearch. 600 IN A 10.0.0.7
tasks.elasticsearch. 600 IN A 10.0.0.3
tasks.elasticsearch. 600 IN A 10.0.0.6
tasks.elasticsearch. 600 IN A 10.0.0.4
tasks.elasticsearch. 600 IN A 10.0.0.5
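If that initial delay matters, one possible workaround (an assumption, not something verified in this thread) is to shrink the JVM's negative DNS cache via the legacy system-property counterpart of networkaddress.cache.negative.ttl, for example through ES_JAVA_OPTS:
environment:
  # assumption: -Dsun.net.inetaddr.negative.ttl=0 disables caching of failed lookups,
  # so the nodes retry tasks.elasticsearch as soon as it becomes resolvable
  - "ES_JAVA_OPTS=-Xms512m -Xmx512m -Dsun.net.inetaddr.negative.ttl=0"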
@fcrisciani OK, I will test ASAP. What do you think is the cause of this difference? It's strange that startup is faster only without the healthcheck.
Thanks @muresan, I did not know about that caching Elastic does; it all actually makes sense. @kladiv, the health check has the specific purpose of validating that the application inside the container is actually ready to do work, so until the container is marked as healthy it won't appear as a possible destination through the DNS or the internal load balancer. The reasoning behind this is to avoid that scaling up a service immediately adds tasks that are not yet capable of handling work. In this case the container with elastic starts but takes some time before being marked as healthy, so if the others try to resolve it they won't see it immediately; and because of the caching, even once it becomes healthy the other elastic instances are not retrying the DNS resolution and keep seeing the same empty task list. Let me know if you have any questions.
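One way to see that interplay for yourself (a sketch; the stack is assumed to be deployed as dev) is to compare what swarm DNS currently advertises with the task health reported by swarm:
# what the internal DNS advertises right now
sudo docker exec -it <any dev_elasticsearch container> nslookup tasks.dev_elasticsearch
# which tasks exist and which containers are already marked healthy
sudo docker service ps dev_elasticsearch
sudo docker ps --filter name=dev_elasticsearch --filter health=healthy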
@fcrisciani thank you for your examples, they work well.
However, maybe someone can help me: I cannot figure out how to access this cluster from outside the docker network. I tried changing the ports to - "0.0.0.0:9200:9200" and still cannot get access to it. It is available only from inside any of the docker containers.
curl http://localhost:9200 curl: (7) Failed to connect to localhost port 9200: Connection refused
Any ideas how to expose it to the master server?
@darklow you need to expose the port. If you run docker service create you need to specify -p 9200:9200; if you use the compose file that I added, that should already work. You can have issues only if the port is already used by some other service.
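For reference, a minimal docker service create equivalent might look roughly like this (network and cluster names are placeholders, mirroring the compose files above):
sudo docker network create -d overlay --attachable backend
sudo docker service create --name elasticsearch --replicas 3 --network backend \
  -p 9200:9200 -p 9300:9300 -e "ES_JAVA_OPTS=-Xms512m -Xmx512m" \
  elasticsearch:alpine elasticsearch -E "network.host=_eth0:ipv4_" \
  -E discovery.zen.ping.unicast.hosts=tasks.elasticsearch \
  -E discovery.zen.minimum_master_nodes=2 -E cluster.name=es-cluster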
@fcrisciani I use docker-compose.yml and the command I use to deploy is:
docker stack deploy --with-registry-auth -c deploy/swarm/docker-compose.yml cp
docker-compose.yml:
version: "3.3"
services:
es:
image: elasticsearch:5.6-alpine
command: [ elasticsearch, -E, "network.host=_eth0:ipv4_", -E, discovery.zen.ping.unicast.hosts=tasks.es, -E, discovery.zen.minimum_master_nodes=2, -E, cluster.name=my-cluster ]
ports:
- "9200:9200"
- "9300:9300"
environment:
- "ES_JAVA_OPTS=-Xms2g -Xmx2g"
networks:
- cp
deploy:
replicas: 2
networks:
cp:
attachable: true
However, when I log into any of the instances and try to access port 9200 I get Connection refused:
root@cp1:~# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
765cd85c977f elasticsearch:5.6-alpine "/docker-entrypoin..." 4 hours ago Up 4 hours 9200/tcp, 9300/tcp cp_es.2.ngv5kpos75m8hvpsmyux5gzzp
root@cp1:~# curl http://localhost:9200
curl: (7) Failed to connect to localhost port 9200: Connection refused
But if I log into any of the docker containers I can see that ES is running:
root@cp1:~# docker exec -it 765 /usr/bin/wget http://tasks.es:9200/ -O-
Connecting to tasks.es:9200 (10.0.0.4:9200)
{
"name" : "ISDgGMQ",
"cluster_name" : "my-cluster",
"cluster_uuid" : "1xY0qSqqSGSBAqzVltSVlw",
"version" : {
"number" : "5.6.4",
"build_hash" : "8bbedf5",
"build_date" : "2017-10-31T18:55:38.105Z",
"build_snapshot" : false,
"lucene_version" : "6.6.1"
},
"tagline" : "You Know, for Search"
}
Can you try curl -4 to force the use of ipv4?
@fcrisciani Thanks for trying to help; unfortunately curl -4 didn't help. I figured it out and it was a completely different issue, and apparently it is by design. I tried changing the ports parameters in my docker-compose.yml and noticed that it doesn't affect the way ports are exposed outside of the swarm cluster: even if I put 9205:9205 it would still show 9200/tcp in docker ps and docker inspect.
Until I found this: https://docs.docker.com/engine/swarm/services/#publish-a-services-ports-directly-on-the-swarm-node
So I needed to specifically set mode: host, and now it works!
ports:
  - published: 9200
    target: 9200
    protocol: tcp
    mode: host
Not sure if this applies only to recent swarm/docker versions or if it has always been like that, but it finally works and I can access elasticsearch within the swarm cluster and from any instance as well.
Although, to be honest, I still don't get why I can't access it without using mode: host, which automatically exposes the port as 0.0.0.0:9200 while I would like to expose it on localhost; "9200:9200" or "127.0.0.1:9200:9200" doesn't work and I receive Connection refused :/
@darklow it really depends on what you want to do. If you want to expose a port at the level of the cluster, so that any swarm node will expose your service no matter which node the container runs on, then you can use the compose file that I was using. The only thing to notice on Linux machines is that if you have a dual IPv4/IPv6 stack, IPv6 is preferred, so the curl command can fail because it tries to use IPv6. If you instead want to expose a port only on a specific node, then yes, host mode is the way to go. You can also change the public port exposed if the two numbers do not match, like -p 10000:5000: this means you are exposing port 10000 on the host and it will be hooked up to port 5000 inside the container. So for elastic you can do -p 10000:9200 and you should be able to reach elastic using the IP of the host and port 10000.
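In compose long syntax the same remapping would look roughly like this (port 10000 is just the illustrative value from the comment above; mode: ingress is the default routing-mesh behaviour):
ports:
  - published: 10000
    target: 9200
    protocol: tcp
    mode: ingress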
What I was trying to achieve apparently is not possible at the moment (mapping 127.0.0.1:9200:9200 so that I can use nginx outside of the swarm cluster to proxy port 9200 without exposing 9200 publicly); here is the specific docker issue: https://github.com/moby/moby/issues/32299 So now that I know this, I decided to run nginx inside, as part of the swarm cluster, and proxy port 9200 to the public through nginx's exposed port with some basic auth (which actually makes more sense).
Feature Description
Please provide native support for Docker Swarm stacks.
I found an unofficial patch here: https://github.com/a-goryachev/docker-swarm-elasticsearch but would prefer an official solution.