elastic / elasticsearch-cloud-aws

AWS Cloud Plugin for Elasticsearch
https://github.com/elastic/elasticsearch/tree/master/plugins/discovery-ec2

EC2 discovery does not work in us-west-2 (ES 1.7.5 plugin 2.7.1) #277

Closed by bvorosadmin 7 years ago

bvorosadmin commented 7 years ago

Hello,

I have been using the EC2 discovery plugin successfully in eu-west-1 (Ireland) without any issues. However, in us-west-2 (Oregon) it would only work intermittently until last week when it stopped completely.

Everything is set up the same way. There is an IAM role with a policy that allows ec2:DescribeInstances.

The following AWS CLI command works on the instance, and the response is fast:

aws ec2 describe-instances --region=us-west-2 --filters "Name=instance.group-name,Values=elasticsearchCluster"

IAM role and policy:

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Action": [ "ec2:DescribeInstances" ],
          "Effect": "Allow",
          "Resource": [ "*" ]
        }
      ]
    }

Elasticsearch.yml is the same in both regions, apart from the region:

    bootstrap.mlockall: true
    discovery.zen.ping.multicast.enabled: false
    cluster.name: thisEsCluster
    node.name: ${HOSTNAME}
    cloud.aws.region: us-west-2|eu-west-1
    discovery.type: ec2
    discovery.groups: elasticsearchCluster
    discovery.ping_timeout: 60s
    cloud.node.auto_attributes: true
    discovery.zen.minimum_master_nodes: 2

Log from an affected node:

    [2016-08-02 09:54:02,095][INFO ][node ] [ip-10-20-30-40] version[1.7.5], pid[3079], build[00f95f4/2016-02-02T09:55:30Z]
    [2016-08-02 09:54:02,095][INFO ][node ] [ip-10-20-30-40] initializing ...
    [2016-08-02 09:54:02,162][INFO ][plugins ] [ip-10-20-30-40] loaded [cloud-aws], sites []
    [2016-08-02 09:54:02,193][INFO ][env ] [ip-10-20-30-40] using [1] data paths, mounts [[/ (/dev/xvda1)]], net usable_space [119.2gb], net total_space [125.8gb], types [ext4]
    [2016-08-02 09:54:05,185][INFO ][node ] [ip-10-20-30-40] initialized
    [2016-08-02 09:54:05,185][INFO ][node ] [ip-10-20-30-40] starting ...
    [2016-08-02 09:54:05,321][INFO ][transport ] [ip-10-20-30-40] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/10.20.30.40:9300]}
    [2016-08-02 09:54:05,345][INFO ][discovery ] [ip-10-20-30-40] thisEsCluster/K0C8nEk9QYyk2g2A5eEitQ
    [2016-08-02 09:54:35,345][WARN ][discovery ] [ip-10-20-30-40] waited for 30s and no initial state was set by the discovery
    [2016-08-02 09:54:35,352][INFO ][http ] [ip-10-20-30-40] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/10.20.30.40:9200]}
    [2016-08-02 09:54:35,353][INFO ][node ] [ip-10-20-30-40] started
    [2016-08-02 09:57:56,154][DEBUG][action.admin.indices.get ] [ip-10-20-30-40] no known master node, scheduling a retry
    [2016-08-02 09:58:26,155][DEBUG][action.admin.indices.get ] [ip-10-20-30-40] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
    [2016-08-02 10:03:33,188][DEBUG][action.admin.indices.get ] [ip-10-20-30-40] no known master node, scheduling a retry
    [2016-08-02 10:04:03,188][DEBUG][action.admin.indices.get ] [ip-10-20-30-40] observer: timeout notification from cluster service. timeout setting [30s], time since start [30s]
    [2016-08-02 10:07:32,964][DEBUG][action.admin.indices.get ] [ip-10-20-30-40] no known master node, scheduling a retry

Where should I look, and for what, to determine the cause? The role and policy are identical, and the ES clusters are launched by a script, so everything is the same. I am not sure whether other regions are affected.

Thanks in advance,

dadoonet commented 7 years ago

You could try setting the discovery and cloud packages to TRACE in logging.yml. Maybe it will give some clues.

If that does not produce any trace output, try org.elasticsearch.discovery and org.elasticsearch.cloud instead.
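
A minimal sketch of what that change might look like in logging.yml. It assumes the stock Elasticsearch 1.x layout, where per-package levels sit under a top-level logger: key; the rest of the file is not shown and stays unchanged.

```yaml
# logging.yml (fragment only; assumes the default ES 1.x layout)
logger:
  # discovery package, as suggested above
  discovery: TRACE
  # cloud package, as suggested above
  cloud: TRACE
  # Fallback if the short names above produce no trace output:
  # org.elasticsearch.discovery: TRACE
  # org.elasticsearch.cloud: TRACE
```

If the file already has a logger: section, these entries would go into it rather than into a second logger: key, and the node needs a restart to pick up the change.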

bvorosadmin commented 7 years ago

Thanks a lot for the quick response. This can now be closed. It was human error: a typo in elasticsearch.yml. I am sorry. As a result, discovery was running with default values, which worked in the EU region but not in Oregon.

The discovery settings were missing the "ec2" part of the key (for example, discovery.groups instead of discovery.ec2.groups).

All is well with the correct settings:

    bootstrap.mlockall: true
    discovery.zen.ping.multicast.enabled: false
    cluster.name: thisEsCluster
    node.name: ${HOSTNAME}
    cloud.aws.region: us-west-2
    discovery.type: ec2
    discovery.ec2.groups: elasticsearchCluster
    discovery.ec2.any_group: false
    discovery.ec2.ping_timeout: 60s
    cloud.node.auto_attributes: true
    discovery.zen.minimum_master_nodes: 2

dadoonet commented 7 years ago

I see. Thanks for the follow-up. BTW, next time please use discuss.elastic.co instead.