elastic / elasticsearch-cloud-aws

AWS Cloud Plugin for Elasticsearch
https://github.com/elastic/elasticsearch/tree/master/plugins/discovery-ec2
577 stars 181 forks source link

Exception while retrieving instance list from AWS API #219

Closed jalateras closed 8 years ago

jalateras commented 9 years ago

I have recently upgraded to ES 1.6 and cloud-aws 2.6.0. When i restarted the server i started getting these errors

[2015-06-23 11:59:48,497][INFO ][discovery.ec2            ] [Staging Server] Exception while retrieving instance list from AWS API: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 0cb1a125-8f6b-43ad-910e-5de4201d518e)
[2015-06-23 11:59:53,514][INFO ][discovery.ec2            ] [Staging Server] Exception while retrieving instance list from AWS API: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 130aca55-aec2-48de-838e-b4125f15ae7c)
[2015-06-23 11:59:58,526][INFO ][discovery.ec2            ] [Staging Server] Exception while retrieving instance list from AWS API: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: c70385a2-9cba-40b4-b190-1efab0024b0a)

but the server still starts up correctly. I am curious why this is happening since i am using the same credentials as before the upgrade

################################## Discovery ##################################

discovery.type: ec2
discovery.ec2.groups: xxxx-instance

discovery.zen.minimum_master_nodes: 1

cloud.node.auto_attributes: true
cloud.aws.access_key: xxxxxxxxx
cloud.aws.secret_key: xxxxxxx
cloud.aws.region: xxxxxx

discovery.zen.ping.timeout: 10s
discovery.zen.ping.retries: 5
discovery.zen.ping.interval: 15s

Any ideas?

jalateras commented 9 years ago

The credentials i am using for the plugin are the same credentials i am using with the aws cli tool and i can definitely get a list of instances using 'aws ec2 describe-instances'

dadoonet commented 9 years ago

Could you modify config/logging.yml and set discovery to TRACE? Then paste here the logs?

Feel free to replace all credential in logs by XXXX if needed.

dadoonet commented 9 years ago

BTW which versions (elasticsearch and cloud-aws plugin) were you using before the upgrade?

jalateras commented 9 years ago

i upgraded from 2.4.1

jalateras commented 9 years ago

Here are my discovery settings

discovery.type: ec2
discovery.ec2.groups: xxxxxxx
discovery.zen.minimum_master_nodes: 1

cloud.node.auto_attributes: true
cloud.aws.access_key: xxxxxxxx
cloud.aws.secret_key:xxxxxxxx
cloud.aws.region: ap-southeast-2

discovery.zen.ping.timeout: 10s discovery.zen.ping.retries: 5 discovery.zen.ping.interval: 15s

jalateras commented 9 years ago

Here is the trace level logging

[2015-06-24 08:10:46,523][INFO ][node                     ] [Staging Server] version[1.6.0], pid[15848], build[cdd3ac4/2015-06-09T13:36:34Z]
[2015-06-24 08:10:46,523][INFO ][node                     ] [Staging Server] initializing ...
[2015-06-24 08:10:46,541][INFO ][plugins                  ] [Staging Server] loaded [cloud-aws], sites [head, kopf, bigdesk]
[2015-06-24 08:10:46,581][INFO ][env                      ] [Staging Server] using [1] data paths, mounts [[/vol (/dev/xvdd)]], net usable_space [147.2gb], net total_space [157.3gb], types [ext4]
[2015-06-24 08:10:48,378][DEBUG][discovery.zen.elect      ] [Staging Server] using minimum_master_nodes [1]
[2015-06-24 08:10:48,380][DEBUG][discovery.zen.ping.multicast] [Staging Server] using group [224.2.2.4], with port [54328], ttl [3], and address [null]
[2015-06-24 08:10:48,384][DEBUG][discovery.zen.ping.unicast] [Staging Server] using initial hosts [], with concurrent_connects [10]
[2015-06-24 08:10:48,385][DEBUG][discovery.ec2            ] [Staging Server] using ping.timeout [3s], join.timeout [1m], master_election.filter_client [true], master_election.filter_data [false]
[2015-06-24 08:10:48,386][DEBUG][discovery.zen.fd         ] [Staging Server] [master] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2015-06-24 08:10:48,388][DEBUG][discovery.zen.fd         ] [Staging Server] [node  ] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2015-06-24 08:10:48,842][DEBUG][discovery.ec2            ] [Staging Server] using host_type [PRIVATE_IP], tags [{}], groups [[oztam-instance]] with any_group [true], availability_zones [[]]
[2015-06-24 08:10:49,094][WARN ][script                   ] [Staging Server] deprecated setting [script.disable_dynamic] is set, replace with fine-grained scripting settings (e.g. script.inline, script.indexed, script.file)
[2015-06-24 08:10:49,401][INFO ][node                     ] [Staging Server] initialized
[2015-06-24 08:10:49,401][INFO ][node                     ] [Staging Server] starting ...
[2015-06-24 08:10:49,499][INFO ][transport                ] [Staging Server] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/172.31.7.102:9300]}
[2015-06-24 08:10:49,518][INFO ][discovery                ] [Staging Server] oztam-stage-es-cluster/exwa-eVoQvKZ0S-T0-yBqA
[2015-06-24 08:10:49,519][TRACE][discovery.ec2            ] [Staging Server] starting to ping
[2015-06-24 08:10:49,965][INFO ][discovery.ec2            ] [Staging Server] Exception while retrieving instance list from AWS API: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: b68c0b0a-db57-4065-9914-1cd7eab59ed0)
[2015-06-24 08:10:49,965][DEBUG][discovery.ec2            ] [Staging Server] Full exception:
com.amazonaws.AmazonServiceException: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: b68c0b0a-db57-4065-9914-1cd7eab59ed0)
    at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1127)
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:743)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:462)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:297)
    at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:10903)
    at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5350)
    at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:110)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:313)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:219)
    at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:146)
    at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:124)
    at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:996)
    at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:360)
    at org.elasticsearch.discovery.zen.ZenDiscovery.access$6100(ZenDiscovery.java:85)
    at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1373)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
[2015-06-24 08:10:49,968][TRACE][discovery.zen.ping.unicast] [Staging Server] [1] connecting to [Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}
[2015-06-24 08:10:49,988][TRACE][discovery.zen.ping.unicast] [Staging Server] [1] connected to [Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}
[2015-06-24 08:10:49,988][TRACE][discovery.zen.ping.unicast] [Staging Server] [1] sending to [Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}
[2015-06-24 08:10:50,015][TRACE][discovery.zen.ping.unicast] [Staging Server] [1] received response from [Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}: [ping_response{node [[Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}], id[1], master [null], hasJoinedOnce [false], cluster_name[oztam-stage-es-cluster]}, ping_response{node [[Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}], id[2], master [null], hasJoinedOnce [false], cluster_name[oztam-stage-es-cluster]}]
[2015-06-24 08:10:54,985][INFO ][discovery.ec2            ] [Staging Server] Exception while retrieving instance list from AWS API: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 52cc2d15-47c9-4788-aa61-8da8a8e1df12)
[2015-06-24 08:10:54,985][DEBUG][discovery.ec2            ] [Staging Server] Full exception:
com.amazonaws.AmazonServiceException: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 52cc2d15-47c9-4788-aa61-8da8a8e1df12)
    at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1127)
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:743)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:462)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:297)
    at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:10903)
    at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5350)
    at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:110)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:313)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$2.doRun(UnicastZenPing.java:228)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
[2015-06-24 08:10:54,985][TRACE][discovery.zen.ping.unicast] [Staging Server] [1] sending to [Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}
[2015-06-24 08:10:54,988][TRACE][discovery.zen.ping.unicast] [Staging Server] [1] received response from [Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}: [ping_response{node [[Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}], id[1], master [null], hasJoinedOnce [false], cluster_name[oztam-stage-es-cluster]}, ping_response{node [[Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}], id[3], master [null], hasJoinedOnce [false], cluster_name[oztam-stage-es-cluster]}, ping_response{node [[Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}], id[4], master [null], hasJoinedOnce [false], cluster_name[oztam-stage-es-cluster]}]
[2015-06-24 08:11:00,000][INFO ][discovery.ec2            ] [Staging Server] Exception while retrieving instance list from AWS API: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 38aecbc6-9f5b-411a-a140-15b52d5193ac)
[2015-06-24 08:11:00,000][DEBUG][discovery.ec2            ] [Staging Server] Full exception:
com.amazonaws.AmazonServiceException: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 38aecbc6-9f5b-411a-a140-15b52d5193ac)
    at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1127)
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:743)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:462)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:297)
    at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:10903)
    at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5350)
    at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:110)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:313)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$2$1.doRun(UnicastZenPing.java:232)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
[2015-06-24 08:11:00,001][TRACE][discovery.zen.ping.unicast] [Staging Server] [1] sending to [Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}
[2015-06-24 08:11:00,003][TRACE][discovery.zen.ping.unicast] [Staging Server] [1] received response from [Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}: [ping_response{node [[Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}], id[1], master [null], hasJoinedOnce [false], cluster_name[oztam-stage-es-cluster]}, ping_response{node [[Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}], id[3], master [null], hasJoinedOnce [false], cluster_name[oztam-stage-es-cluster]}, ping_response{node [[Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}], id[5], master [null], hasJoinedOnce [false], cluster_name[oztam-stage-es-cluster]}, ping_response{node [[Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}], id[6], master [null], hasJoinedOnce [false], cluster_name[oztam-stage-es-cluster]}]
[2015-06-24 08:11:00,004][TRACE][discovery.ec2            ] [Staging Server] full ping responses: {none}
[2015-06-24 08:11:00,004][DEBUG][discovery.ec2            ] [Staging Server] filtered ping responses: (filter_client[true], filter_data[false]) {none}
[2015-06-24 08:11:00,012][INFO ][cluster.service          ] [Staging Server] new_master [Staging Server][exwa-eVoQvKZ0S-T0-yBqA][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}, reason: zen-disco-join (elected_as_master)
[2015-06-24 08:11:00,014][TRACE][discovery.ec2            ] [Staging Server] cluster joins counter set to [1] (elected as master)
[2015-06-24 08:11:00,058][INFO ][http                     ] [Staging Server] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.31.7.102:9200]}
[2015-06-24 08:11:00,059][INFO ][node                     ] [Staging Server] started
[2015-06-24 08:11:00,299][INFO ][gateway                  ] [Staging Server] recovered [32] indices into cluster_state
dadoonet commented 9 years ago

Are you using OpenJDK or Oracle JVM? If the former, can you install Oracle JVM and run the same test again?

jalateras commented 9 years ago

i have Oracle JVM installed.

See http://www.oracle.com/technetwork/java/javase/documentation/index.html for more details.
ubuntu@oztam-avmh-dev-01:/etc/elasticsearch$ java -version
java version "1.7.0_72"
Java(TM) SE Runtime Environment (build 1.7.0_72-b14)
Java HotSpot(TM) 64-Bit Server VM (build 24.72-b04, mixed mode)
dadoonet commented 9 years ago

I tried a test but on a different zone and did not get the same result. I'll try it again on the same zone as you did.

jalateras commented 9 years ago

Thanks for digging in... really strange if it's working for one region and not another.

dadoonet commented 9 years ago

Could you run another test for me please? Could you install again version 1.4.x and the plugin version you were using, same settings and start your node?

Then try 1.5 and 2.5.x plugin?

Thanks!

jalateras commented 9 years ago

Using es 1.4.4 ans aws-cloud 2.4.2 i get the same issue

[2015-06-25 12:50:45,090][INFO ][node                     ] [[2015-06-25 12:50:45,090][INFO ][node                     ] [Staging Server] version[1.4.4], pid[7619], build[c88f77f/2015-02-19T13:05:36Z]
[2015-06-25 12:50:45,090][INFO ][node                     ] [Staging Server] initializing ...
[2015-06-25 12:50:45,109][INFO ][plugins                  ] [Staging Server] loaded [cloud-aws], sites [head, kopf, bigdesk]
[2015-06-25 12:50:47,835][INFO ][node                     ] [Staging Server] initialized
[2015-06-25 12:50:47,835][INFO ][node                     ] [Staging Server] starting ...
[2015-06-25 12:50:47,930][INFO ][transport                ] [Staging Server] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/172.31.7.102:9300]}
[2015-06-25 12:50:47,948][INFO ][discovery                ] [Staging Server] oztam-stage-es-cluster/iqorPbdjSF-NGdYclQQWjg
[2015-06-25 12:50:48,431][INFO ][discovery.ec2            ] [Staging Server] Exception while retrieving instance list from AWS API: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: be4029a0-b075-44ab-a6aa-c71a46e23470)
[2015-06-25 12:50:53,448][INFO ][discovery.ec2            ] [Staging Server] Exception while retrieving instance list from AWS API: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 26c77a26-151c-4c32-a3b3-4508c5bf0cbb)
[2015-06-25 12:50:58,464][INFO ][discovery.ec2            ] [Staging Server] Exception while retrieving instance list from AWS API: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 77294bd5-033c-402a-a853-050badeb608e)
[2015-06-25 12:50:58,474][INFO ][cluster.service          ] [Staging Server] new_master [Staging Server][iqorPbdjSF-NGdYclQQWjg][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}, reason: zen-disco-join (elected_as_master)
[2015-06-25 12:50:58,510][INFO ][http                     ] [Staging Server] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.31.7.102:9200]}
[2015-06-25 12:50:58,510][INFO ][node                     ] [Staging Server] started
[2015-06-25 12:50:58,556][INFO ][gateway                  ] [Staging Server] recovered [16] indices into cluster_state
[2015-06-25 12:51:03,636][INFO ][cluster.metadata         ] [Staging Server] [oztam_meter_index_20150625] creating index, cause [auto(bulk api)], shards [5]/[0], mappings []
[2015-06-25 12:51:04,039][INFO ][cluster.metadata         ] [Staging Server] [oztam_meter_index_20150625] update_mapping [oztam_meter_event] (dynamic)
[2015-06-25 12:51:04,106][INFO ][cluster.metadata         ] [Staging Server] [oztam_meter_index_20150625] update_mapping [oztam_meter_event] (dynamic)
[2015-06-25 12:51:04,288][INFO ][cluster.metadata         ] [Staging Server] [oztam_meter_index_20150625] update_mapping [oztam_meter_event] (dynamic)
[2015-06-25 12:51:34,297][INFO ][cluster.metadata         ] [Staging Server] [oztam_meter_index_20150625] update_mapping [oztam_meter_event] (dynamic)
jalateras commented 9 years ago

Same with es 1.5.2 and 2.5.1 plugin

[2015-06-25 12:56:48,143][INFO ][node                     ] [Staging Server] version[1.5.2], pid[8074], build[62ff986/2015-04-27T09:21:06Z]
[2015-06-25 12:56:48,143][INFO ][node                     ] [Staging Server] initializing ...
[2015-06-25 12:56:48,161][INFO ][plugins                  ] [Staging Server] loaded [cloud-aws], sites [head, kopf, bigdesk]
[2015-06-25 12:56:50,981][INFO ][node                     ] [Staging Server] initialized
[2015-06-25 12:56:50,982][INFO ][node                     ] [Staging Server] starting ...
[2015-06-25 12:56:51,100][INFO ][transport                ] [Staging Server] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/172.31.7.102:9300]}
[2015-06-25 12:56:51,117][INFO ][discovery                ] [Staging Server] oztam-stage-es-cluster/mgF_EHQyTOi1nIIpwvI06g
[2015-06-25 12:56:51,558][INFO ][discovery.ec2            ] [Staging Server] Exception while retrieving instance list from AWS API: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 08961cac-c546-4ebd-8341-53270308a918)
[2015-06-25 12:56:56,580][INFO ][discovery.ec2            ] [Staging Server] Exception while retrieving instance list from AWS API: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: b1044b4f-d802-4e5b-9a19-f0231c84feab)
[2015-06-25 12:57:01,594][INFO ][discovery.ec2            ] [Staging Server] Exception while retrieving instance list from AWS API: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 5269da81-b466-433c-958f-4459655fb308)
[2015-06-25 12:57:01,604][INFO ][cluster.service          ] [Staging Server] new_master [Staging Server][mgF_EHQyTOi1nIIpwvI06g][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}, reason: zen-disco-join (elected_as_master)
[2015-06-25 12:57:01,657][INFO ][http                     ] [Staging Server] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.31.7.102:9200]}
[2015-06-25 12:57:01,657][INFO ][node                     ] [Staging Server] started
[2015-06-25 12:57:02,312][INFO ][gateway                  ] [Staging Server] recovered [32] indices into cluster_state
dadoonet commented 9 years ago

Interesting.

Could you run a last test? With 2.4.1?

I want to make sure it was caused by a change in 2.4.2. We updated AWS SDK for example.

jalateras commented 9 years ago

Yep that works

[2015-06-25 14:12:15,389][INFO ][node                     ] [Staging Server] stopping ...
[2015-06-25 14:12:15,564][INFO ][node                     ] [Staging Server] stopped
[2015-06-25 14:12:15,565][INFO ][node                     ] [Staging Server] closing ...
[2015-06-25 14:12:15,656][INFO ][node                     ] [Staging Server] closed
[2015-06-25 14:14:17,210][INFO ][node                     ] [Staging Server] version[1.4.4], pid[9901], build[c88f77f/2015-02-19T13:05:36Z]
[2015-06-25 14:14:17,210][INFO ][node                     ] [Staging Server] initializing ...
[2015-06-25 14:14:17,226][INFO ][plugins                  ] [Staging Server] loaded [cloud-aws], sites [head, kopf, bigdesk]
[2015-06-25 14:14:19,944][INFO ][node                     ] [Staging Server] initialized
[2015-06-25 14:14:19,944][INFO ][node                     ] [Staging Server] starting ...
[2015-06-25 14:14:20,043][INFO ][transport                ] [Staging Server] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/172.31.7.102:9300]}
[2015-06-25 14:14:20,060][INFO ][discovery                ] [Staging Server] oztam-stage-es-cluster/RRHp33DVQ0WbKSZLFsfBsg
[2015-06-25 14:14:31,069][INFO ][cluster.service          ] [Staging Server] new_master [Staging Server][RRHp33DVQ0WbKSZLFsfBsg][oztam-avmh-dev-01][inet[/172.31.7.102:9300]]{aws_availability_zone=ap-southeast-2b, master=true}, reason: zen-disco-join (elected_as_master)
[2015-06-25 14:14:31,103][INFO ][http                     ] [Staging Server] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.31.7.102:9200]}
[2015-06-25 14:14:31,103][INFO ][node                     ] [Staging Server] started
[2015-06-25 14:14:31,152][INFO ][gateway                  ] [Staging Server] recovered [16] indices into cluster_state
[2015-06-25 14:14:34,060][INFO ][cluster.metadata         ] [Staging Server] [oztam_meter_index_20150625] creating index, cause [auto(bulk api)], shards [5]/[0], mappings []
[2015-06-25 14:14:34,459][INFO ][cluster.metadata         ] [Staging Server] [oztam_meter_index_20150625] update_mapping [oztam_meter_event] (dynamic)
[2015-06-25 14:14:34,529][INFO ][cluster.metadata         ] [Staging Server] [oztam_meter_index_20150625] update_mapping [oztam_meter_event] (dynamic)
dadoonet commented 9 years ago

Ok. Super strange.

I ran a test:

elasticsearch.yml is:

discovery.type: ec2
cloud.aws.access_key: MY_KEY
cloud.aws.secret_key: MY_SECRET
cloud.aws.region: ap-southeast-2

Then run bin/elasticsearch:

[2015-06-25 10:02:49,767][INFO ][node                     ] [Molecule Man] version[1.6.0], pid[7057], build[cdd3ac4/2015-06-09T13:36:34Z]
[2015-06-25 10:02:49,767][INFO ][node                     ] [Molecule Man] initializing ...
[2015-06-25 10:02:49,786][INFO ][plugins                  ] [Molecule Man] loaded [cloud-aws], sites []
[2015-06-25 10:02:49,825][INFO ][env                      ] [Molecule Man] using [1] data paths, mounts [[/ (rootfs)]], net usable_space [6.2gb], net total_space [7.7gb], types [rootfs]
[2015-06-25 10:02:52,158][DEBUG][discovery.zen.elect      ] [Molecule Man] using minimum_master_nodes [-1]
[2015-06-25 10:02:52,160][DEBUG][discovery.zen.ping.multicast] [Molecule Man] using group [224.2.2.4], with port [54328], ttl [3], and address [null]
[2015-06-25 10:02:52,164][DEBUG][discovery.zen.ping.unicast] [Molecule Man] using initial hosts [], with concurrent_connects [10]
[2015-06-25 10:02:52,165][DEBUG][discovery.ec2            ] [Molecule Man] using ping.timeout [3s], join.timeout [1m], master_election.filter_client [true], master_election.filter_data [false]
[2015-06-25 10:02:52,166][DEBUG][discovery.zen.fd         ] [Molecule Man] [master] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2015-06-25 10:02:52,168][DEBUG][discovery.zen.fd         ] [Molecule Man] [node  ] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2015-06-25 10:02:52,786][DEBUG][discovery.ec2            ] [Molecule Man] using host_type [PRIVATE_IP], tags [{}], groups [[]] with any_group [true], availability_zones [[]]
[2015-06-25 10:02:53,578][INFO ][node                     ] [Molecule Man] initialized
[2015-06-25 10:02:53,578][INFO ][node                     ] [Molecule Man] starting ...
[2015-06-25 10:02:53,659][INFO ][transport                ] [Molecule Man] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/172.30.0.176:9300]}
[2015-06-25 10:02:53,670][INFO ][discovery                ] [Molecule Man] elasticsearch/95daKWdURB6AqsdX8CrRXg
[2015-06-25 10:02:53,671][TRACE][discovery.ec2            ] [Molecule Man] starting to ping
[2015-06-25 10:02:54,717][TRACE][discovery.ec2            ] [Molecule Man] building dynamic unicast discovery nodes...
[2015-06-25 10:02:54,766][TRACE][discovery.ec2            ] [Molecule Man] adding i-0307f2dd, address 172.30.0.176, transport_address inet[/172.30.0.176:9300]
[2015-06-25 10:02:54,782][DEBUG][discovery.ec2            ] [Molecule Man] using dynamic discovery nodes [[#cloud-i-0307f2dd-0][ip-172-30-0-176][inet[/172.30.0.176:9300]]]

So EC2 API calls worked correctly on my end.

Then I changed config/logging.yml file to get more details about what is happening:

# you can override ths using by setting a system property, for example -Des.logger.level=DEBUG
es.logger.level: WARN
rootLogger: ${es.logger.level}, console, file
logger:
  com.amazonaws: DEBUG
  org.apache.http: INFO
  discovery: TRACE

Could you do the same and run again with 1.6.0 / 2.6.0 so we can compare our traces?

Here are mine:

[2015-06-25 10:10:19,750][TRACE][discovery.ec2            ] [Dredmund Druid] starting to ping
[2015-06-25 10:10:19,792][DEBUG][com.amazonaws.metrics.AwsSdkMetrics] Admin mbean registered under com.amazonaws.management:type=AwsSdkMetrics
[2015-06-25 10:10:19,798][DEBUG][com.amazonaws.auth.AWSCredentialsProviderChain] Loading credentials from com.amazonaws.internal.StaticCredentialsProvider@2719b944
[2015-06-25 10:10:19,804][DEBUG][com.amazonaws.request    ] Sending Request: POST https://ec2.ap-southeast-2.amazonaws.com / Parameters: (Action: DescribeInstances, Version: 2015-03-01, Filter.1.Name: instance-state-name, Filter.1.Value.1: running, Filter.1.Value.2: pending, ) Headers: (User-Agent: aws-sdk-java/1.9.34 Linux/3.14.35-28.38.amzn1.x86_64 OpenJDK_64-Bit_Server_VM/24.79-b02/1.7.0_79, ) 
[2015-06-25 10:10:19,877][DEBUG][com.amazonaws.auth.AWS4Signer] AWS4 Canonical Request: '"POST
/

host:ec2.ap-southeast-2.amazonaws.com
user-agent:aws-sdk-java/1.9.34 Linux/3.14.35-28.38.amzn1.x86_64 OpenJDK_64-Bit_Server_VM/24.79-b02/1.7.0_79
x-amz-date:20150625T101019Z

host;user-agent;x-amz-date
xxxxxxxxxxxxxxxxx"
[2015-06-25 10:10:19,878][DEBUG][com.amazonaws.auth.AWS4Signer] AWS4 String to Sign: '"AWS4-HMAC-SHA256
20150625T101019Z
20150625/ap-southeast-2/ec2/aws4_request
xxxxxxxxxxxx"
[2015-06-25 10:10:19,886][DEBUG][com.amazonaws.auth.AWS4Signer] Generating a new signing key as the signing key not available in the cache for the date 1435190400000
[2015-06-25 10:10:20,038][DEBUG][com.amazonaws.http.conn.ssl.SdkTLSSocketFactory] socket.getSupportedProtocols(): [SSLv2Hello, SSLv3, TLSv1, TLSv1.1, TLSv1.2], socket.getEnabledProtocols(): [TLSv1]
[2015-06-25 10:10:20,039][DEBUG][com.amazonaws.http.conn.ssl.SdkTLSSocketFactory] TLS protocol enabled for SSL handshake: [TLSv1.2, TLSv1.1, TLSv1]
[2015-06-25 10:10:20,042][DEBUG][com.amazonaws.http.conn.ssl.SdkTLSSocketFactory] connecting to ec2.ap-southeast-2.amazonaws.com/54.240.193.78:443
[2015-06-25 10:10:20,444][DEBUG][com.amazonaws.internal.SdkSSLSocket] created: ec2.ap-southeast-2.amazonaws.com/54.240.193.78:443
[2015-06-25 10:10:20,467][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] Attempt 1 to execute request
[2015-06-25 10:10:20,543][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] Connection can be kept alive indefinitely
[2015-06-25 10:10:20,560][INFO ][com.amazonaws.http.AmazonHttpClient] x-amzn-RequestId: not available
[2015-06-25 10:10:20,653][DEBUG][com.amazonaws.request    ] Received successful response: 200, AWS Request ID: xxxxxxxxxxxx
[2015-06-25 10:10:20,654][INFO ][com.amazonaws.http.AmazonHttpClient] AWS Request ID: xxxxxxxxxxxx

Thanks!

cowboyrushforth commented 9 years ago

Hi,

Upon attempting an upgrade from 1.3.1 to 1.6.0, and upgrading the AWS cloud plugin to 2.6.0, I have run into the exact errors. I have followed the thread to this point and figured I would try the above logging output request. Here it is:

[2015-06-30 22:58:39,077][DEBUG][discovery.zen.elect      ] [bg-prod-elasticsearch-6] using minimum_master_nodes [1]
[2015-06-30 22:58:39,082][DEBUG][discovery.zen.ping.unicast] [bg-prod-elasticsearch-6] using initial hosts [], with concurrent_connects [10]  
[2015-06-30 22:58:39,083][DEBUG][discovery.ec2            ] [bg-prod-elasticsearch-6] using ping.timeout [3s], join.timeout [1m], master_election.filter_client [true], master_election.filter_data [false]
[2015-06-30 22:58:39,085][DEBUG][discovery.zen.fd         ] [bg-prod-elasticsearch-6] [master] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]  
[2015-06-30 22:58:39,087][DEBUG][discovery.zen.fd         ] [bg-prod-elasticsearch-6] [node  ] uses ping_interval [1s], ping_timeout [30s], ping_retries [3]
[2015-06-30 22:58:39,444][DEBUG][com.amazonaws.AmazonWebServiceClient] Internal logging succesfully configured to commons logger: true
[2015-06-30 22:58:39,820][DEBUG][discovery.ec2            ] [bg-prod-elasticsearch-6] using host_type [PRIVATE_IP], tags [{}], groups [[BG production elasticsearch]] with any_group [true], availability_zones [[]]
[2015-06-30 22:58:40,181][WARN ][script                   ] [bg-prod-elasticsearch-6] deprecated setting [script.disable_dynamic] is set, replace with fine-grained scripting settings (e.g. script.inline, script.indexed, script.file)  
[2015-06-30 22:58:40,631][INFO ][discovery                ] [bg-prod-elasticsearch-6] bg-prod-cluster/NU46ZB6NSvqnDRkvTzgq6w
[2015-06-30 22:58:40,633][TRACE][discovery.ec2            ] [bg-prod-elasticsearch-6] starting to ping
[2015-06-30 22:58:40,686][DEBUG][com.amazonaws.metrics.AwsSdkMetrics] Admin mbean registered under com.amazonaws.management:type=AwsSdkMetrics
[2015-06-30 22:58:40,692][DEBUG][com.amazonaws.auth.AWSCredentialsProviderChain] Unable to load credentials from EnvironmentVariableCredentialsProvider: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
[2015-06-30 22:58:40,693][DEBUG][com.amazonaws.auth.AWSCredentialsProviderChain] Unable to load credentials from SystemPropertiesCredentialsProvider: Unable to load AWS credentials from Java system properties (aws.accessKeyId and aws.secretKey)
[2015-06-30 22:58:40,694][DEBUG][com.amazonaws.internal.EC2MetadataClient] Connecting to EC2 instance metadata service at URL: http://169.254.169.254/latest/meta-data/iam/security-credentials/  
[2015-06-30 22:58:40,696][DEBUG][com.amazonaws.internal.EC2MetadataClient] Connecting to EC2 instance metadata service at URL: http://169.254.169.254/latest/meta-data/iam/security-credentials/bg_prod_elasticsearch  
[2015-06-30 22:58:40,778][DEBUG][com.amazonaws.auth.AWSCredentialsProviderChain] Loading credentials from InstanceProfileCredentialsProvider
[2015-06-30 22:58:40,788][DEBUG][com.amazonaws.request    ] Sending Request: POST https://ec2.us-west-1.amazonaws.com / Parameters: (Action: DescribeInstances, Version: 2015-03-01, Filter.1.Name: instance-state-name, Filter.1.Value.1: running, Filter.1.Value.2: pending, ) Headers: (User-Agent: aws-sdk-java/1.9.34 Linux/3.13.0-34-generic OpenJDK_64-Bit_Server_VM/24.65-b04/1.7.0_65, )
[2015-06-30 22:58:40,795][DEBUG][com.amazonaws.auth.AWS4Signer] AWS4 Canonical Request: '"POST
/

host:ec2.us-west-1.amazonaws.com
user-agent:aws-sdk-java/1.9.34 Linux/3.13.0-34-generic OpenJDK_64-Bit_Server_VM/24.65-b04/1.7.0_65
x-amz-date:20150630T225840Z
x-amz-security-token:AQoDYXdzEID//////////wEa4APrHVA50Yl2ogMlFP9ljZMFPA09O/waKMPbw/6/FEnfoL4SeCU4pIQK53vCVVO6Y7peRA/zROtD/gQcAh/aStTsIRG6odMPD9ZePlTuBb+AuL6XRxZbyveMa1lvNvLrKlhr/fZUbmf/HJKPwlJ2YqmU/yAfh+1ic1NurXCP2fsNn24EQgFrOKs6v8q4BDb2p2MKC9Kf8fHagybRvUUjz+3uva4Bs9jhgpDAAqlUVvYGF4kFi9HTDS1SpISINIu6OEMWl74OktRERUJMSD2IjOTWgYewCtGOVcKKGhfLxuoT4unBudkCsu9Wa65BLz6w2HcUEWAz9HbyXFSqx19uQsW3vPhJRSrQ1nG5UvHh1V7RQOuVK7HErCxvBYyD4wNMYdZmSZfuLuBuud7eMZ+w2lO5HhihPTUCC9RDhaTfhJTBHPnqlMtg7BGJgVyOLGQ8Xz5nFv3Rim2bF906loqFPqUcopWJlUnYlKHl42uAM2DR8wkyRP9j3J/GqnhVN1UR0uyjQ5FBqpjOu4BImDqiervXL9YNMoee6wik0yl74dc/ez5gHdFz3qmAcJK39nv5ii6LlopBY23+vbBAXwuTVj4sY5vxqrVg1retVQuNh+OMHNu7JqrFFcziAUxd1LMEWF8gxrHMrAU=

host;user-agent;x-amz-date;x-amz-security-token
9edf23cd1e60c05a5e520af7efea39ce7d9fa755ef3de8e4fee08134af02b65d"
[2015-06-30 22:58:40,797][DEBUG][com.amazonaws.auth.AWS4Signer] AWS4 String to Sign: '"AWS4-HMAC-SHA256
20150630T225840Z
20150630/us-west-1/ec2/aws4_request
11e49e611da4e207be6f2ff9c2b322c2ac6c3efb92eedc841a019a6e25fe11eb"
[2015-06-30 22:58:40,797][DEBUG][com.amazonaws.auth.AWS4Signer] Generating a new signing key as the signing key not available in the cache for the date 1435622400000  
[2015-06-30 22:58:40,964][DEBUG][com.amazonaws.http.conn.ssl.SdkTLSSocketFactory] socket.getSupportedProtocols(): [SSLv2Hello, SSLv3, TLSv1, TLSv1.1, TLSv1.2], socket.getEnabledProtocols(): [SSLv3, TLSv1]  
[2015-06-30 22:58:40,965][DEBUG][com.amazonaws.http.conn.ssl.SdkTLSSocketFactory] TLS protocol enabled for SSL handshake: [TLSv1.2, TLSv1.1, TLSv1, SSLv3]
[2015-06-30 22:58:40,966][DEBUG][com.amazonaws.http.conn.ssl.SdkTLSSocketFactory] connecting to ec2.us-west-1.amazonaws.com/204.246.163.231:443
[2015-06-30 22:58:41,243][DEBUG][com.amazonaws.internal.SdkSSLSocket] created: ec2.us-west-1.amazonaws.com/204.246.163.231:443  
[2015-06-30 22:58:41,268][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] Attempt 1 to execute request
[2015-06-30 22:58:41,284][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] Connection can be kept alive indefinitely
[2015-06-30 22:58:41,284][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] Authentication required
[2015-06-30 22:58:41,285][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] ec2.us-west-1.amazonaws.com:443 requested authentication
[2015-06-30 22:58:41,285][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] Response contains no authentication challenges
[2015-06-30 22:58:41,292][INFO ][com.amazonaws.http.AmazonHttpClient] x-amzn-RequestId: not available
[2015-06-30 22:58:41,363][DEBUG][com.amazonaws.request    ] Received error response: com.amazonaws.AmazonServiceException: AWS was not able to validate the provided access credentials (Service: null; Status Code: 401; Error Code: AuthFailure; Request ID: 0c3d874f-81be-4971-9c30-7ed57b7793d1)
[2015-06-30 22:58:41,363][INFO ][discovery.ec2            ] [bg-prod-elasticsearch-6] Exception while retrieving instance list from AWS API: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 0c3d874f-81be-4971-9c30-7ed57b7793d1)  
[2015-06-30 22:58:41,364][DEBUG][discovery.ec2            ] [bg-prod-elasticsearch-6] Full exception:
com.amazonaws.AmazonServiceException: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 0c3d874f-81be-4971-9c30-7ed57b7793d1)
        at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1127)
        at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:743)
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:462)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:297)
        at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:10903)
        at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5350)
        at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:110)
        at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:313)
        at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:219)
        at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:146)
        at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:124)
        at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:996)
        at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:360)
        at org.elasticsearch.discovery.zen.ZenDiscovery.access$6100(ZenDiscovery.java:85)
        at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1373)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
[2015-06-30 22:58:41,367][TRACE][discovery.zen.ping.unicast] [bg-prod-elasticsearch-6] [1] connecting to [bg-prod-elasticsearch-6][NU46ZB6NSvqnDRkvTzgq6w][ip-10-9-1-30.us-west-1.compute.internal][inet[/10.9.1.30:9300]]{aws_availability_zone=us-west-1a, max_local_storage_nodes=1}
[2015-06-30 22:58:41,398][TRACE][discovery.zen.ping.unicast] [bg-prod-elasticsearch-6] [1] connected to [bg-prod-elasticsearch-6][NU46ZB6NSvqnDRkvTzgq6w][ip-10-9-1-30.us-west-1.compute.internal][inet[/10.9.1.30:9300]]{aws_availability_zone=us-west-1a, max_local_storage_nodes=1}
[2015-06-30 22:58:41,398][TRACE][discovery.zen.ping.unicast] [bg-prod-elasticsearch-6] [1] sending to [bg-prod-elasticsearch-6][NU46ZB6NSvqnDRkvTzgq6w][ip-10-9-1-30.us-west-1.compute.internal][inet[/10.9.1.30:9300]]{aws_availability_zone=us-west-1a, max_local_storage_nodes=1}
[2015-06-30 22:58:41,426][TRACE][discovery.zen.ping.unicast] [bg-prod-elasticsearch-6] [1] received response from [bg-prod-elasticsearch-6][NU46ZB6NSvqnDRkvTzgq6w][ip-10-9-1-30.us-west-1.compute.internal][inet[/10.9.1.30:9300]]{aws_availability_zone=us-west-1a, max_local_storage_nodes=1}: [ping_response{node [[bg-prod-elasticsearch-6][NU46ZB6NSvqnDRkvTzgq6w][ip-10-9-1-30.us-west-1.compute.internal][inet[/10.9.1.30:9300]]{aws_availability_zone=us-west-1a, max_local_storage_nodes=1}], id[1], master [null], hasJoinedOnce [false], cluster_name[bg-prod-cluster]}, ping_response{node [[bg-prod-elasticsearch-6][NU46ZB6NSvqnDRkvTzgq6w][ip-10-9-1-30.us-west-1.compute.internal][inet[/10.9.1.30:9300]]{aws_availability_zone=us-west-1a, max_local_storage_nodes=1}], id[2], master [null], hasJoinedOnce [false], cluster_name[bg-prod-cluster]}]  
[2015-06-30 22:58:42,872][DEBUG][com.amazonaws.request    ] Sending Request: POST https://ec2.us-west-1.amazonaws.com / Parameters: (Action: DescribeInstances, Version: 2015-03-01, Filter.1.Name: instance-state-name, Filter.1.Value.1: running, Filter.1.Value.2: pending, ) Headers: (User-Agent: aws-sdk-java/1.9.34 Linux/3.13.0-34-generic OpenJDK_64-Bit_Server_VM/24.65-b04/1.7.0_65, )  
[2015-06-30 22:58:42,873][DEBUG][com.amazonaws.auth.AWS4Signer] AWS4 Canonical Request: '"POST
/

host:ec2.us-west-1.amazonaws.com
user-agent:aws-sdk-java/1.9.34 Linux/3.13.0-34-generic OpenJDK_64-Bit_Server_VM/24.65-b04/1.7.0_65
x-amz-date:20150630T225842Z
x-amz-security-token:AQoDYXdzEID//////////wEa4APrHVA50Yl2ogMlFP9ljZMFPA09O/waKMPbw/6/FEnfoL4SeCU4pIQK53vCVVO6Y7peRA/zROtD/gQcAh/aStTsIRG6odMPD9ZePlTuBb+AuL6XRxZbyveMa1lvNvLrKlhr/fZUbmf/HJKPwlJ2YqmU/yAfh+1ic1NurXCP2fsNn24EQgFrOKs6v8q4BDb2p2MKC9Kf8fHagybRvUUjz+3uva4Bs9jhgpDAAqlUVvYGF4kFi9HTDS1SpISINIu6OEMWl74OktRERUJMSD2IjOTWgYewCtGOVcKKGhfLxuoT4unBudkCsu9Wa65BLz6w2HcUEWAz9HbyXFSqx19uQsW3vPhJRSrQ1nG5UvHh1V7RQOuVK7HErCxvBYyD4wNMYdZmSZfuLuBuud7eMZ+w2lO5HhihPTUCC9RDhaTfhJTBHPnqlMtg7BGJgVyOLGQ8Xz5nFv3Rim2bF906loqFPqUcopWJlUnYlKHl42uAM2DR8wkyRP9j3J/GqnhVN1UR0uyjQ5FBqpjOu4BImDqiervXL9YNMoee6wik0yl74dc/ez5gHdFz3qmAcJK39nv5ii6LlopBY23+vbBAXwuTVj4sY5vxqrVg1retVQuNh+OMHNu7JqrFFcziAUxd1LMEWF8gxrHMrAU=

host;user-agent;x-amz-date;x-amz-security-token
9edf23cd1e60c05a5e520af7efea39ce7d9fa755ef3de8e4fee08134af02b65d"
[2015-06-30 22:58:42,875][DEBUG][com.amazonaws.auth.AWS4Signer] AWS4 String to Sign: '"AWS4-HMAC-SHA256
20150630T225842Z
20150630/us-west-1/ec2/aws4_request
34980b7237e4cdbfe29e53ecc1dc6936ea78aff9f47ef7b7e22d61624d6cb760"
[2015-06-30 22:58:42,877][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] Stale connection check
[2015-06-30 22:58:42,880][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] Attempt 1 to execute request
[2015-06-30 22:58:42,887][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] Connection can be kept alive indefinitely
[2015-06-30 22:58:42,888][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] Authentication required
[2015-06-30 22:58:42,888][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] ec2.us-west-1.amazonaws.com:443 requested authentication
[2015-06-30 22:58:42,888][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] Response contains no authentication challenges
[2015-06-30 22:58:42,888][INFO ][com.amazonaws.http.AmazonHttpClient] x-amzn-RequestId: not available
[2015-06-30 22:58:42,893][DEBUG][com.amazonaws.request    ] Received error response: com.amazonaws.AmazonServiceException: AWS was not able to validate the provided access credentials (Service: null; Status Code: 401; Error Code: AuthFailure; Request ID: f9176931-528d-444c-b713-7d93c8698c0f)
[2015-06-30 22:58:42,893][INFO ][discovery.ec2            ] [bg-prod-elasticsearch-6] Exception while retrieving instance list from AWS API: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: f9176931-528d-444c-b713-7d93c8698c0f)  
[2015-06-30 22:58:42,893][DEBUG][discovery.ec2            ] [bg-prod-elasticsearch-6] Full exception:
com.amazonaws.AmazonServiceException: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: f9176931-528d-444c-b713-7d93c8698c0f)
        at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1127)
        at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:743)
        at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:462)
        at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:297)
        at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:10903)
        at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5350)
        at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:110)
        at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:313)
        at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$2.doRun(UnicastZenPing.java:228)
        at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

This repeats one more time in the logs. What is interesting is that after these requests fail, it makes another request to the s3 api (with the same IAM based credentials) and it works fine:

[2015-06-30 22:58:44,833][DEBUG][com.amazonaws.internal.EC2MetadataClient] Connecting to EC2 instance metadata service at URL: http://169.254.169.254/latest/meta-data/iam/security-credentials/
[2015-06-30 22:58:44,834][DEBUG][com.amazonaws.internal.EC2MetadataClient] Connecting to EC2 instance metadata service at URL: http://169.254.169.254/latest/meta-data/iam/security-credentials/bg_prod_elasticsearch
[2015-06-30 22:58:44,837][DEBUG][com.amazonaws.auth.AWSCredentialsProviderChain] Loading credentials from InstanceProfileCredentialsProvider
[2015-06-30 22:58:44,837][DEBUG][com.amazonaws.request    ] Sending Request: HEAD https://bg-elasticsearch-backup.s3-us-west-1.amazonaws.com / Headers: (User-Agent: aws-sdk-java/1.9.34 Linux/3.13.0-34-generic OpenJDK_64-Bit_Server_VM/24.65-b04/1.7.0_65, Content-Type: application/x-www-form-urlencoded; charset=utf-8, )
[2015-06-30 22:58:44,874][DEBUG][com.amazonaws.services.s3.internal.S3Signer] Calculated string to sign:
"HEAD

application/x-www-form-urlencoded; charset=utf-8
Tue, 30 Jun 2015 22:58:44 GMT
x-amz-security-token:AQoDYXdzEID//////////wEa4APrHVA50Yl2ogMlFP9ljZMFPA09O/waKMPbw/6/FEnfoL4SeCU4pIQK53vCVVO6Y7peRA/zROtD/gQcAh/aStTsIRG6odMPD9ZePlTuBb+AuL6XRxZbyveMa1lvNvLrKlhr/fZUbmf/HJKPwlJ2YqmU/yAfh+1ic1NurXCP2fsNn24EQgFrOKs6v8q4BDb2p2MKC9Kf8fHagybRvUUjz+3uva4Bs9jhgpDAAqlUVvYGF4kFi9HTDS1SpISINIu6OEMWl74OktRERUJMSD2IjOTWgYewCtGOVcKKGhfLxuoT4unBudkCsu9Wa65BLz6w2HcUEWAz9HbyXFSqx19uQsW3vPhJRSrQ1nG5UvHh1V7RQOuVK7HErCxvBYyD4wNMYdZmSZfuLuBuud7eMZ+w2lO5HhihPTUCC9RDhaTfhJTBHPnqlMtg7BGJgVyOLGQ8Xz5nFv3Rim2bF906loqFPqUcopWJlUnYlKHl42uAM2DR8wkyRP9j3J/GqnhVN1UR0uyjQ5FBqpjOu4BImDqiervXL9YNMoee6wik0yl74dc/ez5gHdFz3qmAcJK39nv5ii6LlopBY23+vbBAXwuTVj4sY5vxqrVg1retVQuNh+OMHNu7JqrFFcziAUxd1LMEWF8gxrHMrAU=  
/bg-elasticsearch-backup/"
[2015-06-30 22:58:44,898][DEBUG][com.amazonaws.http.conn.ssl.SdkTLSSocketFactory] socket.getSupportedProtocols(): [SSLv2Hello, SSLv3, TLSv1, TLSv1.1, TLSv1.2], socket.getEnabledProtocols(): [SSLv3, TLSv1]
[2015-06-30 22:58:44,907][DEBUG][com.amazonaws.http.conn.ssl.SdkTLSSocketFactory] TLS protocol enabled for SSL handshake: [TLSv1.2, TLSv1.1, TLSv1, SSLv3]
[2015-06-30 22:58:44,907][DEBUG][com.amazonaws.http.conn.ssl.SdkTLSSocketFactory] connecting to bg-elasticsearch-backup.s3-us-west-1.amazonaws.com/54.231.237.0:443  
[2015-06-30 22:58:45,036][DEBUG][com.amazonaws.internal.SdkSSLSocket] created: bg-elasticsearch-backup.s3-us-west-1.amazonaws.com/54.231.237.0:443
[2015-06-30 22:58:45,036][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] Attempt 1 to execute request
[2015-06-30 22:58:45,061][DEBUG][com.amazonaws.http.impl.client.SdkHttpClient] Connection can be kept alive indefinitely
[2015-06-30 22:58:45,061][INFO ][com.amazonaws.http.AmazonHttpClient] x-amzn-RequestId: not available
[2015-06-30 22:58:45,062][DEBUG][com.amazonaws.request    ] Received successful response: 200, AWS Request ID: 1E9B537BBEE55E70
[2015-06-30 22:58:45,062][INFO ][com.amazonaws.http.AmazonHttpClient] AWS Request ID: 1E9B537BBEE55E70

In my case I have double and triple checked the IAM role and it permissions, but nothing has worked. These same IAM permissions have been in use from 1.x thru 1.3.1 working good also.

Happy to help provide more info. Cheers

dadoonet commented 9 years ago

@cowboyrushforth Could you try to upgrade your openJDK version from 1.7.0_65 to 1.7.0_79?

rangagopalan commented 9 years ago

I'm seeing the same error - On startup there is an error message indicating that the cloud-aws plugin was not able to validate the provided credentials - verified that I can use the same access key/secret key combination to describe instances etc. via the Ec2 command line tools

This is happening after I upgraded ES from 1.4.2 to 1.7.1 and installed the latest cloud plugin version 2.7.0 (not sure which version of the cloud-aws plugin I had earlier) - This is a simple test cluster with just one node started now

OS: Ubuntu 14.04, JDK: Oracle build 1.7.0_80-b15

Here are my AWS key permissions:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "Stmt1390425084002",
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:ListBucket",
                "s3:GetObjectVersion",
                "s3:PutObject"
            ],
            "Resource": [
                "arn:aws:s3:::confer-staging-es",
                "arn:aws:s3:::confer-staging-es/*"
            ]
        },
        {
            "Sid": "Stmt1410114664652",
            "Action": [
                "ec2:DescribeAvailabilityZones",
                "ec2:DescribeInstanceAttribute",
                "ec2:DescribeInstances",
                "ec2:DescribeRegions",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeTags"
            ],
            "Effect": "Allow",
            "Resource": "*"
        }
    ]
}

Error logs:

[2015-08-05 18:57:38,882][INFO ][discovery.ec2            ] [es2] Exception while retrieving instance list from AWS API: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 0c07c8e9-a558-4a09-a743-485369c9622d)
[2015-08-05 18:57:38,882][DEBUG][discovery.ec2            ] [es2] Full exception:
com.amazonaws.AmazonServiceException: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 0c07c8e9-a558-4a09-a743-485369c9622d)
    at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1127)
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:743)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:462)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:297)
    at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:10903)
    at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5350)
    at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:110)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:313)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$2.doRun(UnicastZenPing.java:228)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
[2015-08-05 18:57:38,883][TRACE][discovery.zen.ping.unicast] [es2] [1] sending to [es2][CoSg7K2iSmat9l0zdu7jjw][ip-10-225-165-53][inet[/10.225.165.53:9300]]{zone_id=useast1d}
[2015-08-05 18:57:38,886][TRACE][discovery.zen.ping.unicast] [es2] [1] received response from [es2][CoSg7K2iSmat9l0zdu7jjw][ip-10-225-165-53][inet[/10.225.165.53:9300]]{zone_id=useast1d}: [ping_response{node [[es2][CoSg7K2iSmat9l0zdu7jjw][ip-10-225-165-53][inet[/10.225.165.53:9300]]{zone_id=useast1d}], id[1], master [null], hasJoinedOnce [false], cluster_name[test_cluster]}, ping_response{node [[es2][CoSg7K2iSmat9l0zdu7jjw][ip-10-225-165-53][inet[/10.225.165.53:9300]]{zone_id=useast1d}], id[3], master [null], hasJoinedOnce [false], cluster_name[test_cluster]}, ping_response{node [[es2][CoSg7K2iSmat9l0zdu7jjw][ip-10-225-165-53][inet[/10.225.165.53:9300]]{zone_id=useast1d}], id[4], master [null], hasJoinedOnce [false], cluster_name[test_cluster]}]
[2015-08-05 18:57:40,403][INFO ][discovery.ec2            ] [es2] Exception while retrieving instance list from AWS API: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 95cc1bf2-e046-4581-bf93-3bf757855d37)
[2015-08-05 18:57:40,403][DEBUG][discovery.ec2            ] [es2] Full exception:
com.amazonaws.AmazonServiceException: AWS was not able to validate the provided access credentials (Service: AmazonEC2; Status Code: 401; Error Code: AuthFailure; Request ID: 95cc1bf2-e046-4581-bf93-3bf757855d37)
    at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1127)
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:743)
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:462)
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:297)
    at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:10903)
    at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5350)
    at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:110)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:313)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$2$1.doRun(UnicastZenPing.java:232)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:36)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
[2015-08-05 18:57:40,404][TRACE][discovery.zen.ping.unicast] [es2] [1] sending to [es2][CoSg7K2iSmat9l0zdu7jjw][ip-10-225-165-53][inet[/10.225.165.53:9300]]{zone_id=useast1d}
[2015-08-05 18:57:40,405][TRACE][discovery.zen.ping.unicast] [es2] [1] received response from [es2][CoSg7K2iSmat9l0zdu7jjw][ip-10-225-165-53][inet[/10.225.165.53:9300]]{zone_id=useast1d}: [ping_response{node [[es2][CoSg7K2iSmat9l0zdu7jjw][ip-10-225-165-53][inet[/10.225.165.53:9300]]{zone_id=useast1d}], id[1], master [null], hasJoinedOnce [false], cluster_name[test_cluster]}, ping_response{node [[es2][CoSg7K2iSmat9l0zdu7jjw][ip-10-225-165-53][inet[/10.225.165.53:9300]]{zone_id=useast1d}], id[3], master [null], hasJoinedOnce [false], cluster_name[test_cluster]}, ping_response{node [[es2][CoSg7K2iSmat9l0zdu7jjw][ip-10-225-165-53][inet[/10.225.165.53:9300]]{zone_id=useast1d}], id[5], master [null], hasJoinedOnce [false], cluster_name[test_cluster]}, ping_response{node [[es2][CoSg7K2iSmat9l0zdu7jjw][ip-10-225-165-53][inet[/10.225.165.53:9300]]{zone_id=useast1d}], id[6], master [null], hasJoinedOnce [false], cluster_name[test_cluster]}]
rangagopalan commented 8 years ago

Looks like this was an issue with the clock being out of sync on my AWS instance -Came across this - http://stackoverflow.com/questions/24744205/ec2-api-error-validating-access-credential - and updated my instance clock with ntp and then the plugin starts working fine -

jalateras commented 8 years ago

@rangagopalan Thanks for that. I'll try it on the instance i had issues with.

cowboyrushforth commented 8 years ago

I can confirm this issue was due to NTP. In my case a firewall restriction was added preventing NTP from working. Once I corrected the clock across the whole cluster the upgrade works as expected.

dadoonet commented 8 years ago

Closing this. Basically we know that 2 issues could cause that: