hashicorp / terraform-aws-consul

A Terraform Module for how to run Consul on AWS using Terraform and Packer
Apache License 2.0

`run-consul` fails to lookup tags for instance #63

Closed: karma0 closed this issue 6 years ago

karma0 commented 6 years ago

While attempting to run a Consul cluster, I'm receiving the following output from the user-data script that calls run-consul:

HTTPSConnectionPool(host='ec2.us-east-1.amazonaws.com', port=443): Max retries exceeded with url: / (Caused by ConnectTimeoutError(<requests.packages.urllib3.connection.VerifiedHTTPSConnection object at 0x7f67b5423278>, 'Connection to ec2.us-east-1.amazonaws.com timed out. (connect timeout=60)'))
2018-04-24 22:24:11 [WARN] [run-consul] This Instance i-09318afe7d984d50f in us-east-1 does not have any Tags.
2018-04-24 22:24:11 [WARN] [run-consul] Will sleep for 10 seconds and try again.

Executing run-consul by hand shows that it consistently hangs and times out while trying to contact ec2.us-east-1.amazonaws.com. I am running the instances on a few private VPC subnets that can communicate with one another, but are relatively locked down otherwise.

Are the tags necessary? If so, what do I need to configure in Terraform to allow access to ec2.us-east-1.amazonaws.com? Finally, are there other services or hosts that run-consul needs to reach from these VPC subnets?

brikis98 commented 6 years ago

> I am running the instances on a few private VPC subnets that can communicate with one another, but are relatively locked down otherwise.

Do you allow outbound Internet access? If not, then run-consul won't be able to talk to the AWS APIs to find and connect to the other Consul servers.
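
For reference, one common way to give private subnets outbound access is a NAT gateway plus a default route. The following is a minimal sketch, assuming Terraform 0.12+ syntax and a recent AWS provider; the variable names `public_subnet_id` and `private_route_table_id` are hypothetical placeholders, not anything defined by this module.

```hcl
# Hypothetical inputs: an existing public subnet for the NAT gateway to live in,
# and the route table used by the private subnets that host the Consul servers.
variable "public_subnet_id" {
  type = string
}

variable "private_route_table_id" {
  type = string
}

# Elastic IP for the NAT gateway.
# Assumption: AWS provider v5+; older provider versions use `vpc = true` instead.
resource "aws_eip" "nat" {
  domain = "vpc"
}

# The NAT gateway must sit in a public subnet (one routed to an internet gateway).
resource "aws_nat_gateway" "nat" {
  allocation_id = aws_eip.nat.id
  subnet_id     = var.public_subnet_id
}

# Send all outbound traffic from the private subnets through the NAT gateway,
# so instances can reach endpoints such as ec2.us-east-1.amazonaws.com.
resource "aws_route" "private_default" {
  route_table_id         = var.private_route_table_id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.nat.id
}
```

The instances' security groups and network ACLs would also need to permit outbound HTTPS (port 443) for the API calls to succeed.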

karma0 commented 6 years ago

Added a NAT gateway, and that seemed to resolve the connection issue. Now I'm getting the following:

2018-04-25 23:52:20 [INFO] [run-consul] Looking up tags for Instance i-021d2c40784a70355 in us-east-1

An error occurred (UnauthorizedOperation) when calling the DescribeTags operation: You are not authorized to perform this operation.
2018-04-25 23:52:20 [WARN] [run-consul] This Instance i-021d2c40784a70355 in us-east-1 does not have any Tags.
2018-04-25 23:52:20 [WARN] [run-consul] Will sleep for 10 seconds and try again.

brikis98 commented 6 years ago

Could you check if that Instance has tags? If not, are you running it with the consul-cluster module or some other way?

karma0 commented 6 years ago

It is consul-cluster. Well, technically I'm trying to run a vault-cluster and nomad-cluster, and I've integrated consul-cluster into the mix, executing run-consul in the user-data.

In answer to your question, there are tags on the servers. You can see what I'm working on here, on the master branch. The README is incomplete, so the files may be more valuable.

brikis98 commented 6 years ago

Oh, sorry, I missed the obvious error message:

An error occurred (UnauthorizedOperation) when calling the DescribeTags operation: You are not authorized to perform this operation.

To look up tags, your servers need an IAM role with the ec2:DescribeTags IAM permission. That permission is granted by the consul-iam-policies module, which the consul-cluster module attaches to its IAM role here: https://github.com/hashicorp/terraform-aws-consul/blob/master/modules/consul-cluster/main.tf#L202-L206.
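
As a rough illustration of what such a policy looks like when written by hand, here is a minimal sketch, assuming Terraform 0.12+ syntax. The `iam_role_id` variable and the resource names are hypothetical. Only `ec2:DescribeTags` is confirmed by the error message above; the other two actions are an assumption about what tag-based cluster auto-discovery typically needs.

```hcl
# Hypothetical input: the ID (name) of the IAM role attached to the instances
# through their instance profile.
variable "iam_role_id" {
  type = string
}

# Read-only permissions used to discover cluster members by tag.
data "aws_iam_policy_document" "auto_discover_cluster" {
  statement {
    effect = "Allow"

    actions = [
      "ec2:DescribeTags",                      # the call that failed in the log above
      "ec2:DescribeInstances",                 # assumption: needed to find peers by tag
      "autoscaling:DescribeAutoScalingGroups", # assumption: needed to read the cluster size
    ]

    # These Describe* calls do not support resource-level restrictions.
    resources = ["*"]
  }
}

# Attach the policy inline to the instances' role.
resource "aws_iam_role_policy" "auto_discover_cluster" {
  name   = "auto-discover-cluster"
  role   = var.iam_role_id
  policy = data.aws_iam_policy_document.auto_discover_cluster.json
}
```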

Did you change anything about the IAM role or policies?

karma0 commented 6 years ago

That worked! My setup was hacked together a bit so that there's some overlap in services with nomad-cluster and vault-cluster, and I'm using a single AMI, but adding iam_policies for both of my clusters resolved the issue!
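
For anyone landing here with a similar mixed setup, the fix might look roughly like the sketch below: one instance of the consul-iam-policies module per cluster role. This assumes Terraform 0.12+ syntax; the module labels and the two role-ID variables are hypothetical, and in a real configuration the role IDs would more likely come from outputs of the vault-cluster and nomad-cluster modules.

```hcl
# Hypothetical inputs: the IAM role IDs of the two clusters whose instances
# also run run-consul from user data.
variable "vault_iam_role_id" {
  type = string
}

variable "nomad_iam_role_id" {
  type = string
}

# Grant the Vault cluster's role the permissions run-consul needs.
# No version is pinned here; in practice, add a ?ref=... to the source.
module "consul_iam_policies_vault" {
  source      = "github.com/hashicorp/terraform-aws-consul//modules/consul-iam-policies"
  iam_role_id = var.vault_iam_role_id
}

# Same for the Nomad cluster's role.
module "consul_iam_policies_nomad" {
  source      = "github.com/hashicorp/terraform-aws-consul//modules/consul-iam-policies"
  iam_role_id = var.nomad_iam_role_id
}
```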