Closed marcofiocco closed 3 years ago
@marcofiocco This error
agent: Cannot discover address: cluster=LAN address="provider=aws tag_key=ConsulAutoJoin tag_value=auto-join" error="discover-aws: GetInstanceIdentityDocument failed: EC2MetadataRequestError: failed to get EC2 instance identity document
is probably because you are missing an IAM role policy.
Try adding the policy below to the IAM role attached to the instance.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "",
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeTags",
        "ec2:DescribeInstances",
        "autoscaling:DescribeAutoScalingGroups"
      ],
      "Resource": "*"
    }
  ]
}
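If it helps, here is a rough Python sketch for checking whether an existing policy document already grants the three actions listed above. The function name `missing_actions` is made up for illustration, and the check is deliberately simplified: it only looks at `Allow` statements and ignores `Resource`, `Condition`, `NotAction`, and explicit `Deny`, so treat it as a quick sanity check rather than a real IAM evaluation.

```python
import fnmatch
import json

# Actions taken from the policy suggested in this comment.
REQUIRED_ACTIONS = (
    "ec2:DescribeTags",
    "ec2:DescribeInstances",
    "autoscaling:DescribeAutoScalingGroups",
)


def missing_actions(policy_json, required=REQUIRED_ACTIONS):
    """Return the required actions that no Allow statement in the policy covers.

    Simplification: Resource, Condition, NotAction and Deny are ignored.
    """
    policy = json.loads(policy_json)
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may appear without a list
        statements = [statements]

    allowed = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):  # "Action" may be a single string
            actions = [actions]
        allowed.extend(actions)

    # Compare case-insensitively and honour * and ? wildcards in the policy.
    return [
        action
        for action in required
        if not any(
            fnmatch.fnmatchcase(action.lower(), pattern.lower())
            for pattern in allowed
        )
    ]
```

Passing the JSON above as a string should return an empty list; a policy that only allows `ec2:Describe*` would still report the `autoscaling` action as missing.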
The problem might have been with my AMI. I was using an AMI that I had created from an instance without following the correct procedure (I was shutting down the instance without running Sysprep). As a result, some subnets were wrong. This also prevented instances from booting with user_data. I then followed the correct procedure (with Sysprep), ran the same Consul configuration on a new instance launched from the new AMI, and now, magically, it works.
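For anyone hitting the same `GetInstanceIdentityDocument failed` error: that call reads the identity document from the EC2 instance metadata service, so one way to narrow things down is to check from the instance whether that endpoint is reachable at all. Below is a minimal Python sketch of such a check. It assumes IMDSv2 (token-based) access is enabled and bypasses any configured HTTP proxy; the function name `fetch_identity_document` is only illustrative.

```python
import json
import urllib.request

# Link-local address of the EC2 instance metadata service.
IMDS_BASE = "http://169.254.169.254"


def fetch_identity_document(base_url=IMDS_BASE, timeout=2.0):
    """Return the instance identity document as a dict, or None if unreachable."""
    # Metadata traffic must not go through a proxy, so use an opener without one.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        # IMDSv2: request a short-lived session token first.
        token_req = urllib.request.Request(
            base_url + "/latest/api/token",
            method="PUT",
            headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
        )
        with opener.open(token_req, timeout=timeout) as resp:
            token = resp.read().decode()

        # Then fetch the identity document using that token.
        doc_req = urllib.request.Request(
            base_url + "/latest/dynamic/instance-identity/document",
            headers={"X-aws-ec2-metadata-token": token},
        )
        with opener.open(doc_req, timeout=timeout) as resp:
            return json.loads(resp.read().decode())
    except (OSError, ValueError):
        # URLError and timeouts are OSError subclasses; bad JSON raises ValueError.
        return None
```

If this returns `None` on the affected instance but a dict on a healthy one, the problem is likely in the instance's networking or image rather than in the Consul configuration.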
Hi @marcofiocco,
I'm closing this issue as it seems like you resolved it and that it wasn't caused by Consul. Please reply if I misunderstood. Thanks!
Overview of the Issue
I have created a cluster on AWS using https://github.com/hashicorp/nomad-autoscaler. The Ubuntu server and client nodes work fine and can find each other. I now have a Windows 2016 instance on AWS (in the same subnet as a Linux client) on which I have installed Nomad and Consul. Nomad should join the servers through Consul auto-join, as the Linux clients do, but this does not work on the Windows instance. Note that I've already tagged the AWS instance with `ConsulAutoJoin=auto-join`.
Reproduction Steps
The Consul HCL is (using the IP of the Windows instance):
Consul info for both Client and Server
Client info
```
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 1
    services = 1
build:
    prerelease =
    revision = 12b16df3
    version = 1.8.4
consul:
    acl = disabled
    known_servers = 0
    server = false
runtime:
    arch = amd64
    cpu_count = 4
    goroutines = 51
    max_procs = 4
    os = windows
    version = go1.14.6
serf_lan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 1
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 2
    members = 1
    query_queue = 0
    query_time = 1
```
Server info
```
agent:
    check_monitors = 0
    check_ttls = 0
    checks = 3
    services = 4
build:
    prerelease =
    revision = 12b16df3
    version = 1.8.4
consul:
    acl = disabled
    bootstrap = false
    known_datacenters = 1
    leader = false
    leader_addr = 10.241.238.204:8300
    server = true
raft:
    applied_index = 173
    commit_index = 173
    fsm_pending = 0
    last_contact = 76.814564ms
    last_log_index = 173
    last_log_term = 2
    last_snapshot_index = 0
    last_snapshot_term = 0
    latest_configuration = [{Suffrage:Voter ID:8e9862e8-54ac-7595-84a9-08de035cbbca Address:10.241.238.204:8300} {Suffrage:Voter ID:aac8650d-1d5f-d4e0-3fb5-ce0ee19469f9 Address:10.241.238.236:8300} {Suffrage:Voter ID:557417e6-d3c3-8cd7-3b79-2dcd50067feb Address:10.241.239.15:8300}]
    latest_configuration_index = 0
    num_peers = 2
    protocol_version = 3
    protocol_version_max = 3
    protocol_version_min = 0
    snapshot_version_max = 1
    snapshot_version_min = 0
    state = Follower
    term = 2
runtime:
    arch = amd64
    cpu_count = 2
    goroutines = 96
    max_procs = 2
    os = linux
    version = go1.14.6
serf_lan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 2
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 4
    members = 4
    query_queue = 0
    query_time = 1
serf_wan:
    coordinate_resets = 0
    encrypted = false
    event_queue = 0
    event_time = 1
    failed = 0
    health_score = 0
    intent_queue = 0
    left = 0
    member_time = 4
    members = 3
    query_queue = 0
    query_time = 1
```
Operating system and Environment details
Windows 2016 amd64
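The client output above reports `known_servers = 0` and `members = 1`, which is what an agent that has not joined anything looks like. As a rough illustration, the Python sketch below parses `consul info` text into sections and flags those two symptoms. It assumes the usual layout of section headers ending in a colon followed by `key = value` lines; the helper names `parse_consul_info` and `join_problems` are hypothetical.

```python
def parse_consul_info(text):
    """Parse `consul info` output into {section: {key: value}} (string values)."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "=" not in line and line.endswith(":"):
            # Section header such as "agent:" or "serf_lan:".
            current = sections.setdefault(line[:-1], {})
        elif "=" in line and current is not None:
            # "key = value"; the value may be empty, e.g. "prerelease =".
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return sections


def join_problems(info):
    """Return hints suggesting that this agent has not joined a cluster."""
    problems = []
    consul = info.get("consul", {})
    if consul.get("server") == "false" and consul.get("known_servers") == "0":
        problems.append("client knows no servers")
    if info.get("serf_lan", {}).get("members") == "1":
        problems.append("LAN gossip pool contains only this agent")
    return problems
```

Run against the client output, `join_problems` should report both hints; against the server output it should return an empty list, since that agent is a server with four LAN members.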
Log Fragments