aphexyuri closed this issue 1 year ago
Have you installed the Ansible requirements?
cd ansible
ansible-galaxy install -r requirements.yml
@tinom9 that did the trick. It can't seem to reach the nodes, though. Running the following gives me:
(base) ➜ ansible git:(main) ✗ alias ansible='ansible --inventory inventory/aws_ec2.yml --vault-password-file=password.txt --extra-vars "@local-extra-vars.yml"'
ansible -m all ping
[WARNING]: Could not match supplied host pattern, ignoring: ping
[WARNING]: No hosts matched, nothing to do
The inventory looks okay:
(base) ➜ ansible git:(main) ✗ ansible-inventory --graph
@all:
|--@ungrouped:
|--@aws_ec2:
| |--i-07d77f8eeb0b41523
| |--i-084e52f1f7ed26860
| |--i-0a3c97f08afce6885
| |--i-0d6132a33909c748f
|--@validator:
| |--i-07d77f8eeb0b41523
| |--i-084e52f1f7ed26860
| |--i-0a3c97f08afce6885
| |--i-0d6132a33909c748f
|--@devnet01_edge_rg_private:
| |--i-07d77f8eeb0b41523
| |--i-084e52f1f7ed26860
| |--i-0a3c97f08afce6885
| |--i-0d6132a33909c748f
|--@validator_001:
| |--i-07d77f8eeb0b41523
|--@validator_004:
| |--i-084e52f1f7ed26860
|--@validator_002:
| |--i-0a3c97f08afce6885
|--@validator_003:
| |--i-0d6132a33909c748f
Try ansible all -m ping (host pattern first, then -m and the module name).
:)
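For context on the warnings above: ansible expects the host pattern before -m, so in ansible -m all ping the trailing word ping gets parsed as a host pattern and matches nothing. A sketch of the corrected invocation, reusing the flags from this thread's alias rather than relying on the alias itself:

```shell
# Host pattern first, then the module. "ping" here is Ansible's
# connectivity-check module (a full SSH round trip), not ICMP ping.
ansible all -m ping \
  --inventory inventory/aws_ec2.yml \
  --vault-password-file=password.txt \
  --extra-vars "@local-extra-vars.yml"
```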
Great, ty! Ping runs, but it seems the instances aren't reachable:
(base) ➜ ansible git:(main) ✗ ansible all -m ping
i-084e52f1f7ed26860 | UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: kex_exchange_identification: Connection closed by remote host\r\nConnection closed by UNKNOWN port 65535",
"unreachable": true
}
i-07d77f8eeb0b41523 | UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: kex_exchange_identification: Connection closed by remote host\r\nConnection closed by UNKNOWN port 65535",
"unreachable": true
}
i-0d6132a33909c748f | UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: kex_exchange_identification: Connection closed by remote host\r\nConnection closed by UNKNOWN port 65535",
"unreachable": true
}
i-0a3c97f08afce6885 | UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: kex_exchange_identification: Connection closed by remote host\r\nConnection closed by UNKNOWN port 65535",
"unreachable": true
}
I do see them all running in the AWS console though.
I'd say it's not picking up the right SSH key. Make sure it's accessible and you've set it up properly.
You can always test it by connecting to a validator instance with the specified params:
ssh -i $SSH_KEY_FILE ubuntu@$VALIDATOR_01_INSTANCE_ID \
-o IdentitiesOnly=yes \
-o StrictHostKeyChecking=no \
-o ProxyCommand="sh -c \"aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p'\""
What should I use for VALIDATOR_01_INSTANCE_ID?
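One way to look that up from the dynamic inventory itself, as a sketch (the group name validator_001 is taken from the inventory graph earlier in this thread; adjust to your own groups):

```shell
# Dump the parsed inventory as JSON and pull the single host listed
# under the validator_001 group.
ansible-inventory --inventory inventory/aws_ec2.yml --list \
  | python3 -c 'import json,sys; print(json.load(sys.stdin)["validator_001"]["hosts"][0])'
```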
On a side note, there's a discrepancy in step 7 between the private key locations ~/.ssh/ and ~/cert/ - I made them all ~/.ssh/ (and set ansible_ssh_private_key_file: ~/.ssh/devnet_private.key in local-extra-vars.yml).
@tinom9 can I ask what OS you're using? I was just trying run.sh without success:
[WARNING]: * Failed to parse /Users/yurivisser/Desktop/terraform-polygon-supernets/ansible/inventory/aws_ec2.yml with auto plugin: Failed
to describe instances: An error occurred (UnauthorizedOperation) when calling the DescribeInstances operation: You are not authorized to
perform this operation.
[WARNING]: * Failed to parse /Users/yurivisser/Desktop/terraform-polygon-supernets/ansible/inventory/aws_ec2.yml with yaml plugin: Plugin
configuration YAML file, not YAML inventory
[WARNING]: * Failed to parse /Users/yurivisser/Desktop/terraform-polygon-supernets/ansible/inventory/aws_ec2.yml with ini plugin: Invalid
host pattern '---' supplied, '---' is normally a sign this is a YAML file.
[WARNING]: Unable to parse /Users/yurivisser/Desktop/terraform-polygon-supernets/ansible/inventory/aws_ec2.yml as an inventory source
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
Starting galaxy collection install process
Nothing to do. All requested collections are already installed. If you want to reinstall them, consider using `--force`.
[WARNING]: * Failed to parse /Users/yurivisser/Desktop/terraform-polygon-supernets/ansible/inventory/aws_ec2.yml with auto plugin: Failed
to describe instances: An error occurred (UnauthorizedOperation) when calling the DescribeInstances operation: You are not authorized to
perform this operation.
[WARNING]: * Failed to parse /Users/yurivisser/Desktop/terraform-polygon-supernets/ansible/inventory/aws_ec2.yml with yaml plugin: Plugin
configuration YAML file, not YAML inventory
[WARNING]: * Failed to parse /Users/yurivisser/Desktop/terraform-polygon-supernets/ansible/inventory/aws_ec2.yml with ini plugin: Invalid
host pattern '---' supplied, '---' is normally a sign this is a YAML file.
[WARNING]: Unable to parse /Users/yurivisser/Desktop/terraform-polygon-supernets/ansible/inventory/aws_ec2.yml as an inventory source
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
[WARNING]: Collection prometheus.prometheus does not support Ansible version 2.14.4
PLAY [all] *********************************************************************************************************************************
skipping: no hosts matched
[WARNING]: Could not match supplied host pattern, ignoring: devnet01_edge_polygon_private
PLAY [all:&devnet01_edge_polygon_private] **************************************************************************************************
skipping: no hosts matched
[WARNING]: Could not match supplied host pattern, ignoring: geth
PLAY [geth:&devnet01_edge_polygon_private] *************************************************************************************************
skipping: no hosts matched
[WARNING]: Could not match supplied host pattern, ignoring: fullnode
[WARNING]: Could not match supplied host pattern, ignoring: validator
PLAY [fullnode:validator:&devnet01_edge_polygon_private] ***********************************************************************************
skipping: no hosts matched
PLAY [fullnode:validator:&devnet01_edge_polygon_private] ***********************************************************************************
skipping: no hosts matched
PLAY [fullnode:validator:&devnet01_edge_polygon_private] ***********************************************************************************
skipping: no hosts matched
PLAY RECAP *********************************************************************************************************************************
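The UnauthorizedOperation warnings above suggest the AWS credentials Ansible's aws_ec2 inventory plugin picked up can't call DescribeInstances. A quick way to check, as a sketch (the region is an assumption based on the ARNs later in this thread; match your deployment):

```shell
# Confirm which AWS identity is active for the CLI (and therefore
# for the aws_ec2 inventory plugin, absent other configuration):
aws sts get-caller-identity

# Then verify that identity can describe instances in the target region:
aws ec2 describe-instances --region us-west-2 \
  --query 'Reservations[].Instances[].InstanceId'
```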
Are you still stuck on this? Please confirm whether ansible all -m ping is working.
I am actually stuck at the same place. ansible all -m ping is returning:
i-0cb0636a89d0d03cf | UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname i-0cb0636a89d0d03cf: nodename nor servname provided, or not known",
"unreachable": true
}
i-023e1176957aa6b8b | UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname i-023e1176957aa6b8b: nodename nor servname provided, or not known",
"unreachable": true
}
i-04f517e27dc91d8e7 | UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname i-04f517e27dc91d8e7: nodename nor servname provided, or not known",
"unreachable": true
}
i-0c07fca0d42ae4147 | UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname i-0c07fca0d42ae4147: nodename nor servname provided, or not known",
"unreachable": true
}
i-01766b8b0e3f5d461 | UNREACHABLE! => {
"changed": false,
"msg": "Failed to connect to the host via ssh: ssh: Could not resolve hostname i-01766b8b0e3f5d461: nodename nor servname provided, or not known",
"unreachable": true
}
Any ideas?
I was running into the same issue and was able to make some progress. The main issue I found was an AWS permission problem: I had to add a dedicated policy to my AWS IAM user, with specific permissions for the 4 validators and the geth-001 node. I also had to install the Session Manager plugin, since along the way I got another error (https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-working-with-install-plugin.html)
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["ssm:StartSession"],
      "Resource": [
        "arn:aws:ec2:us-west-2:010531221017:instance/i-0c21d328536a23815",
        "arn:aws:ec2:us-west-2:010531221017:instance/i-029da648f72eca49e",
        "arn:aws:ec2:us-west-2:010531221017:instance/i-05c1dc647f0ab13ca",
        "arn:aws:ec2:us-west-2:010531221017:instance/i-07cc225ea7b1f8cfd",
        "arn:aws:ec2:us-west-2:010531221017:instance/i-0efe99c6be33073d4",
        "arn:aws:ssm:us-west-2::document/AWS-StartSSHSession"
      ]
    },
    {
      "Effect": "Allow",
      "Action": ["ssm:TerminateSession", "ssm:ResumeSession"],
      "Resource": ["arn:aws:ssm:::session/${aws:username}-*"]
    }
  ]
}
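With the policy attached, a quick way to sanity-check the Session Manager side before rerunning Ansible, as a sketch (the key path and instance ID below are examples from this thread; substitute your own):

```shell
# Confirm the Session Manager plugin is installed and on PATH;
# it prints a confirmation message when invoked directly.
session-manager-plugin

# Then try the same SSH-over-SSM path Ansible will use:
ssh -i ~/.ssh/devnet_private.key ubuntu@i-0c21d328536a23815 \
  -o IdentitiesOnly=yes \
  -o StrictHostKeyChecking=no \
  -o ProxyCommand="sh -c \"aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p'\"" \
  true
```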
It's going to be important to use the full command:
ansible --inventory inventory/aws_ec2.yml --extra-vars "@local-extra-vars.yml" all -m ping
In local-extra-vars.yml there are some lines like this:
ansible_ssh_private_key_file: ~/devnet_private.key
ansible_ssh_common_args: >
-o IdentitiesOnly=yes
-o StrictHostKeyChecking=no
-o ProxyCommand="sh -c \"aws ssm start-session --target %h --document-name AWS-StartSSHSession --parameters 'portNumber=%p'\""
These can be edited based on your needs; basically they tell Ansible where to find your SSH key and configure it to use SSH over SSM.
Finally got back to giving this a try, with the latest edge release and the additions & changes to the docs. It turned out my AWS SSM setup and my variable settings (the example.env step) were broken. Ty, and great work on making the docs clearer and adding verification commands along the way!
Awesome - thanks @aphexyuri - We'll be adding more documentation around tuning, load testing, and regular operations (e.g. looking at logs, etc). If you have any other thoughts on documentation that would be helpful, feel free to drop us a line.
Firstly, thanks for the work on the TF/Ansible deployment. I did however run into an issue at step 6 with the following:
Hoping it's a simple fix or something I'm missing; perhaps some steps required to set up Prometheus. Help would be greatly appreciated.