mitchellh / vagrant-aws

Use Vagrant to manage your EC2 and VPC instances.
MIT License
2.61k stars 572 forks source link

Infinite Loop on SSH when connection failed - exception handling improvements #461

Open cappetta opened 8 years ago

cappetta commented 8 years ago

I started using this plugin recently to integrate vagrant and AWS. I'm comparing this plugin to the vagrant-openstack-provider plugin.

I initially saw the AWS plugin hanging and found a few exception handling issues that could be improved after confirming the instance is up and I can ssh into the box manually so I turned on debugging.

What I'm seeing from the verbose output is that the authentication is failing (user error) yet the plugin is infinitely looping and attempting to reconnect.

==> default: Waiting for instance to become "ready"...
==> default: Waiting for SSH to become available...
  1. No SSH timeout - e.g. after 300 seconds of trying to ssh break out of ssh login loop and display error
  2. Authentication failures (e.g. wrong user name) are breaking the login loop when SSH is available. (code snippet below)
  3. AWS KeyPair Name Parameter - maybe an enhancement but I think it should check current directory instead of creating a new insecure private key?
  4. SSH Connection Timeout - These messages are not being displayed on the terminal like the ones in the openstack provider do (e.g. every attempt is logged) - this leaves the user wondering what is going on behind the scenes. Please log the messages to console and leverage suggestion #1 example use case: Missing security groups (e.g. no port 22 access) would prevent a successful connection / identification of the ssh server
DEBUG ssh: Checking key permissions: /tmp/vagrant_cappetta_us-west-1.pem
 INFO ssh: Attempting SSH connection...
 INFO ssh: Attempting to connect to SSH...
 INFO ssh:   - Host: ec2-xx-xxx-xxx-xxx.us-west-1.compute.amazonaws.com
 INFO ssh:   - Port: 22
 INFO ssh:   - Username: vagrant
 INFO ssh:   - Password? false
 INFO ssh:   - Key Path: ["/tmp/vagrant_cappetta_us-west-1.pem"]
DEBUG ssh: == Net-SSH connection debug-level log START ==
DEBUG ssh: D, [2016-05-16T16:40:14.448647 #12618] DEBUG -- net.ssh.transport.session[10c2dfc]: establishing connection to ec2-xx-xxx-xxx-xxx.us-west-1.compute.amazonaws.com:22
D, [2016-05-16T16:40:14.528106 #12618] DEBUG -- net.ssh.transport.session[10c2dfc]: connection established
I, [2016-05-16T16:40:14.528280 #12618]  INFO -- net.ssh.transport.server_version[10c24b0]: negotiating protocol version
D, [2016-05-16T16:40:14.611099 #12618] DEBUG -- net.ssh.transport.server_version[10c24b0]: remote is `SSH-2.0-OpenSSH_6.6.1'
D, [2016-05-16T16:40:14.611350 #12618] DEBUG -- net.ssh.transport.server_version[10c24b0]: local is `SSH-2.0-Ruby/Net::SSH_3.0.2 x86_64-linux'
D, [2016-05-16T16:40:14.693978 #12618] DEBUG -- socket[10c2a64]: read 1640 bytes
D, [2016-05-16T16:40:14.694171 #12618] DEBUG -- socket[10c2a64]: received packet nr 0 type 20 len 1636
I, [2016-05-16T16:40:14.694241 #12618]  INFO -- net.ssh.transport.algorithms[10b9874]: got KEXINIT from server
I, [2016-05-16T16:40:14.694379 #12618]  INFO -- net.ssh.transport.algorithms[10b9874]: sending KEXINIT
D, [2016-05-16T16:40:14.694560 #12618] DEBUG -- socket[10c2a64]: queueing packet nr 0 type 20 len 1684
D, [2016-05-16T16:40:14.694675 #12618] DEBUG -- socket[10c2a64]: sent 1688 bytes
I, [2016-05-16T16:40:14.694706 #12618]  INFO -- net.ssh.transport.algorithms[10b9874]: negotiating algorithms
D, [2016-05-16T16:40:14.694830 #12618] DEBUG -- net.ssh.transport.algorithms[10b9874]: negotiated:
* kex: diffie-hellman-group-exchange-sha1
* host_key: ssh-rsa
* encryption_server: aes128-cbc
* encryption_client: aes128-cbc
* hmac_client: hmac-sha1
* hmac_server: hmac-sha1
* compression_client: none
* compression_server: none
* language_client: 
* language_server: 
D, [2016-05-16T16:40:14.694854 #12618] DEBUG -- net.ssh.transport.algorithms[10b9874]: exchanging keys
D, [2016-05-16T16:40:14.695048 #12618] DEBUG -- socket[10c2a64]: queueing packet nr 1 type 34 len 20
D, [2016-05-16T16:40:14.695126 #12618] DEBUG -- socket[10c2a64]: sent 24 bytes
D, [2016-05-16T16:40:14.893071 #12618] DEBUG -- socket[10c2a64]: read 152 bytes
D, [2016-05-16T16:40:14.893233 #12618] DEBUG -- socket[10c2a64]: received packet nr 1 type 31 len 148
D, [2016-05-16T16:40:14.894742 #12618] DEBUG -- socket[10c2a64]: queueing packet nr 2 type 32 len 140
D, [2016-05-16T16:40:14.894845 #12618] DEBUG -- socket[10c2a64]: sent 144 bytes
D, [2016-05-16T16:40:14.975973 #12618] DEBUG -- socket[10c2a64]: read 720 bytes
D, [2016-05-16T16:40:14.976479 #12618] DEBUG -- socket[10c2a64]: received packet nr 2 type 33 len 700
D, [2016-05-16T16:40:14.980625 #12618] DEBUG -- socket[10c2a64]: queueing packet nr 3 type 21 len 20
D, [2016-05-16T16:40:14.981099 #12618] DEBUG -- socket[10c2a64]: sent 24 bytes
D, [2016-05-16T16:40:14.981611 #12618] DEBUG -- socket[10c2a64]: received packet nr 3 type 21 len 12
D, [2016-05-16T16:40:14.983010 #12618] DEBUG -- net.ssh.authentication.session[e1602c]: beginning authentication of `vagrant'
D, [2016-05-16T16:40:14.983935 #12618] DEBUG -- socket[10c2a64]: queueing packet nr 4 type 5 len 28
D, [2016-05-16T16:40:14.984482 #12618] DEBUG -- socket[10c2a64]: sent 52 bytes
D, [2016-05-16T16:40:15.175809 #12618] DEBUG -- socket[10c2a64]: read 52 bytes
D, [2016-05-16T16:40:15.176392 #12618] DEBUG -- socket[10c2a64]: received packet nr 4 type 6 len 28
D, [2016-05-16T16:40:15.176918 #12618] DEBUG -- net.ssh.authentication.session[e1602c]: trying none
D, [2016-05-16T16:40:15.177482 #12618] DEBUG -- socket[10c2a64]: queueing packet nr 5 type 50 len 44
D, [2016-05-16T16:40:15.177950 #12618] DEBUG -- socket[10c2a64]: sent 68 bytes
D, [2016-05-16T16:40:15.775370 #12618] DEBUG -- socket[10c2a64]: read 52 bytes
D, [2016-05-16T16:40:15.776011 #12618] DEBUG -- socket[10c2a64]: received packet nr 5 type 51 len 28
D, [2016-05-16T16:40:15.776433 #12618] DEBUG -- net.ssh.authentication.session[e1602c]: allowed methods: publickey
D, [2016-05-16T16:40:15.776731 #12618] DEBUG -- net.ssh.authentication.methods.none[e12e90]: none failed
D, [2016-05-16T16:40:15.776980 #12618] DEBUG -- net.ssh.authentication.session[e1602c]: trying publickey
D, [2016-05-16T16:40:15.777918 #12618] DEBUG -- net.ssh.authentication.agent[e0f560]: connecting to ssh-agent
D, [2016-05-16T16:40:15.778494 #12618] DEBUG -- net.ssh.authentication.agent[e0f560]: sending agent request 1 len 44
D, [2016-05-16T16:40:15.779814 #12618] DEBUG -- net.ssh.authentication.agent[e0f560]: received agent packet 2 len 5
D, [2016-05-16T16:40:15.780027 #12618] DEBUG -- net.ssh.authentication.agent[e0f560]: sending agent request 11 len 0
D, [2016-05-16T16:40:15.792473 #12618] DEBUG -- net.ssh.authentication.agent[e0f560]: received agent packet 12 len 307
D, [2016-05-16T16:40:15.794018 #12618] DEBUG -- net.ssh.authentication.methods.publickey[e0f830]: trying publickey (cc:6e:6d:36:89:ea:42:32:b4:e6:6e:11:90:00:0b:92)
D, [2016-05-16T16:40:15.794794 #12618] DEBUG -- socket[10c2a64]: queueing packet nr 6 type 50 len 348
D, [2016-05-16T16:40:15.795248 #12618] DEBUG -- socket[10c2a64]: sent 372 bytes
D, [2016-05-16T16:40:15.875564 #12618] DEBUG -- socket[10c2a64]: read 52 bytes
D, [2016-05-16T16:40:15.875772 #12618] DEBUG -- socket[10c2a64]: received packet nr 6 type 51 len 28
D, [2016-05-16T16:40:15.875873 #12618] DEBUG -- net.ssh.authentication.session[e1602c]: allowed methods: publickey
E, [2016-05-16T16:40:15.875924 #12618] ERROR -- net.ssh.authentication.session[e1602c]: all authorization methods failed (tried none, publickey)

DEBUG ssh: == Net-SSH connection debug-level log END ==
 INFO ssh: SSH not up: #<Vagrant::Errors::SSHAuthenticationFailed: SSH authentication failed! This is typically caused by the public/private
keypair for the SSH user not being properly set on the guest VM. Please
verify that the guest VM is setup with the proper public key, and that
the private key path for Vagrant is setup properly as well.>
DEBUG ssh: Checking whether SSH is ready...
fullstackbigdata commented 7 years ago

Make sure you are setting override.ssh.username = "ec2-user" or whatever user you would be using ssh as.

cappetta commented 7 years ago

+1 on username overrides. I've added the block below to manage that for me

# If default user defined otherwise ec2-user if node["user"] override.ssh.username = node["user"] else override.ssh.username = "ec2-user" end I've expanded a bit into my usage since this was reported last May. I'm sharing a link to my vagrant_aws_bootstrap repo but this issue still exists. If you can point me in the right direction, code-deep-dive-wise, I'll dive into the code & trace through the logic w/ the goal of identifying where a break can be inserted.