mitchellh / vagrant-aws

Use Vagrant to manage your EC2 and VPC instances.
MIT License

Failing with EC2 Amazon Linux Images? #72

Open ProbablyRusty opened 11 years ago

ProbablyRusty commented 11 years ago

I have no problems firing up Ubuntu instances on EC2 with vagrant-aws.

However, when I try to bring up Amazon Linux images, I get an error every time; it appears to be a sudo problem on the guest VM.

For example, based on this Vagrantfile:

Vagrant.configure("2") do |config|
  config.vm.box = "testbox"
  config.vm.box_url = "https://github.com/mitchellh/vagrant-aws/raw/master/dummy.box"

  config.vm.provider :aws do |aws, override|
    aws.ami = "ami-05355a6c"
    override.ssh.username = "ec2-user"
  end
end

Here is the failed output trying to bring up the instance:

$ vagrant up --provider aws
Bringing machine 'default' up with 'aws' provider...
[default] Warning! The AWS provider doesn't support any of the Vagrant
high-level network configurations (`config.vm.network`). They
will be silently ignored.
[default] Launching an instance with the following settings...
[default]  -- Type: m1.small
[default]  -- AMI: ami-05355a6c
[default]  -- Region: us-east-1
[default]  -- Keypair: mykey
[default] Waiting for instance to become "ready"...
[default] Waiting for SSH to become available...
[default] Machine is booted and ready for use!
[default] Rsyncing folder: /Users/tester/ => /vagrant
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

mkdir -p '/vagrant'

If I try it again with debugging on, here is what I think is the relevant output:

 INFO connect_aws: Connecting to AWS...
 INFO warden: Calling action: #<VagrantPlugins::AWS::Action::ReadSSHInfo:0x00000100f5d920>
 INFO interface: info: Rsyncing folder: /Users/tester/ => /vagrant
[default] Rsyncing folder: /Users/tester/ => /vagrant
DEBUG ssh: Re-using SSH connection.
 INFO ssh: Execute: mkdir -p '/vagrant' (sudo=true)
DEBUG ssh: stderr: sudo
DEBUG ssh: stderr: : 
DEBUG ssh: stderr: sorry, you must have a tty to run sudo
DEBUG ssh: stderr: 

Any thoughts?

Again, if I change only aws.ami in the Vagrantfile above to any Ubuntu-flavored image (and override.ssh.username to 'ubuntu', of course), it works every time.

gschueler commented 11 years ago

I have the same problem with the official CentOS 6.3 AMI (ami-a6e15bcf), which uses 'root' as the SSH user.

The debug log shows the same sudo failure.

ProbablyRusty commented 11 years ago

The README changes in this pull request document a workaround for this very issue:

https://github.com/mitchellh/vagrant-aws/pull/70/files
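
For anyone who doesn't want to click through: the workaround there boils down to passing cloud-init user data that drops the requiretty requirement for the SSH user before Vagrant needs sudo. A minimal Vagrantfile sketch along those lines (untested; the AMI and sudoers path are just the ones used elsewhere in this thread):

Vagrant.configure("2") do |config|
  config.vm.box = "testbox"
  config.vm.box_url = "https://github.com/mitchellh/vagrant-aws/raw/master/dummy.box"

  config.vm.provider :aws do |aws, override|
    aws.ami = "ami-05355a6c"
    override.ssh.username = "ec2-user"

    # cloud-init runs this script at first boot; note the later comments
    # in this thread about it racing with Vagrant's initial rsync.
    aws.user_data = <<-USERDATA
#!/bin/bash
echo 'Defaults:ec2-user !requiretty' > /etc/sudoers.d/999-vagrant-cloud-init-requiretty
chmod 440 /etc/sudoers.d/999-vagrant-cloud-init-requiretty
    USERDATA
  end
end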

gschueler commented 11 years ago

Thanks for the semi-workaround. It more or less works for Amazon Linux (which supports cloud-init): vagrant up sometimes still fails at the rsync step, but a subsequent vagrant provision succeeds, so I suspect the rsync and the cloud-init script are racing.

It would be nice if there were a way to specify an inline shell command to run before the rsync step; that would let us do something like a cloud-init step for AMIs that don't support it.

gondo commented 11 years ago

+1 for supporting Amazon Linux and its default ec2-user setup

miguno commented 11 years ago

Looks like I found a more reliable way to fix the problem: use a boothook. I manually confirmed that a script passed as a boothook executes before Vagrant's rsync phase starts. So far it has been working reliably for me, and I don't need to build a custom AMI.

Extra tip: if you are also relying on cloud-config, you can create a MIME multipart archive to combine the boothook and the cloud-config. The latest version of the write-mime-multipart helper script is available on GitHub.

Usage sketch:

$ cd /tmp
$ wget https://raw.github.com/lovelysystems/cloud-init/master/tools/write-mime-multipart
$ chmod +x write-mime-multipart
$ cat boothook.sh
#!/bin/bash
SUDOERS_FILE=/etc/sudoers.d/999-vagrant-cloud-init-requiretty
echo "Defaults:ec2-user !requiretty" > $SUDOERS_FILE
echo "Defaults:root !requiretty" >> $SUDOERS_FILE
chmod 440 $SUDOERS_FILE

$ cat cloud-config
#cloud-config

packages:
  - puppet
  - git
  - python-boto

$ write-mime-multipart boothook.sh cloud-config > combined.txt

You can then pass the contents of 'combined.txt' to aws.user_data, for instance via:

aws.user_data = File.read("/tmp/combined.txt")

miguno commented 11 years ago

Addendum: bad news. It looks as if even a boothook is not guaranteed to finish updating /etc/sudoers.d/ (for !requiretty) before Vagrant attempts its rsync. During my testing today I again started seeing sporadic "mkdir -p /vagrant" errors when running vagrant up --no-provision.

der commented 11 years ago

Is it really necessary to change the requiretty settings?

I can script this without Vagrant and without modifying sudoers by using the ssh "-t -t" option to force pseudo-tty allocation (note that the "-t" flag does need to be repeated). Could the "env[:machine].communicate.sudo" method support passing such flags? I'm not familiar enough with the Vagrant code base to find where that is implemented.
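
For what it's worth, Vagrant does expose a knob for this: config.ssh.pty asks Vagrant to request a pseudo-tty for the commands it runs over SSH, which satisfies sudo's requiretty check without editing sudoers. A minimal sketch, assuming a Vagrant version that supports the option (I believe the docs warn that some features and provisioners don't behave well over a pty):

Vagrant.configure("2") do |config|
  config.vm.box = "testbox"

  # Request a pty for Vagrant's SSH commands -- the programmatic
  # equivalent of forcing pseudo-tty allocation with ssh -t -t.
  config.ssh.pty = true

  config.vm.provider :aws do |aws, override|
    aws.ami = "ami-05355a6c"
    override.ssh.username = "ec2-user"
  end
end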

jeffbyrnes commented 10 years ago

I'd be interested in a solution more along the lines of what @der is suggesting as well. Modifying /etc/sudoers seems kludgy, though if that's the accepted solution, it'd be nice if it were well-documented in the README or the wiki.

chuyskywalker commented 10 years ago

Just a +1 from me after running into this today.

chuyskywalker commented 10 years ago

For the time being, though, here's a wrapper script that retries the provision step:

#!/bin/bash

# for debug if you want:
#set -x

# Remove old instance
vagrant destroy --force

# start up the new one, but don't bother to provision, it's going to fail
vagrant up --provider=aws --no-provision

# loop over the provision call until either it works or we've given up
# (10 sec sleep, 12 tries = ~2-3 minutes)
count=0
while :
do
    vagrant provision
    code=$?
    if [ $code -eq 0 ]; then
        exit 0
    fi
    sleep 10
    let count++
    if [ $count -gt 12 ]; then
        vagrant destroy --force
        exit 1
    fi
done

subnetmarco commented 10 years ago

+1

francoisjacques commented 10 years ago

+1

jayunit100 commented 10 years ago

Are we sure the above solution works on all AMIs? I found that even after 15 tries, the retry still did not work.

I don't think retrying will always fix this issue. The solution in https://github.com/mitchellh/vagrant-aws/pull/70/files works much better for me and is simpler to implement (just seed the init file with the TTY modification). I realize it's a bit of a hack, but it is synchronous, and I think that's important for a lot of people.

josh-padnick commented 9 years ago

I tried the official solution linked above, and while it does work on vagrant reload, it consistently fails on the first run. See logs below. If I discover a better way to do this that's not too hackish, I'll gladly share it here. In the meantime, has anyone else gotten this to work on the first run?

Log files from Vagrant

==> default: Launching an instance with the following settings...
==> default:  -- Type: t2.micro
==> default:  -- AMI: ami-d13845e1
==> default:  -- Region: us-west-2
==> default:  -- Keypair: **********
==> default:  -- Subnet ID: **********
==> default:  -- User Data: yes
==> default:  -- Security Groups: ["**********"]
==> default:  -- User Data:     #!/bin/bash
==> default:     echo 'Defaults:ec2-user !requiretty' > /etc/sudoers.d/999-vagrant-cloud-init-requiretty     && chmod 440 /etc/sudoers.d/999-vagrant-cloud-init-requiretty
==> default:  -- Block Device Mapping: []
==> default:  -- Terminate On Shutdown: false
==> default:  -- Monitoring: false
==> default:  -- EBS optimized: false
==> default:  -- Assigning a public IP address in a VPC: false
==> default: Waiting for instance to become "ready"...
==> default: Waiting for SSH to become available...
==> default: Machine is booted and ready for use!
==> default: Rsyncing folder: /.../code/ => /vagrant
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!

mkdir -p '/vagrant'

Stdout from the command:

Stderr from the command:

sudo: sorry, you must have a tty to run sudo
immo-huneke-zuhlke commented 8 years ago

Hi @miguno (re: your Jul 2, 2013 comment): your boothook.sh approach worked beautifully for me on the CentOS 7 AMI. Many thanks! I just used the .sh script without the complication of multipart MIME, as there is only one file to upload.