I'm having the same problem with lucid32+64. GUI workaround with network restart works.
vagrant halt returns an error and vagrant up hangs. I've built a box from scratch and can verify it works. However, when I try to create a new instance of the box I have issues with the above commands.
On vagrant up, my console spits out the following:
[default] Clearing any previously set forwarded ports...
[default] Forwarding ports...
[default] -- 22 => 2222 (adapter 1)
[default] Creating shared folders metadata...
[default] Clearing any previously set network interfaces...
[default] Booting VM...
[default] Waiting for VM to boot. This can take a few minutes.
When I CTRL-C out of vagrant up and then do vagrant ssh, I can enter my box. Even though the command is hanging, I can see from VirtualBox that the VM is running.
When I exit the guest and run vagrant halt, I get:
The following SSH command responded with a non-zero exit status.
Vagrant assumes that this means the command failed!
shutdown -h now
The box can be halted by running vagrant suspend and then vagrant halt, which is weird.
Running VirtualBox 4.1.14r7740 and vagrant 1.0.2
Thanks for any assistance you can provide.
The veewee tool has a handy set of tests to check you've set everything up right. It's under the validate command.
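If it helps, running those checks looks roughly like this, assuming veewee is installed as a gem and your box definition is called 'mybox' (the name is a placeholder and the exact invocation can vary between veewee versions):
# Run veewee's built-in validation against a box built with the VirtualBox provider.
veewee vbox validate 'mybox'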
@leth I can confirm that I am experiencing a similar result from building it from scratch (previous post). While using veewee 0.3.alpha9 to build a VM from the ubuntu-12.04-amd64 template, I can't SSH into the box.
I waited less than 5m for the VM to boot.
[default] Failed to connect to VM!
Failed to connect to VM via SSH. Please verify the VM successfully booted
by looking at the VirtualBox GUI.
It is running in VirtualBox.
I've been using this specific configuration in my Vagrantfile for some time and it works perfectly on my MacBook under OS X Lion 10.7.3 with VirtualBox 4.1.14r77440 (from VBoxManage -v), whereas previously it was not starting up correctly more than 2 times out of 3.
First be sure there is no conflict between your box's network and any other active virtual machine. I use hostonly networks and ensure different networks are used for each machine I configure:
config.vm.network :hostonly, "10.10.10.2"
This is the trick found above in this thread:
config.vm.customize ["modifyvm", :id, "--rtcuseutc", "on"]
If it still does not work, I want it to notify me faster, so I reduce the number of tries for SSH:
config.ssh.max_tries = 10
Hope it will help!
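One way to do the conflict check mentioned above, before touching the Vagrantfile, is to ask VirtualBox what host-only networks and machines already exist; a rough sketch using plain VBoxManage commands, nothing Vagrant-specific:
# List the existing host-only interfaces and their IP ranges so you don't reuse one.
VBoxManage list hostonlyifs
# List the machines currently running that could be holding a conflicting network.
VBoxManage list runningvms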
@rchampourlier thanks for the tip. I added those to my Vagrantfile and still no luck; I updated my post above with the output. I am going to review issue #14.
Hi everybody, as the issue is over one year old, could anybody define/suggest a common test case that can be run in order to pinpoint the issue? I am experiencing similar problems and I would like to help with debugging and/or trying different configurations.
I ran into this issue recently with Fedora 13 only; Fedora 16 does not show it. It is network related, since when I log in using the GUI, eth0 is not active.
I have disabled NetworkManager and set NM_CONTROLLED="no" in my ifcfg-eth0 file. This was a defect according to Red Hat (https://bugzilla.redhat.com/show_bug.cgi?id=597515) which is no longer maintained.
So I can agree that this goes back to an issue with bringing up the interfaces; it has nothing to do with SSH, or maybe other edge cases are present...
What distribution are you running? What does "dmesg" show if you log in using the GUI? And if you log in using the GUI and run /etc/init.d/network restart, what happens?
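For what it's worth, the workaround described above boils down to something like this on a RHEL/Fedora-style guest (the service names and ifcfg path are the usual ones for that family; adjust for your release):
# Keep NetworkManager away from eth0 and let the classic network service manage it.
chkconfig NetworkManager off
service NetworkManager stop
echo 'NM_CONTROLLED="no"' >> /etc/sysconfig/network-scripts/ifcfg-eth0
# Bring the interface back up with the legacy service.
service network restart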
Hi, I did what @mikhailov suggests in #issuecomment-2078383 (restart the networking service in rc.local) and it worked for me
So, what I found is: I am using https://github.com/erivello/sf2-vagrant and the lucid32 box distributed by Vagrant. I am trying the exact same configuration on two identical iMacs at my company: same hardware, same OS X version (10.6.8), same VirtualBox version (4.1.16), same Vagrant version (1.0.3). On one of the iMacs the machine boots up just fine; on the other it lags at the SSH connection.
This makes me think it's something different in the host environment or in the interaction between host and VM.
I also tried a complete reinstall of VirtualBox and deleted the ~/.vagrant.d folder to start fresh, but I still get the error.
EDIT: I retried after a few days and now it's working: probably a host reboot fixed the problem? Or this is something random.
Just got this one twice already today using the Vagrant lucid32 base box. I also got it on the first boot with this VM, just after the first "vagrant up", with VirtualBox 4.1.10.
I've used this crontab entry as a workaround:
@reboot /sbin/ifdown eth0 ; /bin/sleep 5 ; /sbin/ifup eth0
@schmurfy, thanks for your work on Goliath, hope the following helps bring you up to speed on the state of play...
Somewhere in this, or related, issue threads is my finding that this is caused by SSH trying to complete its handshake while the OS is not ready for it, e.g. cycles spent in motd. The SSH connection is made, just not completed.
There are several ways to mitigate this: no motd, restart network services, bring the iface down then up, etc.
There is no solution, just workarounds, and a Google project (which I can't recall right now - search my nickname+vagrant in their issues) is having this same issue, also in a VM booting context.
Bootstrapping via VB-level commands was investigated by Mitchell and wasn't feasible due to VB issues. Bootstrapping over the serial console was likewise suggested but not completed, for good reasons that escape my memory right now.
HTH
@hedgehog with the removal of the router part I am not sure a lot of my code remains in goliath xD
Given that some solutions exist, as proposed here, it would be nice if the base image shipped with one. I think I will try creating my own base image with one of the proposed fixes, thanks.
+1 for creating base images with any working workaround.
My CentOS 6.2 32-bit box stalled during vagrant up at "Waiting for VM to boot. This can take a few minutes.". SSH to the box works. This happens when the host is not connected to wifi/internet. So as a workaround I disabled the firewall on the guest box and it works; also check the host firewall.
I recently ran into the same problem. I'm using vagrant 1.0.3, VirtualBox 4.1.18 and the standard lucid32 box. This workaround from @xmartinez worked for me:
config.vm.customize ["modifyvm", :id, "--rtcuseutc", "on"]
It is easier to debug when you can boot the machine with the GUI. I've experienced this issue on a local Mac, where booting the GUI is easy, but I also hit it on a remote Debian server, where I had to install X11 and use X11 forwarding to get the GUI onto the local Mac, debug it, and then turn the config back to no GUI.
If you are using the latest Ubuntu or variants like Mint, there have been changes to how Ubuntu handles DNS. Try running
sudo dpkg-reconfigure resolvconf
and use the first option to create a symlink to /etc/resolv.conf. VirtualBox needs this file to set the DNS correctly for NAT. This should be made more obvious.
Doing this fixed the problems for me; I didn't have to set SSH timeouts, restart networking, or use --rtcuseutc.
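In case dpkg-reconfigure is not an option, recreating the symlink by hand looks roughly like this on Ubuntu 12.04's resolvconf layout (the paths are assumptions, not taken from the comment above):
# Point /etc/resolv.conf at the file resolvconf generates under /run.
sudo ln -sf ../run/resolvconf/resolv.conf /etc/resolv.conf
# Regenerate the file from the currently configured interfaces.
sudo resolvconf -u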
In headless mode with eth1 as host-only networking, I also get a hanging vagrant waiting for the ssh port forward to connect. I can ssh to eth1 fine so I think this is a problem with port forwarding or NAT eth0. Hard to test because I can't ssh to eth0 directly from OSX.
To fix it, a simple "ifdown eth0; ifup eth0" is enough. I suspect it's some timing issue around eth0, vboxservice loading, and the port being mapped.
The ifdown eth0 produces this error from DHCP:
DHCPRELEASE on eth0 to 10.0.2.2 port 67 send_packet: Network is unreachable send_packet: please consult README file regarding broadcast address.
After an ifup, further ifdowns are successful.
You don't even need the ifup/ifdown; a "dhclient eth0" will let vagrant resume.
I've been reloading my vagrant over and over for an hour, each reload takes 90 seconds. No hangs.
I don't use the "pre-up sleep 2" or any of the workarounds in this thread.
In rc.local on Ubuntu, right before the exit 0, I put "dhclient eth0". This won't disturb the network, it'll just kick eth0 in the butt and get it working again. Since it runs last, I hope it avoids whatever it is that is hanging the ifup during network init, because that's what I saw for both eth0 NAT and eth1 host-only interfaces on my guests -- ifup still running, their child processes blocked.
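For reference, a minimal sketch of that rc.local on an Ubuntu guest; the dhclient line is the only addition, everything else is the stock file:
#!/bin/sh -e
# /etc/rc.local runs at the end of each multiuser runlevel.
# Kick eth0 once more in case ifup hung during early network init.
dhclient eth0
exit 0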
I tried restarting the networking service on boot but, for some reason, I can't access the web server. So I have to restart the networking service twice and then it works. But I can't stop the VM with "vagrant halt" (I have to run "vagrant suspend" first) and I can't get SSH access with "vagrant ssh" (I have to use "ssh vagrant@IP").
Starting the VMs in GUI mode and then executing "sudo dhclient eth0" resumed vagrant for me, too.
@destructuring Awesome! I'm going to put this into the base boxes I release and hope this solves this issue. :)
I just uploaded new lucid32/lucid64/precise32/precise64 with @destructuring's changes. Let me know if this is gone!
None of these solutions are working for me. The only thing that does is to rebuild the box. I noticed only one other person has commented on @destructuring's solution working for them. Can I get a sanity check?
I get RTNETLINK answers: File exists when running sudo dhclient eth0.
I'm not sure this is the proper forum, but I've been running into similar problems with hangs on 'vagrant up'.
I'm posting here because I'm seeing postings that indicate different behavior in different environments, which is what I ran into and there seem to be multiple tickets tied to the same core issue. This seemed as good a spot as any :) The solution seems to be outside vagrant.
If you are behind a proxy (likely to be the case at work but not at home) you will need to configure the guest system with your proxy settings. Setting the http_proxy and https_proxy environment variables in /etc/bashrc worked for me; it made them system-wide and available to the SSH access required by Vagrant. If you do not specify the proxy you will receive the dreaded ifup message and your boot will hang.
The caveat here is that if you set this and try to boot while you are not behind the configured proxy you will receive the same message and hang on boot.
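A sketch of the sort of lines that go into /etc/bashrc (or /etc/bash.bashrc on Debian/Ubuntu); the proxy host and port are placeholders:
# System-wide proxy settings so non-interactive SSH sessions (like the ones Vagrant opens) see them too.
export http_proxy="http://proxy.example.com:3128"
export https_proxy="http://proxy.example.com:3128"
# Keep traffic to localhost and the NAT gateway off the proxy.
export no_proxy="localhost,127.0.0.1,10.0.2.2"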
For me this issue is not closed. I found a workflow to reproduce it. Please read (and edit/comment) https://github.com/mitchellh/vagrant/wiki/%60vagrant-up%60-hangs-at-%22Waiting-for-VM-to-boot.-This-can-take-a-few-minutes%22
I'm also having the hang at "[default] Waiting for VM to boot. This can take a few minutes." but I've somewhat figured out the cause. The DNS proxying is not working, causing SSH connections to take 10 seconds to be established. This causes the probe to time out. vagrant ssh and other commands seem to have a longer timeout, and they run OK.
Some base boxes also boot OK because they do not have UseDNS yes in /etc/ssh/sshd_config and don't run into this problem at all.
For me restarting networking does not work; it seems the DNS proxy stuff just doesn't work with the version of Vagrant in Ubuntu 12.10 (1.0.3) and VirtualBox 4.1.18.
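If your base box does ship with UseDNS yes (or leaves it unset, which defaults to yes on older OpenSSH), turning it off in the guest avoids the reverse-DNS delay; a rough sketch, run as root inside the guest (the service may be called "ssh" on Ubuntu or "sshd" on CentOS):
# Disable reverse DNS lookups for incoming SSH connections.
echo "UseDNS no" >> /etc/ssh/sshd_config
service ssh restart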
Ah, somewhat figured it out: my resolv.conf has
nameserver 127.0.1.1
The code in Vagrant only checks for 127.0.0.1 when disabling the DNS proxy. That said, I fixed the regex but DNS still doesn't work in the VM. It'll work fine if I change the DNS server to 192.168.1.1 or 8.8.8.8, so it's not completely broken; something is just breaking the autoconfiguration.
I've been having success with /etc/init.d/networking restart in /etc/rc.local.
I'm not sure why restarting networking works for some people; it doesn't work here.
It looks like HEAD already has the fix for the 127.0.1.1 issue, so that's good.
As for the other issue, looking here: https://bugs.launchpad.net/ubuntu/+source/virtualbox/+bug/1031217 the stated fix is to turn natdnshostresolver1 on, but the code in Vagrant that is linked from that bug turns it off. I'm not sure why there is a discrepancy, but this probably has something to do with my problem.
I have just retried with a freshly downloaded official lucid32 box on a remote Debian host and it works fine without doing anything special.
For me this issue was in the DNS configuration. Setting VBoxManage modifyvm "puppet-playground_1357644642" --natdnshostresolver1 on fixed it.
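If you don't know the name VirtualBox gave the machine Vagrant created, something like this finds it (the VM must be powered off before modifyvm will take effect):
# List all registered VMs with their names and UUIDs, then apply the setting to the right one.
VBoxManage list vms
VBoxManage modifyvm "<name-from-the-list>" --natdnshostresolver1 on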
Sometimes GRUB starts in failsafe mode (when the box goes down uncleanly, in Ubuntu for example) and sets a GRUB timeout of -1.
Fix:
Edit /etc/grub.d/00_header, and find:
if [ "\${recordfail}" = 1 ]; then
set timeout=-1
Change it to...
if [ "\${recordfail}" = 1 ]; then
set timeout=10
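A hedged sketch of applying that edit non-interactively on an Ubuntu guest (the sed expression is illustrative; check the file afterwards). Note that update-grub has to run for changes under /etc/grub.d to reach grub.cfg:
# Replace the failsafe -1 timeout with 10 seconds, then regenerate grub.cfg.
sudo sed -i 's/set timeout=-1/set timeout=10/' /etc/grub.d/00_header
sudo update-grub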
config.vm.customize ["modifyvm", :id, "--rtcuseutc", "on"]
works for me, but only if I include it before configuring the VM network:
config.vm.network :hostonly, "10.10.10.2"
So, I've found a completely unrelated cause of these symptoms. I doubt many people are having the stupid problem I was, but, for the record: if your brand new laptop has VT-x turned off, then when the VM can't start, the way it manifests from the outside is the SSH server being unwilling to accept connections (because it's not running).
And so you end up with the repeated attempts to hit 2222, all failing.
And you really can't tell the difference, from the outside, against any of these other causes.
The way to test if you've got my problem is just to run the VM directly from the VB manager. If you get a message talking about how you can't boot without VT-X/AMD-V, then, well, ha ha.
On older machines, go into the BIOS and turn it on.
On newer machines, UEFI gets in your way. From Windows 8, go to the start screen and type "bios". It'll say that no apps match your search, but if you look, one setting does. Hit Settings and you'll see "advanced startup options." Go in there, and under the General tab, go to the bottom, where there's a button "Restart now" under the heading "Advanced startup."
When you hit that, it doesn't actually restart now; it brings up another menu, one item of which allows you to get at your bios. Follow that, and you'll get in.
Then go turn on whatever your BIOS calls your hypervisor hardware. (There's like six names for it, but it's usually VT-X or AMD-V.) Enable, save, and shut down.
On reboot, vagrant will be happy again.
Adding
ifdown eth0
sleep 1
ifup eth0
exit 0
to /etc/rc.local solved it. dhclient eth0 solves it too.
A weird thing is that when I built my base box image, running apt-get install dkms before installing the VirtualBox Guest Additions made it work 100% of the time afterwards.
I've run into the same frustrating issue while building a CentOS base box. What completely fixed it for me was to add dhclient eth0 to /etc/rc.local, as suggested by @keo above. I wonder if this is something that Vagrant itself could help with, by systematically kicking eth0 on startup...
I have the same issue with CentOS 6.3.
My suspicion is that the 10.0.2.2 gateway actually EXISTS on our network:
10.0.2.0     *          255.255.255.0   U    0     0  0 eth0
link-local   *          255.255.0.0     U    1002  0  0 eth0
default      10.0.2.2   0.0.0.0         UG   0     0  0 eth0
So if my networking is going through some poor random server, no wonder it takes forever for the packets to go through.
I will try to figure out how to set up the networking differently.
Edit: I resolved my issue. I needed to reconfigure the network VirtualBox uses for DHCP.
http://stackoverflow.com/questions/15512073/set-up-dhcp-server-ip-for-vagrant
Added the following code:
config.vm.provider :virtualbox do |vb|
vb.customize ["modifyvm", :id, "--natnet1", "192.168/16"]
end
You can check for this issue easily - even before you start Vagrant, ping 10.0.2.2 - if you get a response, you are in trouble.
To anybody else trying to make their way through this, if you're trying one of the suggested workarounds:
config.vm.customize ["modifyvm", :id, "--rtcuseutc", "on"]
this does not work if your Vagrantfile is version 2 and starts with:
Vagrant.configure("2") do |config|
You'll get errors like:
$ vagrant destroy
/Applications/Vagrant/embedded/gems/gems/vagrant-1.1.2/plugins/kernel_v2/config/vm.rb:147:in `provider': wrong number of arguments (0 for 1) (ArgumentError)
from /Users/jamshid/tmp/70/Vagrantfile:13:in `block in <top (required)>'
from /Applications/Vagrant/embedded/gems/gems/vagrant-1.1.2/lib/vagrant/config/v2/loader.rb:37:in `call'
...
from /Applications/Vagrant/bin/../embedded/gems/bin/vagrant:23:in `<main>'
Instead use:
config.vm.provider :virtualbox do |vb|
vb.customize ["modifyvm", :id, "--rtcuseutc", "on"]
end
Does anyone have this issue anymore without a CentOS machine? On CentOS the issue is most likely forgetting to remove the udev rules. But is anyone getting this with the precise64 box, some Ubuntu-based box, or some box where they KNOW they cleared the udev rules?
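For anyone building CentOS base boxes, the udev cleanup being referred to is roughly the following, run right before packaging the box (the paths are the usual RHEL/CentOS ones):
# Remove the rule that pinned the build-time MAC address to eth0.
rm -f /etc/udev/rules.d/70-persistent-net.rules
# Stop the generator from recreating it on the next boot.
ln -sf /dev/null /etc/udev/rules.d/75-persistent-net-generator.rules
# Drop the stale MAC/UUID from the interface config so the cloned VM still gets eth0.
sed -i '/^HWADDR/d;/^UUID/d' /etc/sysconfig/network-scripts/ifcfg-eth0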
I found that after running yum groupinstall Desktop and rebooting a CentOS 6.4 guest, the VM could not communicate with the outside world at all. The fix for me was to disable NetworkManager and restart networking:
chkconfig NetworkManager off
service network restart
For me, it ended up being that I had to make sure "Cable connected" was checked under the adapter settings in VirtualBox.
For a problem with the same symptoms, but a different cause and solution, see https://github.com/mitchellh/vagrant/issues/1792
Just a public +1 and thanks to @jphalip, whose tip fixed things up for me on a CentOS VM.
Hey,
I can not log into my VM after "vagrant up".
I have to start it in GUI mode, then restart my network adapter with "sudo /etc/init.d/networking restart"; after this, my VM gets an IP (v4) address and my Mac is able to SSH into the VM and do the provisioning.
Any idea on this?
Same issue as here: http://groups.google.com/group/vagrant-up/browse_frm/thread/e951417f59e74b9c
The box is about 5 days old!
Thank you! Seb